Terabyte per second to Byte per second
Quick Reference Table (Terabyte per second to Byte per second)
| Terabyte per second (TBps) | Byte per second (Bps) |
|---|---|
| 0.001 | 1,000,000,000 |
| 0.01 | 10,000,000,000 |
| 0.1 | 100,000,000,000 |
| 1 | 1,000,000,000,000 |
| 3.35 | 3,350,000,000,000 |
| 10 | 10,000,000,000,000 |
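The rows above all follow one rule: multiply the TB/s value by 10^12. A minimal Python sketch of that rule is shown below, assuming decimal (SI) terabytes as the table does. The function name `tbps_to_bps` is illustrative and not part of any library.

```python
from decimal import Decimal

# Decimal (SI) terabyte, as used in the reference table: 1 TB = 10^12 bytes.
BYTES_PER_TERABYTE = 10**12


def tbps_to_bps(terabytes_per_second: str) -> int:
    """Convert a TB/s value, given as a decimal string, to whole bytes per second.

    Decimal is used instead of float so values such as 3.35 convert exactly.
    """
    return int(Decimal(terabytes_per_second) * BYTES_PER_TERABYTE)


# Reproduce the rows of the quick reference table.
for value in ("0.001", "0.01", "0.1", "1", "3.35", "10"):
    print(f"{value} TB/s = {tbps_to_bps(value):,} B/s")
```

Passing the value as a string keeps the input exact; a float such as `3.35` cannot be represented precisely in binary.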
About Terabyte per second (TBps)
A terabyte per second (TB/s or TBps) equals 8 terabits per second and represents the bandwidth scale of GPU memory systems, high-performance computing interconnects, and the fastest data center storage fabrics. The HBM3 memory stacks on high-end AI accelerators provide 3–4 TB/s of internal bandwidth. InfiniBand NDR connections used in supercomputers reach 400 Gbps per link, with multiple links aggregated to TB/s totals. At 1 TB/s, the entire contents of a 1 PB data store could transfer in about 17 minutes.
The NVIDIA H100 GPU features 3.35 TB/s of HBM3 memory bandwidth. Top-tier supercomputers like Frontier aggregate over 75 TB/s of storage I/O bandwidth.
About Byte per second (Bps)
A byte per second (B/s or Bps) is the base byte-based unit of data transfer rate, equal to 8 bits per second. While ISPs advertise in bits per second, download managers, operating systems, and file transfer tools display speeds in bytes per second — a direct measure of how quickly usable file data arrives. The conversion between bits and bytes is constant: divide Mbps by 8 to get MB/s. At 1 B/s, transferring a 1 MB file would take about 11.5 days.
An old dial-up connection at 56 kbps delivered roughly 7,000 B/s (7 kB/s) of actual file data. USB 2.0 maxes out at about 60,000,000 B/s (60 MB/s).
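The divide-by-8 rule can be checked with a few lines of Python. This is only a sketch of the raw unit conversion; the helper name `bits_to_bytes_per_second` is hypothetical, and it ignores protocol overhead, so the results are upper bounds.

```python
BITS_PER_BYTE = 8


def bits_to_bytes_per_second(bits_per_second: float) -> float:
    """Convert a rate in bits per second to bytes per second."""
    return bits_per_second / BITS_PER_BYTE


# Examples taken from the text: 56 kbps dial-up and 480 Mbps USB 2.0.
dial_up = bits_to_bytes_per_second(56_000)
usb2 = bits_to_bytes_per_second(480_000_000)

print(f"Dial-up: {dial_up:,.0f} B/s")
print(f"USB 2.0: {usb2:,.0f} B/s")
```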
Terabyte per second – Frequently Asked Questions
Why do AI chips need TB/s of memory bandwidth?
Large language models have billions of parameters that must be read from memory for every inference pass. An LLM with 70 billion parameters at 16-bit precision needs 140 GB of data read per forward pass. At 3 TB/s, the H100 can perform roughly 20 inference passes per second — bandwidth directly determines tokens-per-second output.
Why is memory bandwidth the main bottleneck for large language model inference?
During LLM inference, generating each token requires reading all model weights from memory. A 70-billion-parameter model at 16-bit precision means 140 GB read per forward pass. At 30 tokens per second, that is 4.2 TB/s of memory reads, more than the 3.35 TB/s an H100's HBM3 can supply. This is why AI inference is "memory-bound": the GPU's compute cores sit idle waiting for data. Quantising weights to 8-bit or 4-bit cuts the bandwidth demand to a half or a quarter, which raises the achievable tokens per second accordingly.
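The arithmetic in the two answers above can be sketched in Python. This is a deliberately simplified model. It assumes every token reads every weight exactly once, and it ignores batching, caching, activation traffic, and whether the weights fit on one device. The function names are illustrative only.

```python
def weight_bytes(parameters: int, bits_per_weight: int) -> int:
    """Total size of the model weights in bytes."""
    return parameters * bits_per_weight // 8


def max_tokens_per_second(bandwidth_bytes_per_s: int, parameters: int,
                          bits_per_weight: int) -> float:
    """Upper bound on tokens/s if each token requires one full read of the weights."""
    return bandwidth_bytes_per_s / weight_bytes(parameters, bits_per_weight)


def required_bandwidth(tokens_per_second: int, parameters: int,
                       bits_per_weight: int) -> int:
    """Memory bandwidth in bytes/s needed to sustain a given token rate."""
    return tokens_per_second * weight_bytes(parameters, bits_per_weight)


PARAMS = 70_000_000_000            # 70-billion-parameter model from the text
H100_BANDWIDTH = 3_350_000_000_000  # 3.35 TB/s, the figure quoted for the H100

for bits in (16, 8, 4):
    size_gb = weight_bytes(PARAMS, bits) / 1e9
    rate = max_tokens_per_second(H100_BANDWIDTH, PARAMS, bits)
    print(f"{bits}-bit: {size_gb:.0f} GB of weights, at most {rate:.1f} tokens/s")

needed = required_bandwidth(30, PARAMS, 16)
print(f"30 tokens/s at 16-bit needs {needed / 1e12:.1f} TB/s")
```

Halving the bits per weight halves the bytes read per token, so the bound on tokens per second doubles.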
What is the fastest memory bandwidth ever achieved in a commercial chip?
The NVIDIA B200 GPU with HBM3e achieves approximately 8 TB/s of memory bandwidth as of 2025. Each recent step has raised bandwidth by roughly 1.4x to 1.7x: from 2 TB/s (A100) to 3.35 TB/s (H100) to 4.8 TB/s (H200) to 8 TB/s (B200). If that pace holds, 16+ TB/s is plausible within a few years.
How long would it take to transfer a petabyte at 1 TB/s?
About 16.7 minutes. A petabyte is 1,000 terabytes, so at 1 TB/s, the math is simple division. For context, the Library of Congress contains roughly 10–20 petabytes of data. Transferring it all at 1 TB/s would take about 3–6 hours.
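The division behind these figures is easy to reproduce. The sketch below assumes decimal units (1 PB = 10^15 bytes, 1 TB = 10^12 bytes) and a perfectly sustained transfer rate. The function name is hypothetical.

```python
BYTES_PER_TB = 10**12
BYTES_PER_PB = 10**15


def transfer_seconds(size_bytes: int, rate_bytes_per_s: int) -> float:
    """Time in seconds to move size_bytes at a constant rate."""
    return size_bytes / rate_bytes_per_s


one_pb = transfer_seconds(1 * BYTES_PER_PB, 1 * BYTES_PER_TB)
print(f"1 PB at 1 TB/s: {one_pb / 60:.1f} minutes")

# The 10-20 PB range quoted for the Library of Congress.
for petabytes in (10, 20):
    seconds = transfer_seconds(petabytes * BYTES_PER_PB, 1 * BYTES_PER_TB)
    print(f"{petabytes} PB at 1 TB/s: {seconds / 3600:.1f} hours")
```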
Is there anything beyond TB/s?
Yes: petabytes per second (PB/s). Experimental optical interconnects and photonic computing architectures are pushing toward PB/s-class bandwidth. Some supercomputer storage systems already aggregate into the PB/s range when all nodes operate simultaneously. PB/s-class bandwidth is the next frontier for AI training clusters.
Byte per second – Frequently Asked Questions
Why is a byte the fundamental unit of file storage but not of network speed?
Files are stored in bytes because CPUs address memory in byte-sized (8-bit) chunks — the smallest unit a program can read or write. Networks measure in bits because physical signals on a wire or fiber are serial: one bit at a time, clocked at a specific frequency. A 1 GHz signal produces 1 Gbps, not 1 GBps. The two worlds evolved independently and neither adopted the other's convention, leaving users to divide by 8 forever.
Is a byte always 8 bits?
In modern computing, yes — a byte is universally 8 bits. Historically, some architectures used 6, 7, or 9-bit bytes, which is why the unambiguous term "octet" exists in networking standards. But for all practical bandwidth conversions today, 1 byte = 8 bits.
Why is actual file download speed always less than the connection speed in bytes?
Network protocols add overhead — TCP headers, encryption (TLS), error correction, and packet framing all consume bandwidth without contributing to file data. A 100 Mbps connection might deliver 11 MB/s instead of the theoretical 12.5 MB/s because 10–15% goes to protocol overhead.
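The effect of overhead on file throughput can be estimated with a short sketch. The overhead fraction is treated here as a single assumed parameter. Real overhead varies with packet size, protocol mix, and link conditions, and the function name is illustrative.

```python
def payload_mb_per_second(link_mbps: float, overhead_fraction: float) -> float:
    """Estimate usable file throughput in MB/s for a link rated in Mbps.

    overhead_fraction is the share of bandwidth consumed by headers,
    framing, encryption, and similar protocol costs (0.12 means 12%).
    """
    theoretical_mb_per_s = link_mbps / 8  # bits to bytes
    return theoretical_mb_per_s * (1 - overhead_fraction)


# The 100 Mbps link from the text, across the quoted 10-15% overhead range.
for overhead in (0.10, 0.12, 0.15):
    rate = payload_mb_per_second(100, overhead)
    print(f"{overhead:.0%} overhead: {rate:.2f} MB/s")
```

With 12% overhead the estimate comes out at about 11 MB/s, matching the figure in the answer above.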
How many bytes per second does USB 3.0 actually transfer?
USB 3.0 signals at 5 Gbps, which is 625 MB/s on the wire (5 Gbps ÷ 8). Its 8b/10b line encoding leaves at most 500 MB/s for data, and real-world sustained transfers hit 300–400 MB/s due to protocol overhead and controller limitations. USB 3.2 Gen 2 doubles the signaling rate to 10 Gbps and reaches about 700–900 MB/s in practice.
What came first — the bit or the byte?
The bit came first. The term was coined by John Tukey and first appeared in print in Claude Shannon's 1948 paper on information theory. The byte was introduced at IBM in the mid-1950s by Werner Buchholz to describe the smallest addressable group of bits in the IBM Stretch computer. Originally it could be any size; the 8-bit byte became standard with the IBM System/360 in 1964.