Tebibyte per second to Terabyte per second

1 TiB/s = 1.099511627776 TBps


Quick Reference Table (Tebibyte per second to Terabyte per second)

Tebibyte per second (TiB/s)    Terabyte per second (TBps)
0.001                          0.001099511627776
0.01                           0.01099511627776
0.1                            0.1099511627776
1                              1.099511627776
4.8                            5.2776558133248
10                             10.99511627776
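
The table rows follow from one exact ratio: 1 TiB is 2^40 bytes and 1 TB is 10^12 bytes. The following is a minimal sketch of that conversion, not the site's own implementation. The function names are hypothetical, and Decimal is used only so the printed digits match the table exactly.

```python
from decimal import Decimal

# Exact byte counts: binary (IEC) tebibyte vs decimal (SI) terabyte.
TIB_BYTES = Decimal(2**40)   # 1,099,511,627,776 bytes
TB_BYTES = Decimal(10**12)   # 1,000,000,000,000 bytes

def tib_per_s_to_tb_per_s(value):
    """Convert tebibytes per second to terabytes per second."""
    return Decimal(str(value)) * TIB_BYTES / TB_BYTES

def tb_per_s_to_tib_per_s(value):
    """Convert terabytes per second to tebibytes per second."""
    return Decimal(str(value)) * TB_BYTES / TIB_BYTES

# Reproduce the quick reference rows.
for v in ("0.001", "0.01", "0.1", "1", "4.8", "10"):
    print(f"{v} TiB/s = {tib_per_s_to_tb_per_s(v)} TB/s")
```

Passing values as strings avoids binary floating-point rounding in inputs such as 4.8.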

About Tebibyte per second (TiB/s)

A tebibyte per second (TiB/s) equals 1,099,511,627,776 bytes per second (2^40 bytes per second) and represents the bandwidth scale of cutting-edge AI accelerator memory and high-performance computing interconnects. The HBM3e memory on NVIDIA H200 GPUs provides 4.8 TB/s of bandwidth, which is about 4.37 TiB/s. At this scale, the roughly 10% difference (about 9.95%) between tebibytes (binary) and terabytes (decimal) matters in system design: a buffer sized for 1 TiB/s must handle about 1,099.5 GB/s in decimal bandwidth.

The NVIDIA H200 SXM features 4.8 TB/s (about 4.37 TiB/s) of HBM3e memory bandwidth. Top-end AI training clusters aggregate several TiB/s of storage I/O.

About Terabyte per second (TBps)

A terabyte per second (TB/s or TBps) equals 8 terabits per second and represents the bandwidth scale of GPU memory systems, high-performance computing interconnects, and the fastest data center storage fabrics. The HBM3 memory stacks on high-end AI accelerators provide 3–4 TB/s of internal bandwidth. InfiniBand NDR connections used in supercomputers reach 400 Gbps per link, with multiple links aggregated to TB/s totals. At 1 TB/s, the entire contents of a 1 PB data store could transfer in about 17 minutes.
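
The byte-to-bit relationship and the link aggregation mentioned above can be checked with simple arithmetic. This is an illustrative sketch only: the function names are hypothetical, and it assumes ideal links with no protocol or encoding overhead.

```python
import math

BITS_PER_BYTE = 8

def tb_per_s_to_tbit_per_s(tb_per_s):
    """Terabytes per second to terabits per second."""
    return tb_per_s * BITS_PER_BYTE

def links_needed(target_tb_per_s, link_gbps):
    """Number of links of a given Gbps rate needed to reach a TB/s target."""
    target_gbps = target_tb_per_s * BITS_PER_BYTE * 1000
    return math.ceil(target_gbps / link_gbps)

print(tb_per_s_to_tbit_per_s(1))  # terabits per second in 1 TB/s
print(links_needed(1, 400))       # 400 Gbps links needed for 1 TB/s
```

Under these assumptions, 1 TB/s corresponds to 8 Tbps, or twenty 400 Gbps links.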

The NVIDIA H100 GPU features 3.35 TB/s of HBM3 memory bandwidth. Top-tier supercomputers like Frontier aggregate over 75 TB/s of storage I/O bandwidth.


Tebibyte per second – Frequently Asked Questions

AMD's MI300X stacks 8 HBM3 memory modules and multiple compute chiplets on a single package using advanced 2.5D packaging with silicon interposers. The short physical distance between compute and memory dies — millimeters instead of centimeters — dramatically reduces signal latency and power per bit. This allows a 5.3 TB/s aggregate bandwidth that would be physically impossible with traditional socketed memory. The trend toward chiplet packaging is how the industry keeps scaling bandwidth despite hitting limits in single-die manufacturing.

The TiB/s versus TB/s distinction matters significantly. When provisioning an AI training cluster with hundreds of GPUs, a roughly 10% bandwidth miscalculation cascades through the entire system design: buffer sizes, interconnect capacity, cooling, and power. For a bandwidth-bound job, getting the units wrong could mean the difference between a training run finishing in 30 days and one finishing in 33 days.

Workloads at this scale include training large language models (100B+ parameters), molecular dynamics simulations, weather modeling, and large-scale fluid dynamics. These workloads move enormous matrices through memory billions of times. The TiB/s-class memory bandwidth of modern GPUs is what makes training models like GPT-4 possible in months rather than decades.

Memory bandwidth dwarfs network bandwidth. Each H100 GPU has 3.35 TB/s (about 3.05 TiB/s) of internal memory bandwidth but connects to the network at only 0.05 TB/s (400 Gbps InfiniBand). This roughly 67:1 ratio is why AI chip designers obsess over keeping computations local to each GPU and minimising network communication.
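
The memory-to-network ratio can be worked through directly. The sketch below assumes vendor figures are decimal (3.35 TB/s for H100 memory, 400 Gbps for the network link) and ignores protocol overhead. The helper name is hypothetical.

```python
def gbps_to_tb_per_s(gbps):
    """Gigabits per second to decimal terabytes per second."""
    return gbps / 8 / 1000  # 8 bits per byte, 1000 GB per TB

memory_tb_per_s = 3.35                     # H100 HBM3 figure quoted in the text
network_tb_per_s = gbps_to_tb_per_s(400)   # one 400 Gbps InfiniBand link
ratio = memory_tb_per_s / network_tb_per_s

print(f"network: {network_tb_per_s} TB/s, memory-to-network ratio about {ratio:.0f}:1")
```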

Quantum computers do not need TiB/s bandwidth in the same way classical systems do. They process information through qubits that exist in superposition, so they do not shuttle classical data around at TiB/s. However, the classical control systems that manage quantum processors and process measurement results do need high bandwidth; current quantum-classical interfaces operate at modest Gbps rates.

Terabyte per second – Frequently Asked Questions

Large language models have billions of parameters that must be read from memory for every inference pass. An LLM with 70 billion parameters at 16-bit precision needs 140 GB of data read per forward pass. At the H100's 3.35 TB/s, that allows at most about 24 forward passes per second, so bandwidth directly determines tokens-per-second output.

During LLM inference each token requires reading all model weights from memory. A 70-billion-parameter model at 16-bit precision means 140 GB read per forward pass. At 30 tokens per second, that is 4.2 TB/s of memory reads, which exceeds the 3.35 TB/s of an H100's HBM3; at that bandwidth the ceiling is about 24 tokens per second. This is why AI inference is "memory-bound": the GPU's compute cores sit idle waiting for data. Quantising weights to 8-bit or 4-bit halves or quarters the bandwidth demand, directly increasing tokens per second.
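
The memory-bound estimate above can be sketched as a short calculation. This is a simplified model under stated assumptions: every token reads all weights exactly once, batch size is 1, and caches, KV-cache traffic, and memory capacity limits are ignored. The function names are hypothetical.

```python
def weight_bytes(params, bits_per_weight):
    """Total bytes of model weights at a given precision."""
    return params * bits_per_weight / 8

def max_tokens_per_s(bandwidth_tb_per_s, params, bits_per_weight):
    """Upper bound on tokens/s if each token reads every weight once."""
    return bandwidth_tb_per_s * 1e12 / weight_bytes(params, bits_per_weight)

# 70-billion-parameter model on 3.35 TB/s of memory bandwidth (H100 figure).
for bits in (16, 8, 4):
    gb = weight_bytes(70e9, bits) / 1e9
    tps = max_tokens_per_s(3.35, 70e9, bits)
    print(f"{bits}-bit: {gb:.0f} GB per pass, up to {tps:.1f} tokens/s")
```

Halving the bits per weight halves the bytes read per token, so the bound doubles at each step from 16-bit to 8-bit to 4-bit.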

The NVIDIA B200 GPU with HBM3e achieves approximately 8 TB/s of memory bandwidth as of 2025. Each generation has raised bandwidth by roughly 1.4x to 1.7x: from 2 TB/s (A100) to 3.35 TB/s (H100) to 4.8 TB/s (H200) to 8 TB/s (B200). If that pace continues, 16+ TB/s is plausible within a few years.
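
The generation-to-generation growth can be computed from the figures quoted above. This sketch uses only those four numbers; the list and function names are hypothetical.

```python
# Memory bandwidth in TB/s, as quoted in the text.
generations = [("A100", 2.0), ("H100", 3.35), ("H200", 4.8), ("B200", 8.0)]

def growth_factors(gens):
    """Return (previous, current, ratio) for each consecutive pair."""
    return [(prev[0], cur[0], cur[1] / prev[1]) for prev, cur in zip(gens, gens[1:])]

for prev_name, cur_name, factor in growth_factors(generations):
    print(f"{prev_name} -> {cur_name}: {factor:.2f}x")
```

Each step is well under 2x, while the cumulative A100-to-B200 increase is 4x across three steps.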

Transferring 1 PB at 1 TB/s takes about 16.7 minutes. A petabyte is 1,000 terabytes, so the transfer takes 1,000 seconds. For context, the Library of Congress is often estimated to hold roughly 10–20 petabytes of data; transferring it all at 1 TB/s would take about 3–6 hours.
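
The transfer times above are simple division. A minimal sketch, assuming decimal units (1 PB = 1,000 TB) and a sustained rate with no overhead; the function name is hypothetical.

```python
def transfer_seconds(size_pb, rate_tb_per_s):
    """Seconds to move size_pb petabytes at rate_tb_per_s terabytes per second."""
    return size_pb * 1000 / rate_tb_per_s

print(f"1 PB:  {transfer_seconds(1, 1) / 60:.1f} minutes")
print(f"10 PB: {transfer_seconds(10, 1) / 3600:.1f} hours")
print(f"20 PB: {transfer_seconds(20, 1) / 3600:.1f} hours")
```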

Yes — petabytes per second (PB/s). Experimental optical interconnects and photonic computing architectures are pushing toward PB/s-class bandwidth. Some supercomputer storage systems already aggregate into the PB/s range when all nodes operate simultaneously. It is the next frontier for AI training clusters.

© 2026 TopConverters.com. All rights reserved.