Terabyte per second to Gibibyte per second

1 TBps = 931.322574615478515625 GiB/s


Quick Reference Table (Terabyte per second to Gibibyte per second)

Terabyte per second (TBps)    Gibibyte per second (GiB/s)
0.001                         0.93132257461547851563
0.01                          9.31322574615478515625
0.1                           93.1322574615478515625
1                             931.322574615478515625
3.35                          3,119.93062496185302734375
10                            9,313.22574615478515625
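
The table values follow from the unit definitions: 1 TB is 10^12 bytes and 1 GiB is 2^30 bytes, so 1 TBps is exactly 10^12 / 2^30 = 931.322574615478515625 GiB/s. A minimal Python sketch of the conversion is shown here; the function name `tbps_to_gibps` is illustrative and not part of any library.

```python
from fractions import Fraction

BYTES_PER_TB = 10**12   # decimal terabyte (SI prefix)
BYTES_PER_GIB = 2**30   # binary gibibyte (IEC prefix)

def tbps_to_gibps(tbps):
    """Convert terabytes per second to gibibytes per second as an exact fraction."""
    # Going through str() avoids binary floating-point error for inputs like 3.35.
    return Fraction(str(tbps)) * BYTES_PER_TB / BYTES_PER_GIB

for value in ("0.001", "1", "3.35", "10"):
    print(value, "TBps =", float(tbps_to_gibps(value)), "GiB/s")
```

Using `Fraction` keeps the result exact until it is printed, which is why the long decimal expansions in the table can be reproduced without rounding drift.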

About Terabyte per second (TBps)

A terabyte per second (TB/s or TBps) equals 8 terabits per second and represents the bandwidth scale of GPU memory systems, high-performance computing interconnects, and the fastest data center storage fabrics. The HBM3 memory stacks on high-end AI accelerators provide 3–4 TB/s of internal bandwidth. InfiniBand NDR connections used in supercomputers reach 400 Gbps per link, with multiple links aggregated to TB/s totals. At 1 TB/s, the entire contents of a 1 PB data store could transfer in about 17 minutes.

The NVIDIA H100 GPU features 3.35 TB/s of HBM3 memory bandwidth. Top-tier supercomputers like Frontier aggregate over 75 TB/s of storage I/O bandwidth.

About Gibibyte per second (GiB/s)

A gibibyte per second (GiB/s) equals 1,073,741,824 bytes per second and is used in high-performance storage and memory bandwidth measurements when binary precision is required. GPU memory bandwidth is usually specified in decimal GB/s, but benchmarking tools sometimes report GiB/s: an NVIDIA RTX 4090 is rated at 1,008 GB/s of GDDR6X memory bandwidth, which is about 939 GiB/s. NVMe SSD sequential read speeds are often reported as both GB/s (decimal) and GiB/s (binary) in reviews and datasheets.

The NVIDIA RTX 4090 GPU has 1,008 GB/s of memory bandwidth (~939 GiB/s in binary units). DDR5-6400 dual-channel memory provides a theoretical peak of about 102 GB/s (~95 GiB/s).
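
Peak memory bandwidth figures like these can be derived from the bus width and the per-pin data rate. The sketch below assumes a 384-bit bus at 21 Gbps per pin for the RTX 4090 and two 64-bit channels at 6.4 Gbps per pin for dual-channel DDR5-6400; these parameters are not stated on this page, and the function name is illustrative.

```python
GIB = 2**30  # bytes in one gibibyte

def peak_bandwidth_bytes_per_s(bus_width_bits, gbps_per_pin):
    """Theoretical peak: every data pin moves gbps_per_pin gigabits each second."""
    return bus_width_bits * gbps_per_pin * 1e9 / 8

# Assumed configurations (not taken from this page):
configs = {
    "RTX 4090 (384-bit, 21 Gbps/pin)": (384, 21),
    "DDR5-6400 dual-channel (128-bit, 6.4 Gbps/pin)": (128, 6.4),
}

for name, (bits, rate) in configs.items():
    bw = peak_bandwidth_bytes_per_s(bits, rate)
    print(f"{name}: {bw / 1e9:.1f} GB/s = {bw / GIB:.1f} GiB/s")
```

These are theoretical peaks; measured bandwidth is lower and depends on access patterns.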


Terabyte per second – Frequently Asked Questions

Large language models have billions of parameters that must be read from memory for every inference pass. An LLM with 70 billion parameters at 16-bit precision needs 140 GB of data read per forward pass. At roughly 3 TB/s, an H100-class accelerator can stream those weights about 20 times per second, so bandwidth directly limits tokens-per-second output.

During LLM inference, each generated token requires reading all model weights from memory. A 70-billion-parameter model at 16-bit precision means 140 GB read per forward pass. At 30 tokens per second, that is 4.2 TB/s of memory reads, more than the 3.35 TB/s a single H100's HBM3 can deliver. This is why AI inference is "memory-bound": the GPU's compute cores sit idle waiting for data. Quantising weights to 8-bit or 4-bit halves or quarters the bandwidth demand, directly increasing tokens per second.
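
The arithmetic above can be sketched as a simple upper bound: tokens per second cannot exceed memory bandwidth divided by the bytes of weights read per token. This is a simplified model that assumes one full read of the weights per token on a single device and ignores KV-cache traffic, batching, and compute time; the function name is illustrative.

```python
def max_tokens_per_second(num_params, bits_per_weight, bandwidth_bytes_per_s):
    """Bandwidth-bound ceiling: one full read of the weights per generated token."""
    bytes_per_pass = num_params * bits_per_weight / 8
    return bandwidth_bytes_per_s / bytes_per_pass

H100_BANDWIDTH = 3.35e12   # bytes per second, the HBM3 figure quoted on this page
PARAMS = 70e9              # 70-billion-parameter model

for bits in (16, 8, 4):
    ceiling = max_tokens_per_second(PARAMS, bits, H100_BANDWIDTH)
    print(f"{bits}-bit weights: at most {ceiling:.1f} tokens/s")
```

Halving the bits per weight doubles the ceiling, which is the effect of quantisation described above.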

The NVIDIA B200 GPU with HBM3e achieves approximately 8 TB/s of memory bandwidth as of 2025. Bandwidth has grown about fourfold across recent data center GPUs: from 2 TB/s (A100) to 3.35 TB/s (H100) to 4.8 TB/s (H200) to 8 TB/s (B200). The trajectory suggests 16+ TB/s within a few years.

Transferring 1 PB at 1 TB/s takes about 16.7 minutes. A petabyte is 1,000 terabytes, so the transfer needs 1,000 seconds. For context, the Library of Congress contains roughly 10–20 petabytes of data. Transferring it all at 1 TB/s would take about 3–6 hours.
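
The division can be written out as a short sketch, assuming decimal units (1 PB = 10^15 bytes, 1 TB = 10^12 bytes) and a sustained rate with no protocol overhead. The function name is illustrative.

```python
PB = 10**15  # bytes in a decimal petabyte
TB = 10**12  # bytes in a decimal terabyte

def transfer_seconds(size_bytes, rate_bytes_per_s):
    """Time to move size_bytes at a constant rate."""
    return size_bytes / rate_bytes_per_s

print(f"1 PB at 1 TB/s: {transfer_seconds(1 * PB, 1 * TB) / 60:.1f} minutes")
for petabytes in (10, 20):
    hours = transfer_seconds(petabytes * PB, 1 * TB) / 3600
    print(f"{petabytes} PB at 1 TB/s: {hours:.1f} hours")
```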

Yes: the next unit up is petabytes per second (PB/s). Experimental optical interconnects and photonic computing architectures are pushing toward PB/s-class bandwidth. The aggregate memory bandwidth of the largest supercomputers already reaches the PB/s range when all nodes operate simultaneously, although storage I/O remains in the TB/s range. It is the next frontier for AI training clusters.

Gibibyte per second – Frequently Asked Questions

GPU memory capacities are sized in binary units, so some tools and technical documents report bandwidth in GiB/s to stay consistent with them. Vendor specifications and marketing materials usually quote decimal GB/s, which gives the larger-sounding number: the RTX 4090's 1,008 GB/s is about 939 GiB/s.

DDR5-6000 in dual-channel mode provides a theoretical peak of 96 GB/s (about 89 GiB/s). Quad-channel DDR5 on workstation platforms doubles this to 192 GB/s (~179 GiB/s). The actual usable bandwidth depends on memory access patterns: random access achieves far less than sequential streaming.

Memory bandwidth (50–100+ GiB/s for DDR5) measures how fast the CPU can read/write RAM. Storage bandwidth (3–14 GiB/s for NVMe SSDs) measures persistent data transfer. Memory is 10–30× faster because DRAM has nanosecond latency while NAND flash has microsecond latency. They serve different roles in the data hierarchy.

Yes, you can measure bandwidth yourself. For memory bandwidth, run a STREAM benchmark (available for Linux and Windows). For storage, use fio or CrystalDiskMark. GPU memory bandwidth can be tested with vendor-provided benchmark tools. Each tool reports in either GiB/s or GB/s, so check which unit it uses before comparing results.
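
For a quick sanity check before reaching for a dedicated benchmark, a buffer copy can be timed with the Python standard library alone. This is only a rough single-threaded sketch and not a substitute for STREAM: it uses one core, the default 64 MiB buffer may partly fit in a large CPU cache, and the function name is illustrative.

```python
import time

def copy_bandwidth_gib_per_s(size_bytes=64 * 2**20, repeats=5):
    """Time an in-memory buffer copy and return the best observed rate in GiB/s."""
    src = bytearray(size_bytes)
    dst = bytearray(size_bytes)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        dst[:] = src  # same-size slice assignment copies the whole buffer
        best = min(best, time.perf_counter() - start)
    # Each copy reads size_bytes and writes size_bytes.
    return 2 * size_bytes / best / 2**30

print(f"{copy_bandwidth_gib_per_s():.1f} GiB/s (rough, single-threaded copy)")
```

The result is reported in GiB/s because the byte count is divided by 2^30; dividing by 10^9 instead would give GB/s.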

Electrical signalling on copper traces maxes out around 112 Gbps (about 13 GiB/s) per lane with current technology. Beyond that, optics take over — silicon photonics interconnects can push individual channels to 200+ Gbps. The physical speed of light in fiber is not the limit; it is the modulation and detection electronics.
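
Converting a per-lane line rate from gigabits per second to GiB/s takes two steps: divide by 8 to get bytes, then divide by 2^30. The sketch below uses the raw line rate and ignores encoding and error-correction overhead; the function name is illustrative.

```python
def lane_gbps_to_gib_per_s(gbps):
    """Convert a raw line rate in gigabits per second to gibibytes per second."""
    bytes_per_s = gbps * 10**9 / 8   # 8 bits per byte
    return bytes_per_s / 2**30       # 2**30 bytes per GiB

for rate in (112, 200):
    print(f"{rate} Gbps per lane = {lane_gbps_to_gib_per_s(rate):.2f} GiB/s")
```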

© 2026 TopConverters.com. All rights reserved.