NVIDIA A30 Tensor Core GPU

TL;DR

Cut-down GA100 with 24 GB HBM2 at 933 GB/s — A100's memory class at a fraction of the cost.
165 W PCIe card supporting MIG (up to 4 slices), positioned between A10 and A100.
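Strong for inference of models up to the 13B class where HBM bandwidth matters but A100 is over-spec; note that 13B parameters at 16-bit is about 26 GB of weights, so that size needs 8-bit or lower weights to fit in 24 GB.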
Modest deployment relative to A10/A100; effectively niche by 2026.

Overview

The A30 occupies an unusual slot in the Ampere lineup. Unlike A10 (GA102 with GDDR6), A30 is built on the same GA100 die as A100, but with fewer SMs enabled and only 24 GB of HBM2. The result is a card with HBM-class bandwidth at a price point well below A100.

By 2026 the A30 is mostly seen in pre-existing enterprise inference fleets. New deployments tend to bypass it for L40S (better single-card throughput) or A100/H100 (better $/training-token).

Specifications

Metric	A30
Architecture	Ampere (GA100)
FP64 (Tensor)	10.3 TFLOPS
FP32	10.3 TFLOPS
TF32 (Tensor, sparse)	165 TFLOPS
BF16 / FP16 (Tensor, sparse)	330 TFLOPS
INT8 (Tensor, sparse)	661 TOPS
Memory	24 GB HBM2
Memory bandwidth	933 GB/s
TDP	165 W
Form factor	PCIe Gen4 x16, dual-slot
NVLink	200 GB/s (bridge, optional)
MIG instances	Up to 4
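The Tensor Core rows above are quoted with sparsity. As a rough sketch, assuming the usual Ampere convention that 2:4 structured sparsity doubles peak throughput, the dense figures and the roofline ridge point can be derived from the table. The helper names here are illustrative, not part of any NVIDIA tool.

```python
# Sparse Tensor Core peaks and memory bandwidth, copied from the table above.
SPARSE_PEAK = {
    "TF32": 165.0,       # TFLOPS
    "BF16/FP16": 330.0,  # TFLOPS
    "INT8": 661.0,       # TOPS
}
MEM_BW_GB_S = 933.0      # GB/s


def dense_peak(sparse_peak: float) -> float:
    """Assumption: 2:4 sparsity doubles peak, so dense is half the sparse figure."""
    return sparse_peak / 2.0


def ridge_point(dense_tera_ops: float, bandwidth_gb_s: float) -> float:
    """Operations per byte below which a kernel is limited by memory bandwidth."""
    return (dense_tera_ops * 1e12) / (bandwidth_gb_s * 1e9)


for name, sparse in SPARSE_PEAK.items():
    dense = dense_peak(sparse)
    print(f"{name}: dense ~{dense:.1f} T(FL)OPS, "
          f"ridge ~{ridge_point(dense, MEM_BW_GB_S):.0f} ops/byte")
```

Kernels whose arithmetic intensity sits below the ridge point are limited by the 933 GB/s of HBM2 rather than by the Tensor Cores, which is the regime the next section discusses.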

Why HBM at This Tier

A30 exists because some inference workloads — long-sequence transformer decode in particular — are bandwidth-bound, not FLOPS-bound. A10's 600 GB/s of GDDR6 limits these workloads; A30's 933 GB/s of HBM2 substantially closes the gap to A100 at a lower price.

Pairing 933 GB/s of HBM2 with 10.3 TFLOPS of FP32 compute gives the card a much higher bandwidth-to-FLOPS ratio than GDDR6-based cards such as A10. For memory-bound inference shapes this can produce per-watt throughput close to that of A100 80 GB.
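The bandwidth argument can be made concrete with a back-of-the-envelope bound. The sketch below assumes batch-1 decode in which every generated token streams all model weights from memory once, and it ignores KV-cache reads and kernel overheads; the 7B FP16 model is a hypothetical example, not a measured workload.

```python
# Memory bandwidth figures quoted in this section, in GB/s.
BANDWIDTH_GB_S = {"A10": 600.0, "A30": 933.0}
A30_TDP_W = 165.0


def decode_tokens_per_s(bandwidth_gb_s: float,
                        params_billion: float,
                        bytes_per_param: float) -> float:
    """Upper bound on batch-1 decode rate when weight reads dominate."""
    weight_gb = params_billion * bytes_per_param  # 1e9 params * bytes = GB
    return bandwidth_gb_s / weight_gb


# Hypothetical 7B-parameter model with 2-byte (FP16/BF16) weights.
for card, bw in BANDWIDTH_GB_S.items():
    bound = decode_tokens_per_s(bw, params_billion=7.0, bytes_per_param=2.0)
    print(f"{card}: at most ~{bound:.1f} tokens/s per sequence")

print(f"A30 bandwidth per watt: {BANDWIDTH_GB_S['A30'] / A30_TDP_W:.2f} GB/s/W")
```

Under these assumptions the bound scales directly with memory bandwidth, so the A30-to-A10 ratio is simply 933 / 600, regardless of model size.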

When to Pick A30

Bandwidth-bound LLM inference where GDDR6 is too slow but A100 is too expensive.
Multi-tenant inference using MIG to host up to four hardware-isolated slices.
Pre-existing fleets where total cost of ownership has been amortised.
Pick L40S for raw inference throughput on dense models.
Pick A100 / H100 if training is in scope or 24 GB is insufficient.
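Whether 24 GB is sufficient depends mostly on weight precision. The following is a minimal sketch of that check; the 20% headroom reserved for KV cache, activations, and runtime overhead is an assumed placeholder to tune per workload, and the function names are illustrative.

```python
CARD_MEMORY_GB = 24.0  # A30 memory capacity


def weight_footprint_gb(params_billion: float, bits_per_param: int) -> float:
    """Weight storage in GB (1e9 bytes) for a given precision."""
    return params_billion * bits_per_param / 8.0


def fits_on_a30(params_billion: float, bits_per_param: int,
                headroom_fraction: float = 0.2) -> bool:
    """True if the weights fit after reserving headroom (assumed 20% by default)."""
    budget_gb = CARD_MEMORY_GB * (1.0 - headroom_fraction)
    return weight_footprint_gb(params_billion, bits_per_param) <= budget_gb


for bits in (16, 8, 4):
    size = weight_footprint_gb(13.0, bits)
    verdict = "fits" if fits_on_a30(13.0, bits) else "does not fit"
    print(f"13B at {bits}-bit: {size:.1f} GB of weights, {verdict}")
```

A model that fails this check at every precision the serving stack supports is a candidate for A100 / H100 instead.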

Pitfalls

HBM2 (not HBM2e or HBM3) caps bandwidth at 933 GB/s, well below HBM2e- and HBM3-based parts.
Limited availability — A30 was less popular than A10 / A100 and is harder to procure in 2026.
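MIG slices on A30 differ from A100 slices: A30 offers at most four instances of 6 GB each (versus up to seven on A100, at 5 GB on the 40 GB card or 10 GB on the 80 GB card), so sizing must be re-validated.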
No FP8 support: FP8-quantised inference paths are unavailable on A30, so quantised deployments must fall back to INT8 or 4-bit weight-only schemes.
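The MIG sizing pitfall can be checked mechanically. The sketch below assumes the A30 GPU-instance profiles documented by NVIDIA (1g.6gb, 2g.12gb, 4g.24gb) and treats the nominal size as fully usable, which slightly overstates real capacity; the helper name is hypothetical.

```python
from typing import Optional, Tuple

# (profile name, nominal memory in GB, max instances of that profile per A30)
A30_MIG_PROFILES = [
    ("1g.6gb", 6.0, 4),
    ("2g.12gb", 12.0, 2),
    ("4g.24gb", 24.0, 1),
]


def smallest_profile(required_gb: float) -> Optional[Tuple[str, int]]:
    """Return the smallest profile that holds required_gb and how many fit per card."""
    for name, memory_gb, max_instances in A30_MIG_PROFILES:
        if required_gb <= memory_gb:
            return name, max_instances
    return None  # does not fit on a single A30 at all


for need_gb in (5.0, 6.5, 20.0, 30.0):
    print(f"{need_gb} GB ->", smallest_profile(need_gb))
```

A workload sized for a 10 GB slice on A100 80 GB lands on the 12 GB profile here, which halves the tenant count per card from four to two.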

Software Notes

Supported by CUDA 11.x, 12.x, and 13, and compatible with vLLM and TensorRT-LLM apart from FP8-only paths. MIG configuration uses the same nvidia-smi mig commands as A100, but with A30-specific slice profiles (1g.6gb, 2g.12gb, 4g.24gb).
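As an illustration of the MIG workflow, the sketch below only builds and prints the nvidia-smi command lines for splitting one A30 into equal slices; it does not execute them. The GPU index, profile, and count are assumed example values, and running the commands for real typically needs root privileges and an idle GPU.

```python
import shlex
from typing import List


def mig_setup_commands(gpu_index: int = 0,
                       profile: str = "1g.6gb",
                       count: int = 4) -> List[List[str]]:
    """Build (but do not run) the nvidia-smi calls to create equal MIG slices."""
    gpu = str(gpu_index)
    profiles = ",".join([profile] * count)
    return [
        ["nvidia-smi", "-i", gpu, "-mig", "1"],                     # enable MIG mode
        ["nvidia-smi", "mig", "-i", gpu, "-lgip"],                  # list available profiles
        ["nvidia-smi", "mig", "-i", gpu, "-cgi", profiles, "-C"],   # create GPU + compute instances
        ["nvidia-smi", "-L"],                                       # list resulting MIG devices
    ]


for cmd in mig_setup_commands():
    print(shlex.join(cmd))
```

Each inner list can be passed to subprocess.run on a host with the NVIDIA driver installed; checking the profile listing first avoids requesting a profile the installed driver does not expose.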

References

NVIDIA A30 Datasheet · NVIDIA
