Tenstorrent Grayskull

TL;DR

Tenstorrent's first-generation AI processor: an array of Tensix cores with RISC-V control.
Targeted at edge and developer workloads rather than data centre frontier inference.
Superseded by the larger Wormhole and upcoming Blackhole generations.
Open software stack (TT-Metalium, TT-NN) and an explicit commitment to RISC-V make Tenstorrent distinctive among AI silicon vendors.

Overview

Grayskull is Tenstorrent's first-generation AI processor — an array of Tensix cores connected by an on-chip Network-on-Chip and orchestrated by RISC-V processors. The company's broader pitch — RISC-V-controlled dataflow with an open software stack — starts here, though commercial deployments concentrated on developer hardware rather than data centre racks.

Grayskull is largely superseded by Tenstorrent's later Wormhole and Blackhole products for production inference. It remains relevant as the foundation of Tenstorrent's open developer ecosystem and as the introduction to TT-Metalium and TT-NN.

Specifications

Metric	Grayskull e150 (developer)
Architecture	Tensix dataflow + RISC-V
Tensix cores	120
INT8	~315 TOPS
FP16	~92 TFLOPS
External memory	8 GB LPDDR4
TDP	75 W (single-slot PCIe)
Form factor	PCIe Gen4 x16, single-slot
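
The peak figures in the table can be turned into a few rough efficiency ratios. The sketch below only does arithmetic on the numbers quoted above; the dictionary keys and the function name are illustrative, and the results are peak-rate ratios, not measured throughput.

```python
# Approximate peak figures copied from the specification table above.
SPECS = {
    "tensix_cores": 120,
    "int8_tops": 315.0,
    "fp16_tflops": 92.0,
    "tdp_watts": 75.0,
}


def derived_metrics(specs):
    """Return per-watt and per-core ratios computed from the peak figures."""
    return {
        "int8_tops_per_watt": specs["int8_tops"] / specs["tdp_watts"],
        "fp16_tflops_per_watt": specs["fp16_tflops"] / specs["tdp_watts"],
        "int8_tops_per_core": specs["int8_tops"] / specs["tensix_cores"],
        "int8_to_fp16_ratio": specs["int8_tops"] / specs["fp16_tflops"],
    }


if __name__ == "__main__":
    for name, value in derived_metrics(SPECS).items():
        print(f"{name}: {value:.2f}")
```

With the table's values this works out to roughly 4.2 INT8 TOPS per watt and about 2.6 INT8 TOPS per Tensix core at peak.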

Architecture Notes

Each Tensix core combines a matrix engine, a vector unit, and five small RISC-V processors used for control and data movement. The chip behaves as a 2D mesh of cooperating dataflow units, and data movement across the NoC is scheduled explicitly by the compiler rather than hidden behind a cache hierarchy.
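
To make the cost of explicit data movement on a 2D mesh concrete, here is a minimal sketch that counts hops between cores. It rests on assumptions that are not from the source: the 120 cores are laid out as a 10 x 12 grid, links form a plain mesh, and packets follow dimension-ordered routing. The actual grid shape, link topology and routing policy on Grayskull may differ, and all names here are hypothetical.

```python
# Assumption (not from the source): 120 cores arranged as a 10 x 12 grid,
# plain mesh links, dimension-ordered routing (along the row, then the column).
ROWS, COLS = 10, 12


def core_coord(core_id, cols=COLS):
    """Map a linear core index to (row, col) in row-major order."""
    return divmod(core_id, cols)


def hops(src, dst):
    """Manhattan distance between two (row, col) coordinates."""
    return abs(src[0] - dst[0]) + abs(src[1] - dst[1])


def xy_route(src, dst):
    """List the coordinates visited: first along the row, then along the column."""
    r, c = src
    path = [(r, c)]
    step = 1 if dst[1] > c else -1
    while c != dst[1]:
        c += step
        path.append((r, c))
    step = 1 if dst[0] > r else -1
    while r != dst[0]:
        r += step
        path.append((r, c))
    return path


if __name__ == "__main__":
    a, b = core_coord(0), core_coord(ROWS * COLS - 1)
    print(a, b, hops(a, b))
```

Under these assumptions, a compiler that places a producer and its consumer on neighbouring cores pays one hop per transfer, while a corner-to-corner placement pays the full grid diameter. That difference is why placement and movement are treated as first-class decisions.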

Programming targets TT-Metalium (low-level kernel and host APIs) and TT-NN (a higher-level framework that mirrors PyTorch nn modules). Both are open-source — unusual among AI accelerator vendors and a deliberate part of Tenstorrent's positioning.

When Grayskull Makes Sense

Developer hardware for teams exploring Tenstorrent's stack ahead of larger-scale deployment.
Edge AI applications where the 75 W envelope and PCIe single-slot form factor matter.
Open-stack experimentation where TT-Metalium's RISC-V control flow is a research target.
It is not the right fit for production inference of modern LLMs; there, Wormhole, Blackhole and competitor GPUs are the right comparison points.

Pitfalls

No HBM-class memory bandwidth: the 8 GB of LPDDR4 is the limiting factor for LLM decode, which is bound by memory bandwidth rather than compute.
Software ecosystem is small; most ML libraries are not Tensix-aware out of the box.
Tenstorrent's long-term roadmap focuses on Wormhole and Blackhole; investment in Grayskull is likely to keep tapering.
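
The memory-bandwidth pitfall can be put into numbers with a simple bound: at batch size 1, each decoded token has to stream every weight from external memory once, so tokens per second cannot exceed bandwidth divided by weight bytes. The sketch below assumes that simplified model and ignores the KV cache and on-chip SRAM reuse. The 100 GB/s figure is a placeholder for illustration, not a Grayskull specification, and the function names are hypothetical.

```python
GIB = 1024 ** 3


def weights_bytes(param_count, bytes_per_param):
    """Total bytes needed to store the model weights."""
    return param_count * bytes_per_param


def fits_in_memory(param_count, bytes_per_param, memory_gib=8):
    """Check whether the weights alone fit in external memory (8 GB per the spec table)."""
    return weights_bytes(param_count, bytes_per_param) <= memory_gib * GIB


def decode_tokens_per_sec_bound(param_count, bytes_per_param, bandwidth_gb_per_s):
    """Upper bound on batch-1 decode rate: all weights are read once per token."""
    return bandwidth_gb_per_s * 1e9 / weights_bytes(param_count, bytes_per_param)


if __name__ == "__main__":
    params = 7_000_000_000           # a 7B-parameter model, for illustration
    assumed_bandwidth = 100.0        # GB/s, placeholder value (assumption)
    for label, width in (("INT8", 1), ("FP16", 2)):
        ok = fits_in_memory(params, width)
        bound = decode_tokens_per_sec_bound(params, width, assumed_bandwidth)
        print(f"{label}: fits={ok}, bound={bound:.1f} tokens/s")
```

Under these assumptions a 7B-parameter model fits in 8 GB only at one byte per weight, and even then the bound is on the order of tens of tokens per second at best, regardless of how many TOPS the cores can deliver.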

Software Notes

TT-Metalium (kernel-level, open source) and TT-NN (PyTorch-like, open source) are the supported paths. TT-Metalium provides documentation, examples and reference models, including Llama-class inference at modest scale.

References

Tenstorrent Grayskull Product Information · Tenstorrent
TT-Metalium Documentation · Tenstorrent
