Artificial intelligence is no longer limited to data centers and research institutions. Nvidia is pushing the boundaries of performance even further with DGX Spark – a miniature AI supercomputer that brings server-class performance right to the desktop. Built on the new GB10 Grace Blackwell chip, it combines the computing power of a GPU and CPU in one integrated system. The goal of this device is to make AI development and training accessible not only to large corporations, but also to smaller teams, researchers and startups.
What is Nvidia DGX Spark and why was it created
DGX Spark extends the DGX series of workstations familiar from supercomputer and data center environments. After years of dominance by the DGX H100 and DGX B200, here comes a solution that focuses on local use and smaller work environments. It offers data center performance in a compact form factor, ideal for labs, universities or development teams focused on AI projects.
The project, originally known as Digits, has been renamed DGX Spark and goes on sale October 15, 2025. Nvidia has announced partnerships with manufacturers such as ASUS, Acer, Dell, HP, Lenovo and MSI, who will offer their own OEM variants built on the same foundation. Pricing ranges from roughly $3,780 to $4,860 depending on region, with slightly higher prices expected in Europe due to VAT and import costs.
Nvidia’s goal, according to an official statement, is to “enable every developer to put AI computing right on their desk.” Nvidia DGX Spark thus follows the philosophy of decentralising AI – moving computation from huge clouds closer to the user, which shortens response times, reduces costs and brings greater control over data.
Nvidia DGX Spark architecture with GB10 Grace Blackwell
At the heart of DGX Spark is the GB10 Grace Blackwell Superchip, which combines the best of two worlds – the power of a Grace CPU (Arm v9) and Blackwell GPU cores. Nvidia calls it a superchip because the CPU and GPU are integrated into a single silicon module and share common memory with coherent unified access.
Inside are 20 Arm v9 cores, divided into ten high-performance Cortex-X925 and ten power-efficient Cortex-A725 cores. They work alongside fifth-generation Tensor Cores, which are designed specifically for artificial intelligence, machine learning and large-model processing. The chip is fabricated on TSMC's advanced 4 nm process, placing it at the leading edge of compact AI solutions.
The biggest draw is its performance. With support for the FP4 format with sparsity, it can reach up to 1 petaFLOP (1,000 AI TOPS) – performance that just a few years ago was reserved for large supercomputers. For classical FP32 computing, it offers around 30 to 35 TFLOPS, but with significantly better power efficiency than most gaming graphics cards.
In other words, Nvidia DGX Spark can train and run large-scale language models with tens of billions of parameters right at your desk – no cloud and no servers. What once required an entire data center now fits in a device the size of a lunchbox.
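As a rough back-of-envelope check (an approximation, not an official Nvidia figure), FP4 weights take about half a byte per parameter, so it is easy to estimate whether a model's weights fit in the 128 GB unified memory quoted for DGX Spark:

```python
def fp4_footprint_gb(params_billion: float, bytes_per_param: float = 0.5) -> float:
    """Approximate weight footprint in GB for 4-bit (FP4) weights.

    0.5 bytes/parameter is the raw 4-bit figure; real runtimes need
    extra room for the KV cache, activations and quantization scales.
    """
    return params_billion * bytes_per_param  # billions of params * bytes/param = GB

UNIFIED_MEMORY_GB = 128  # DGX Spark's shared LPDDR5X pool

for size in (70, 120, 200):
    gb = fp4_footprint_gb(size)
    print(f"{size}B parameters -> ~{gb:.0f} GB of FP4 weights "
          f"(fits in {UNIFIED_MEMORY_GB} GB: {gb < UNIFIED_MEMORY_GB})")
```

Even a 200-billion-parameter model comes out at roughly 100 GB of raw FP4 weights, which is consistent with Nvidia's claim that a single unit handles models of that scale.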
Memory, throughput and storage
The DGX Spark comes with extremely generous memory – 128 GB of LPDDR5X ECC shared by the CPU and graphics cores. Instead of two separate blocks, the entire system operates in a single memory space where data moves without unnecessary copying. This approach dramatically reduces response time and makes processing large datasets smoother – exactly what is critical when training language models.
The 256-bit wide memory bus achieves a throughput of 273 GB/s. That may not sound huge compared to GDDR7 gaming graphics, but in the context of a unified architecture, it’s an extremely efficient solution. What’s more, Nvidia has added intelligent memory management – compression, prefetching and data stream optimization – to make sure every byte is used to its full potential.
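To put the 273 GB/s figure in context: a common rule of thumb for memory-bandwidth-bound LLM inference (a rough estimate, not a benchmark of this device) is that generating one token requires streaming the full weight set from memory once, so peak single-stream decode speed is roughly bandwidth divided by weight size:

```python
BANDWIDTH_GB_S = 273.0  # DGX Spark's quoted LPDDR5X throughput

def decode_tokens_per_s_upper_bound(weights_gb: float,
                                    bandwidth_gb_s: float = BANDWIDTH_GB_S) -> float:
    """Upper bound on single-stream tokens/s for a bandwidth-bound decoder:
    each generated token reads the full weight set from memory once."""
    return bandwidth_gb_s / weights_gb

# e.g. a 70B-parameter model quantized to FP4 (~35 GB of weights)
print(f"~{decode_tokens_per_s_upper_bound(35):.1f} tokens/s ceiling")
```

Real throughput will be lower once KV-cache reads and compute overhead are included, but the estimate shows why bandwidth, not raw FLOPS, often sets the pace of local inference.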
Storage is handled by fast NVMe SSDs (PCIe 4.0) with capacities from 1 TB to 4 TB. Higher configurations allow complete models and training datasets to be stored directly on the device, without external drives or the cloud, so the system responds with virtually no delay.
Nvidia DGX Spark cooling, power consumption and dimensions
With such a small but powerful device, cooling must be engineered down to the last detail. The Nvidia DGX Spark has a TDP of only 140 W – negligible compared to server solutions, but still demanding precise thermal management. Stable operation is ensured by a dual vapor chamber system that removes heat directly from the GB10 Grace Blackwell chip, complemented by a pair of quiet 85 mm (3.35-inch) fans that distribute airflow evenly across the chassis.
The device uses a graphite heat core and aluminum finning, so it maintains low temperatures and quiet operation even at full power. The air in the device flows smoothly without turbulence and vibration, which increases the life of the components.
The body itself measures 5.9 × 5.9 × 2.0 inches and weighs approximately 1.2 kg. The compact proportions allow DGX Spark to be placed virtually anywhere – on a desk, shelf or in a rack. Power is provided by an external 19 V adapter, so there is no extra heat source inside the chassis.
Even under prolonged load, the device remains surprisingly quiet – the noise level does not exceed 30 dB, making it one of the quietest AI solutions on the market.
Software ecosystem and AI tools
Behind the performance of the Nvidia DGX Spark is not only hardware, but also a sophisticated software system. The device runs on DGX OS (built on Ubuntu) with an integrated AI Enterprise stack that offers a complete environment for AI development, training and deployment.
As a basic package, Nvidia DGX Spark offers AI tools ready for immediate use:
- CUDA 12, cuDNN and TensorRT – optimized libraries for GPU computing and maximizing the power of the GB10 Grace Blackwell chip.
- NeMo Framework – a framework for developing and training custom language and generative models.
- AI Workbench and Omniverse Blueprints – tools to easily build, visualize and test AI projects.
- NeMo Microservices – modular AI services for rapid deployment of models into practice.
- TruLens for LLM Monitoring – a tool for monitoring the accuracy, stability and quality of generative models.
This combination makes DGX Spark ready not only for model training, but also for practical deployment – from experiments in the lab to real AI projects. The device also supports the Nvidia Cloud API, making it possible to combine local and cloud computing to create efficient edge AI solutions.
Connectivity and performance scaling
The Nvidia DGX Spark can easily scale its performance thanks to the integrated Mellanox ConnectX-7 Smart NIC network controller. Two units can connect to form a shared AI cluster that together can process models with more than 400 billion parameters. The interconnection does not require complex setup – the devices automatically synchronize and distribute the computations evenly.
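The quoted cluster figures can be sanity-checked with the same half-byte-per-parameter FP4 approximation used above (again a rough sketch that ignores KV-cache, activation and communication overhead):

```python
def cluster_fits_fp4(params_billion: float, units: int = 2,
                     memory_per_unit_gb: float = 128.0,
                     bytes_per_param: float = 0.5) -> bool:
    """Check whether raw FP4 weights fit in a cluster's combined unified
    memory, ignoring KV-cache, activation and communication overhead."""
    weights_gb = params_billion * bytes_per_param
    return weights_gb <= units * memory_per_unit_gb

print(cluster_fits_fp4(400))           # 200 GB of weights vs 256 GB combined
print(cluster_fits_fp4(400, units=1))  # 200 GB vs 128 GB on a single unit
```

A 400-billion-parameter model needs about 200 GB of FP4 weights – too much for one unit's 128 GB, but comfortably within the 256 GB of two linked units, which matches Nvidia's two-unit claim.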
Connectivity is one of Nvidia DGX Spark’s strengths and enables easy integration into existing infrastructure:
- 2× QSFP ports (InfiniBand/Ethernet) – for high-speed interconnection of multiple units or servers in an AI cluster.
- 1× 10 GbE RJ-45 – for regular network connectivity and data transfer in the local network.
- USB-C and USB-A ports – universal interfaces for peripherals and extensions.
- Optional Wi-Fi 6E and Bluetooth 5.3 – for wireless connectivity and edge device testing.
The ability to scale performance without complexity and excessive heat makes DGX Spark one of the most flexible AI solutions on the market – combining the simplicity of a workstation with the power of a miniature supercomputer.
Technical Specifications – Nvidia DGX Spark
Nvidia DGX Spark’s detailed specifications show how performance, GB10 Grace Blackwell architecture and efficient cooling combine into one compact AI system. This configuration reveals both the technical limits and strengths that make DGX Spark a personal supercomputer for AI development.
| Specification | Value |
|---|---|
| Device Type | Personal AI supercomputer / AI workstation |
| Architecture | Nvidia GB10 Grace Blackwell Superchip |
| CPU | 20 Arm v9 cores (10× Cortex-X925 + 10× Cortex-A725) |
| GPU | Blackwell GPU (5th Gen Tensor Cores) |
| Manufacturing process | TSMC 4 nm (4N node) |
| Computing power (AI) | up to 1 PFLOP (FP4 sparsity / 1000 AI TOPS) |
| Performance (FP32) | approximately 30 – 35 TFLOPS |
| Memory (RAM) | 128 GB LPDDR5X ECC unified memory |
| Memory bus | 256-bit |
| Memory throughput | 273 GB/s |
| Storage | NVMe M.2 (PCIe 4.0), 1 TB – 4 TB |
| Clustering support | Yes – max 2 devices (ConnectX-7 link) |
| AI model size supported | up to 200 billion parameters (1 unit) / 400 billion (2 units) |
| Network interfaces | Mellanox ConnectX-7 Smart NIC, 10 GbE RJ-45 |
| Operating System | Nvidia DGX OS (Ubuntu-based) with AI Enterprise stack |
| Software / Ecosystem | CUDA 12, cuDNN, TensorRT, NeMo, AI Workbench, Omniverse Blueprints |
| Power consumption (TDP) | 140 W |
| Power | External adapter (19 V DC) |
| Cooling | Dual vapor chamber, 2× 85 mm fans |
| Dimensions / weight | 5.9 × 5.9 × 2.0 inches / 1.2 kg |
| Availability | October 15, 2025 |
| Manufacturers | Nvidia | ASUS | Acer | Dell | HP | MSI | Lenovo |
Benefits
Nvidia DGX Spark excels in efficiency, compact design and tight integration with Nvidia software, making it an ideal solution for AI development.
- Outstanding efficiency per watt – 1 PFLOP of performance while consuming only 140 W (approx. 7 TOPS/W).
- Compact and quiet design – also suitable for offices, schools or small labs.
- Easy integration – two units can be connected via ConnectX-7 Smart NIC.
- Full compatibility with the Nvidia software ecosystem – CUDA, TensorRT, NeMo, AI Workbench.
- Shared CPU/GPU memory – speeds up computation and reduces latency in AI models.
- Hybrid option with cloud – flexible performance as needed.
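The efficiency figure in the first bullet follows directly from the headline numbers quoted elsewhere in this article (a simple ratio, not a measured benchmark):

```python
AI_TOPS = 1000   # FP4-with-sparsity peak from the spec table
TDP_WATTS = 140  # quoted power consumption

tops_per_watt = AI_TOPS / TDP_WATTS
print(f"~{tops_per_watt:.1f} TOPS/W")
```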
Cons
Despite its superior performance, Nvidia DGX Spark also has limitations – higher price, smaller memory, and limited clustering capabilities.
- Limited clustering – officially only up to two units.
- High price – approximately $4,000 may be out of reach for smaller teams.
- Limited memory capacity (128 GB) – may be insufficient for extremely large models.
- Lower memory throughput (273 GB/s) – compared to GDDR7 solutions in desktop GPUs.
Conclusion
Nvidia DGX Spark represents a new type of device that combines the performance of a supercomputer with the accessibility of a personal computer. It brings AI computing from clouds to desktops, enables training models without external servers, and defines a new category – the personal AI workstation.
The combination of GB10 Grace Blackwell architecture, 128GB of shared memory, low power consumption and high performance makes DGX Spark a powerful tool for the future of AI. If Nvidia manages to keep the price stable and expand the ecosystem, this little black block could be as iconic a milestone as the first DGX-1 was all those years ago.

