Select your [[ build.model.nick ]] model

Choose the price, not the parts. Each model is built with the GPUs, CPU, RAM, and storage that maximizes Deep Learning performance per dollar.

[[ build.model.image_alt ]]

Basic

OS Ubuntu 16.04
GPUs 8x NVIDIA 1080 Ti
CPU 2x Intel Xeon E5-2650 v4
RAM 128 GB DDR4 ECC
STORAGE 1 TB SATA SSD (OS install)
EXTRA 2 TB HDD
NETWORK 10 Gbps ethernet

Premium

OS Ubuntu 16.04
GPUs 8x NVIDIA 1080 Ti
CPU 2x Intel Xeon E5-2650 v4
RAM 256 GB DDR4 ECC
STORAGE 2 TB SATA SSD (OS install)
EXTRA 4 TB RAID 5 array (3x HDDs)
NETWORK 10 Gbps ethernet

Max

OS Ubuntu 16.04
GPUs 8x NVIDIA Titan V
CPU 2x Intel Xeon E5-2650 v4
RAM 256 GB DDR4 ECC
STORAGE 2 TB NVMe SSD (OS install)
EXTRA 4 TB RAID 5 array (3x HDDs)
NETWORK 10 Gbps ethernet

Customize

Not seeing what you want?

Add a Protection Plan

3-year protection for [[ getWarrantyPrice(true) ]]

[[ getSubtotal(true)]]
Talk to an engineer
(650) 479-5530

Who bought a Basic?

About the Basic

GPUs

GPUs are the most critical piece of hardware for Deep Learning. The Basic has 8x NVIDIA GTX 1080 Ti GPUs (Pascal Architecture). For Deep Learning in 2018, the 1080 Ti offers the best price/performance trade-off of any GPU on the market. Each 1080 Ti has 11.3 TFLOPs of FP32 performance (the standard precision for Deep Learning training). For most tasks, the 1080 Ti is 95% as fast as the NVIDIA Titan Xp and 70% as fast as the NVIDIA Titan V.

Processor

During training, the CPUs preprocess data and feed it to the GPUs. Slow processors will cause the GPUs to waste cycles waiting for this data. Core count and PCI-e lane count are important CPU performance factors. More cores mean faster data preprocessing; more PCI-e lanes mean faster transmission of that data to the GPUs. The Basic has two Intel Xeon E5-2650 v4 CPUs (12 cores and 40 PCI-e lanes each). With 24 cores for 8 GPUs, its core-to-GPU ratio is 3, which exceeds the best practice of at least 1 CPU core per GPU. The Basic's CPUs, combined with its PLX-enabled motherboard, provide 16 PCI-e lanes to each GPU (the maximum possible).
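The core-to-GPU arithmetic above can be checked with a few lines of Python. This is only an illustrative sketch; the function name `cores_per_gpu` is hypothetical and not part of any tool shipped with the machine.

```python
def cores_per_gpu(cpu_count: int, cores_per_cpu: int, gpu_count: int) -> float:
    """Return the number of physical CPU cores available per GPU."""
    if gpu_count <= 0:
        raise ValueError("gpu_count must be positive")
    return cpu_count * cores_per_cpu / gpu_count


# Figures from the spec list: 2x Xeon E5-2650 v4 (12 cores each), 8 GPUs.
ratio = cores_per_gpu(cpu_count=2, cores_per_cpu=12, gpu_count=8)
print(ratio)        # 3.0
print(ratio >= 1)   # True: meets the "at least 1 core per GPU" guideline
```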

Motherboard

A motherboard's PCI-e topology significantly impacts Deep Learning performance. PCI-e lanes are data pipes that enable communication among the GPUs and CPUs. The number of PCI-e lanes attached to a given device can range from 1 to 16. More lanes are better: for example, a device with 16 PCI-e lanes can send data faster than a device with 4. When training a neural net, the GPUs and CPUs send huge amounts of data to each other. To ensure speedy communication, the Basic's motherboard provides each GPU with 16 PCI-e lanes, the most of any motherboard as of 2018.

RAM

A Deep Learning computer should have at least as much RAM as GPU memory. For example, a machine with 2x NVIDIA 1080 Ti GPUs should have at least 22 GB of memory (1080 Tis have 11 GB of memory each). The Basic has 8x 1080 Ti GPUs and 128 GB of DDR4 2400 MHz memory, so it follows this rule of thumb. If you work with large data sets (e.g. many large images), consider upgrading to Premium, which has 256 GB of memory.
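As a rough sketch, the rule of thumb above (system RAM at least equal to total GPU memory) can be written as a one-line check. The helper name `meets_ram_rule` is made up for illustration.

```python
def meets_ram_rule(system_ram_gb: float, gpu_count: int, gpu_memory_gb: float) -> bool:
    """True if system RAM is at least the combined memory of all GPUs."""
    return system_ram_gb >= gpu_count * gpu_memory_gb


# 8x 1080 Ti (11 GB each) need at least 88 GB of RAM; the Basic has 128 GB.
print(meets_ram_rule(system_ram_gb=128, gpu_count=8, gpu_memory_gb=11))  # True

# A hypothetical 2-GPU machine with only 16 GB of RAM falls short of 22 GB.
print(meets_ram_rule(system_ram_gb=16, gpu_count=2, gpu_memory_gb=11))   # False
```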

Storage

A large data set will not completely fit into RAM; some data must remain on storage. During training, data will be repeatedly loaded from storage to RAM. If the storage is slow, the GPUs will waste cycles waiting for data. The Basic has two storage devices: a 1 TB SSD (fast) and a 2 TB HDD (slower). This way, you can keep the current training data on the SSD and move the rest to the HDD. When you're ready to train on different data, just move it to the SSD.

Network

The Basic has 10 Gbps ethernet. Your ISP will almost certainly be the bottleneck. The main benefit of 10 Gbps ethernet (as opposed to the standard 1 Gbps) is fast file transfers between the computers on your network. Multi-node distributed training requires at least 40 Gbps (Infiniband territory).
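To get a feel for the difference between 1 Gbps and 10 Gbps on a local network, here is a back-of-the-envelope estimate. It assumes an ideal link running at full line rate with no protocol or disk overhead, so real transfers will be slower; the 100 GB data set size and the function name `transfer_seconds` are illustrative assumptions.

```python
def transfer_seconds(size_gb: float, link_gbps: float) -> float:
    """Ideal transfer time in seconds for size_gb gigabytes over a link_gbps link.

    Uses decimal units (1 GB = 8 gigabits) and ignores all overhead.
    """
    if link_gbps <= 0:
        raise ValueError("link_gbps must be positive")
    return size_gb * 8 / link_gbps


# Hypothetical 100 GB data set copied between two machines on the same network.
print(transfer_seconds(100, 1))   # 800.0 seconds at 1 Gbps
print(transfer_seconds(100, 10))  # 80.0 seconds at 10 Gbps
```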

Who bought a Premium?

About the Premium

GPUs

GPUs are the most critical piece of hardware for Deep Learning. The Premium has 8x NVIDIA GTX 1080 Ti GPUs (Pascal Architecture). For Deep Learning in 2018, the 1080 Ti offers the best price/performance trade-off of any GPU on the market. Each 1080 Ti has 11.3 TFLOPs of FP32 performance (the standard precision for Deep Learning training). For most tasks, the 1080 Ti is 95% as fast as the NVIDIA Titan Xp and 70% as fast as the NVIDIA Titan V.

Processor

During training, the CPUs preprocess data and feed it to the GPUs. Slow processors will cause the GPUs to waste cycles waiting for this data. Core count and PCI-e lane count are important CPU performance factors. More cores mean faster data preprocessing; more PCI-e lanes mean faster transmission of that data to the GPUs. The Premium has two Intel Xeon E5-2650 v4 CPUs (12 cores and 40 PCI-e lanes each). With 24 cores for 8 GPUs, its core-to-GPU ratio is 3, which exceeds the best practice of at least 1 CPU core per GPU. The Premium's CPUs, combined with its PLX-enabled motherboard, provide 16 PCI-e lanes to each GPU (the maximum possible).

Motherboard

A motherboard's PCI-e topology significantly impacts Deep Learning performance. PCI-e lanes are data pipes that enable communication among the GPUs and CPUs. The number of PCI-e lanes attached to a given device can range from 1 to 16. More lanes are better: for example, a device with 16 PCI-e lanes can send data faster than a device with 4. When training a neural net, the GPUs and CPUs send huge amounts of data to each other. To ensure speedy communication, the Premium's motherboard provides each GPU with 16 PCI-e lanes, the most of any motherboard as of 2018.

RAM

A Deep Learning computer should have at least as much RAM as GPU memory. For example, a machine with 2x NVIDIA 1080 Ti GPUs should have at least 22 GB of memory (1080 Tis have 11 GB of memory each). The Premium has 8x 1080 Ti GPUs and 256 GB of DDR4 2400 MHz memory, so it follows this rule of thumb. For large data sets (e.g. many large images), 256 GB of memory is the standard choice.

Storage

A large data set will not completely fit into RAM; some data must remain on storage. During training, data will be repeatedly loaded from storage to RAM. If the storage is slow, the GPUs will waste cycles waiting for data. The Premium has two storage devices: a 2 TB SSD on which the OS is installed and a 4 TB RAID 5 array (3x 2 TB HDDs). RAID 5 provides an excellent trade-off between performance and data security. Specifically, it provides 2x the read speed of an individual HDD and tolerates the failure of one drive.
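The 4 TB figure follows from how RAID 5 works: one drive's worth of space is used for parity, so usable capacity is (number of drives - 1) x drive size. A minimal sketch of that arithmetic, with a hypothetical helper name:

```python
def raid5_usable_tb(drive_count: int, drive_size_tb: float) -> float:
    """Usable capacity of a RAID 5 array built from equally sized drives.

    RAID 5 spends the equivalent of one drive on parity, which is also why
    the array survives the loss of exactly one drive.
    """
    if drive_count < 3:
        raise ValueError("RAID 5 needs at least 3 drives")
    return (drive_count - 1) * drive_size_tb


# 3x 2 TB HDDs, as in the spec list above.
print(raid5_usable_tb(drive_count=3, drive_size_tb=2.0))  # 4.0
```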

Network

The Premium has 10 Gbps ethernet. Your ISP will almost certainly be the bottleneck. The main benefit of 10 Gbps ethernet (as opposed to the standard 1 Gbps) is fast file transfers between the computers on your network. Multi-node distributed training requires at least 40 Gbps (Infiniband territory).

Who bought a Max?

About the Max

GPUs

GPUs are the most critical piece of hardware for Deep Learning. The Max has 8x NVIDIA Titan V GPUs (Volta Architecture). The Titan V is powered by the same chip as the NVIDIA Tesla V100. Each Titan V has 13.8 TFLOPs of FP32 performance, the standard precision for Deep Learning training. For most tasks, the Titan V is about 42% faster than the 1080 Ti and 40% faster than the Titan Xp.

Processor

During training, the CPUs preprocess data and feed it to the GPUs. Slow processors will cause the GPUs to waste cycles waiting for this data. Core count and PCI-e lane count are important CPU performance factors. More cores mean faster data preprocessing; more PCI-e lanes mean faster transmission of that data to the GPUs. The Max has two Intel Xeon E5-2650 v4 CPUs (12 cores and 40 PCI-e lanes each). With 24 cores for 8 GPUs, its core-to-GPU ratio is 3, which exceeds the best practice of at least 1 CPU core per GPU. The Max's CPUs, combined with its PLX-enabled motherboard, provide 16 PCI-e lanes to each GPU (the maximum possible).

Motherboard

A motherboard's PCI-e topology significantly impacts Deep Learning performance. PCI-e lanes are data pipes that enable communication among the GPUs and CPUs. The number of PCI-e lanes attached to a given device can range from 1 to 16. More lanes are better: for example, a device with 16 PCI-e lanes can send data faster than a device with 4. When training a neural net, the GPUs and CPUs send huge amounts of data to each other. To ensure speedy communication, the Max's motherboard provides each GPU with 16 PCI-e lanes, the most of any motherboard as of 2018.

RAM

A Deep Learning computer should have at least as much RAM as GPU memory. For example, a machine with 2x NVIDIA 1080 Ti GPUs should have at least 22 GB of memory (1080 Tis have 11 GB of memory each). The Max has 8x NVIDIA Titan V GPUs (12 GB of memory each, 96 GB in total) and 256 GB of DDR4 2400 MHz memory, so it follows this rule of thumb. For large data sets (e.g. many large images), 256 GB of memory is the standard choice.

Storage

A large data set will not completely fit into RAM; some data must remain on storage. During training, data will be repeatedly loaded from storage to RAM. If the storage is slow, the GPUs will waste cycles waiting for data. The Max has two storage devices: a 2 TB NVMe SSD (Intel SSD DC P4600) on which the OS is installed and a 4 TB RAID 5 array (3x 2 TB HDDs). RAID 5 provides an excellent trade-off between performance and data security. Specifically, it provides 2x the read speed of an individual HDD and tolerates the failure of one drive.

Network

The Max has 10 Gbps ethernet. Your ISP will almost certainly be the bottleneck. The main benefit of 10 Gbps ethernet (as opposed to the standard 1 Gbps) is fast file transfers between the computers on your network. Multi-node distributed training requires at least 40 Gbps (Infiniband territory).

[[ component.name ]]

[[ option.description ]] [[ build.getPriceDiff(component, option)]]