Ampere A100 - A Leap in Data Center GPUs
Mark Bauer 06/12/2020
Amongst the leaders in computing and hardware is NVIDIA, best known for its widely adopted gaming graphics cards, the tried-and-true choice in a majority of high-end gaming systems. Although gaming accounts for most of the company's fame, NVIDIA also builds powerful data center hardware that continues to expand the capabilities of processes like deep learning, powering client workloads such as visualizing gravity in real time. Tasks of this nature require extreme supercomputing power not only to complete at all, but to finish in a timely manner.
This is a sentiment that NVIDIA understands to the utmost degree: its recently announced Ampere-generation A100 data center hardware commands astronomical levels of computational power. NVIDIA sums up the breakthrough as follows:
“Can efficiently scale to thousands of GPUs, with NVIDIA multi-instance GPU technology, or be partitioned into seven GPU instances to accelerate workloads of all sizes. Third-generation Tensor Cores accelerate every precision for diverse workloads, speeding time to insight and time to market.”
So, what does that actually mean? There are a few key terms in there worth delving into: Tensor Cores and Multi-Instance GPUs.
Improved Computing and Tensor Cores (It’s fast)
The previous generation of Tensor Cores, introduced with the Turing architecture, was a massive breakthrough, cutting processing times in half on many of the projects it was fed. The new Ampere architecture takes a major leap beyond Turing: it more than doubles high-performance computing capability, alongside a 6-fold and 7-fold increase in deep learning training and inference performance, respectively.
On a more basic level, artificial intelligence and deep learning processes are highly complex and are carried out in two stages: training and inference. The training stage is facilitated by the user, as the machine is given examples of inputs and desired outputs. After training, the machine is left to perform inference, during which it uses its "training" from the previous step to process unfamiliar data. As stated above, these processes now boast 6 and 7 times the performance, potentially cutting task completion from weeks to days, days to hours, and so on.
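To make the two stages concrete, here is a minimal sketch in plain Python. A toy one-variable linear model stands in for a deep network (real deep learning uses frameworks and GPUs, of course): training fits parameters from labeled examples, and inference applies those fitted parameters to input the model has never seen.

```python
def train(xs, ys, lr=0.01, epochs=2000):
    """Training: learn weight w and bias b from example input/output pairs."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradient of mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def infer(w, b, x):
    """Inference: apply the trained parameters to unfamiliar data."""
    return w * x + b

# Training data follows y = 3x + 1.
w, b = train([0.0, 1.0, 2.0, 3.0], [1.0, 4.0, 7.0, 10.0])
print(infer(w, b, 10.0))  # a new input the model never saw during training
```

The 6x/7x Ampere figures apply to the heavy numerical work in each stage; the structure of the two stages is exactly as above, just at vastly larger scale.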
Multi-Instance GPU (It's flexible)
Previously, running multiple instances on a single GPU was impractical, as concurrent tasks would inevitably compete for the same pool of resources. Ampere kicks partitioning into overdrive: a single A100 can be split into seven separate instances, each capable of running its own set of tasks without siphoning resources from another instance.
This is pivotal, as workloads can be matched to the number of instances they require while more users share a single GPU. Think of a family of 7 whose only method of transportation is a Greyhound bus. With Ampere, the family is instantly given 7 different cars, so each member can go where they want, when they want, without interfering with one another. This allows for more potential colocation clients at a more efficient usage and pricing point. In conjunction with the speed increases mentioned above, the Ampere upgrade is a clear win all around.
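In practice, partitioning is driven by NVIDIA's `nvidia-smi` administration tool. A rough sketch of carving an A100 into seven instances might look like the following (this is illustrative; the available profile names vary by product, and `1g.5gb` is the smallest profile on the 40 GB A100):

```shell
# Enable MIG mode on GPU 0 (requires admin privileges and a GPU reset).
sudo nvidia-smi -i 0 -mig 1

# List the GPU instance profiles this card supports.
sudo nvidia-smi mig -lgip

# Create seven 1g.5gb GPU instances, each with a compute instance (-C).
sudo nvidia-smi mig -cgi 1g.5gb,1g.5gb,1g.5gb,1g.5gb,1g.5gb,1g.5gb,1g.5gb -C

# Verify the resulting partitions.
sudo nvidia-smi mig -lgi
```

Each instance then appears to workloads as its own GPU with dedicated memory and compute, which is what keeps the seven "cars" from competing for the bus.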
Get Expert Data Center Consulting
Call us Today