Graphcore is far ahead of Nvidia in various AI benchmarks, and at a similar power draw and system price.
Colossus MK2 GC200
Graphcore has announced that the Colossus MK2 GC200 is now shipping and, to coincide with its availability, has published a range of benchmarks for the AI accelerator. According to these, the Colossus chip easily overtakes Nvidia's A100 in artificial intelligence workloads, in some cases by several times.
For some of the measurements, Graphcore uses the IPU-Pod64, which holds 16 1U blades, each equipped with four Colossus MK2 GC200 chips. Nvidia's DGX-A100 serves as the comparison, as a single or dual system, and in its current version with A100 accelerators carrying 80 GByte of video memory rather than 40 GByte. Most benchmarks, however, pit a single Colossus MK2 GC200 against a single A100 accelerator.
For natural language processing with BERT, throughput is said to be 5.3 times higher in training and 3.4 times higher in inferencing. In computer vision with Resnet-50, Graphcore claims a lead of 2.6 times in training and 4.6 times in inferencing. Image classification with Resnext-101 also runs significantly faster according to Graphcore: the factor is 3.7 in training, and inferencing is said to reach 40 times the throughput at only a tenth of the latency.
Colossus MK2 GC200 versus Nvidia's A100 (Picture: Graphcore)
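To put the quoted factors side by side, the following minimal Python sketch ranks them from largest to smallest. The numbers are Graphcore's own claims as reported above, not independent measurements, and the name claimed_speedup is purely illustrative.

```python
# Throughput factors (GC200 vs. A100) as claimed by Graphcore in the article.
# These are vendor-published figures, not independently verified results.
claimed_speedup = {
    ("BERT", "training"): 5.3,
    ("BERT", "inference"): 3.4,
    ("Resnet-50", "training"): 2.6,
    ("Resnet-50", "inference"): 4.6,
    ("Resnext-101", "training"): 3.7,
    ("Resnext-101", "inference"): 40.0,
}

# Sort from the largest claimed advantage to the smallest.
ranked = sorted(claimed_speedup.items(), key=lambda kv: kv[1], reverse=True)

for (model, phase), factor in ranked:
    print(f"{model:<12} {phase:<10} {factor:>5.1f}x")
```

The ranking makes the spread visible: the Resnext-101 inferencing figure is an outlier, while the remaining factors all lie between 2.6 and 5.3.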
At this point, however, it should be mentioned that the Colossus MK2 GC200 is a gigantic chip at 823 mm², with an enormous 900 MByte of SRAM for keeping data local; by comparison, the 48 MByte of L2 cache in Nvidia's A100 looks almost tiny. With 59.4 billion versus 54 billion transistors and both built on TSMC's N7 process, the two designs are similar in scale, but their internal architecture differs drastically, and with it their performance. Nvidia's A100 likely wins certain AI benchmarks as well, but Graphcore does not show those.
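The size comparison can be worked out from the figures in the text alone. The short Python sketch below does so; the variable names are illustrative, and the memory ratio sets SRAM against L2 cache exactly as the article does, although the two play different roles in each architecture.

```python
# Chip figures as quoted in the article (no additional data sources).
gc200 = {"on_chip_mbyte": 900, "transistors_bn": 59.4}
a100 = {"on_chip_mbyte": 48, "transistors_bn": 54.0}

# How much more on-chip memory and how many more transistors the GC200 has.
memory_ratio = gc200["on_chip_mbyte"] / a100["on_chip_mbyte"]
transistor_ratio = gc200["transistors_bn"] / a100["transistors_bn"]

print(f"On-chip memory: {memory_ratio:.2f}x")
print(f"Transistors:    {transistor_ratio:.2f}x")
```

Under these numbers the GC200 has close to 19 times the on-chip memory with only about 10 percent more transistors, which illustrates how differently the two chips spend a similar transistor budget.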