Each second-generation TPU is said to offer up to 180 TFLOPS of processing power, and a custom high-speed network allows them to be assembled into so-called ‘pods’. Each pod contains 64 TPUs, providing up to 11.5 PFLOPS.
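As a quick sanity check, the pod figure follows directly from the per-chip number. The short Python sketch below reproduces the arithmetic, using only the figures quoted above (which are peak rather than sustained throughput):

    # Back-of-the-envelope check of the quoted figures (peak, not sustained).
    TFLOPS_PER_TPU = 180   # quoted peak throughput per second-generation TPU
    TPUS_PER_POD = 64      # quoted number of TPUs per pod

    pod_pflops = TFLOPS_PER_TPU * TPUS_PER_POD / 1000  # TFLOPS -> PFLOPS
    print(f"Peak pod throughput: {pod_pflops:.2f} PFLOPS")  # 11.52, quoted as 11.5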
In a post on its website, Google noted: ‘While our first TPU was designed to run machine learning models quickly … training a machine is more difficult than running it, and days or weeks of computation on the best available CPUs and GPUs are commonly required to reach state-of-the-art levels of accuracy’.
It added that a large-scale translation model which previously took a full day to train on 32 commercially available GPUs can now be trained to the same accuracy in an afternoon using just one-eighth of a TPU pod.
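Taking the quoted figures at face value, one-eighth of a pod is still a substantial amount of hardware; a rough sketch of the TPU side of that comparison (the article gives no per-GPU figure, so only the TPU throughput is computed):

    # Rough illustration using only the article's quoted peak figures.
    TFLOPS_PER_TPU = 180
    TPUS_PER_POD = 64

    eighth_pod_tpus = TPUS_PER_POD // 8                           # 8 TPUs
    eighth_pod_pflops = eighth_pod_tpus * TFLOPS_PER_TPU / 1000   # -> PFLOPS
    print(f"{eighth_pod_tpus} TPUs ~ {eighth_pod_pflops:.2f} PFLOPS peak")  # 8 TPUs ~ 1.44 PFLOPS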
Noting that many researchers don’t have access to such computing resources, Google is making a cluster of 1,000 Cloud TPUs available at no cost to machine learning researchers via the TensorFlow Research Cloud, essentially providing on-demand supercomputing.
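For readers curious what using such a resource looks like in practice, below is a minimal, hypothetical sketch of how a TensorFlow program connects to a Cloud TPU using today's tf.distribute API (which postdates this announcement); the TPU address shown is a placeholder, not anything specific to the Research Cloud itself:

    import tensorflow as tf

    # Placeholder address: in practice the TPU's gRPC endpoint is supplied
    # by the hosting environment (e.g. an environment variable or the
    # Cloud console), not hard-coded like this.
    resolver = tf.distribute.cluster_resolver.TPUClusterResolver(
        tpu='grpc://10.0.0.2:8470')
    tf.config.experimental_connect_to_cluster(resolver)
    tf.tpu.experimental.initialize_tpu_system(resolver)
    strategy = tf.distribute.TPUStrategy(resolver)

    # Anything built under the strategy scope is replicated across the
    # TPU's cores, so an existing Keras model needs little modification.
    with strategy.scope():
        model = tf.keras.Sequential([tf.keras.layers.Dense(10)])
        model.compile(optimizer='adam', loss='mse')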