So, what are all these XPU things? And what is a chip supposed to do when it comes to AI?
One of the key drivers of this latest wave of AI has been the hardware these models can now run on. None of the three types of chips is technically new, but they have all played a crucial part in this new age, and I think it’s important to highlight what they are and how they work.
So, what are they?
The CPU (Central Processing Unit) is the brain of any modern device: your laptop, phone, and even your smartwatch all have one. It handles instructions from your apps and operating system, performing calculations, making decisions, and controlling how data moves between different parts of your device. Without the CPU, nothing would work; it’s the essential engine behind every digital task.
Now, the GPU (Graphics Processing Unit) is a chip whose original primary use was to help your laptop or device render (create) visuals. The “front-end” of your experience with your devices is rendered by the GPU; the logic and information behind it isn’t. GPUs come in different levels of power, and a more powerful GPU can render better graphics. This is why a lot of gamers focus on getting a good one: it makes their games look much better.
Ok, when do I use what?
So, before we get into TPUs, I’d like to create an analogy to differentiate between when to use a CPU and when to use a GPU, because they serve different purposes.

<aside> 💡
CPUs are like professors: they excel at solving complex problems sequentially, one task at a time. Think of it like this: when you open an app or run a program, the CPU carefully works through the instructions one by one, just like a professor solving a complex problem step by step.
</aside>
<aside> 💡
GPUs are like hungover students: they shine when there are many simple tasks that can be solved in parallel. This is exactly why GPUs were originally used for graphics: rendering images means calculating the color and brightness of millions of pixels at once, a huge number of small tasks, done in parallel.
</aside>
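To make the professor-versus-students analogy concrete, here’s a toy sketch in Python. It uses a plain loop to stand in for the CPU’s one-at-a-time style and NumPy’s vectorized operations as a stand-in for GPU-style parallelism (NumPy doesn’t actually run on your GPU, but the “same tiny operation applied to everything at once” shape is the same idea). The function names and the brightening example are made up for illustration.

```python
import numpy as np

# A toy "image": one brightness value per pixel.
pixels = np.arange(1_000_000, dtype=np.float64)

# CPU-style: walk through the pixels one at a time, like a professor.
def brighten_sequentially(values, amount):
    result = []
    for v in values:          # one small task after another
        result.append(v + amount)
    return np.array(result)

# GPU-style: apply the same tiny operation to every pixel in one step,
# like handing each hungover student a single pixel.
def brighten_in_parallel(values, amount):
    return values + amount    # one vectorized step over all pixels

# Both give the same answer; only the execution style differs.
same = np.array_equal(
    brighten_sequentially(pixels[:10], 5),
    brighten_in_parallel(pixels[:10], 5),
)
```

The work itself is trivial, adding a number to a pixel, but there are a million pixels. That’s exactly the workload shape GPUs were built for.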
It’s important to note that AI models not only need to be trained but also need to run every time you ask them something. These two phases are called training and inference.
When you train a model, it builds a complex network of nodes and assigns each connection a weight. These weights determine how much influence a particular input has on the model’s prediction. Training involves making small adjustments to these weights, running calculations to see whether those changes improve the model’s performance, and repeating that process millions (or billions) of times.
Sound familiar?
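That adjust-and-check loop can be sketched in a few lines. This is a heavily simplified toy (one weight, one input, and a made-up `train` function), not how real models are trained, but it shows the shape of the process: run a calculation, measure the error, nudge the weight, repeat.

```python
# A toy training loop: one weight, nudged repeatedly so that
# prediction = w * x moves toward the target output.
def train(x, target, steps=1000, lr=0.01):
    w = 0.0                        # the connection's weight
    for _ in range(steps):         # millions of times in a real model
        prediction = w * x         # run the calculation
        error = prediction - target
        w -= lr * error * x        # small adjustment to the weight
    return w

w = train(x=2.0, target=10.0)
# After training, w * 2.0 lands very close to the target of 10.0.
```

A real model does this for billions of weights at once, and every one of those small adjustments is the same kind of tiny, repetitive calculation. That’s why this workload, like rendering pixels, is a perfect fit for hardware that does many simple things in parallel.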