The world of tech has always run rife with direct competition. Whether it’s Apple vs. Microsoft, Betamax vs. VHS, Nintendo vs. Sega or Google vs. Yahoo, wherever there is innovation there are always heads butting. Indeed, it could be argued that it’s this competition that catalyses innovation and cultivates real lasting progress.
In the case of processors, the duelling sector leaders have always been Intel and AMD and it’s a rivalry with some serious history; a history that’s still being written. But whilst there has been plenty of discussion on the benefits and drawbacks of each manufacturer’s processors and chipsets in terms of gaming performance, there has been precious little discussion on how they both fare in artificial intelligence (AI) and deep learning (DL) applications.
Of course, there are a wealth of differences and similarities to pick through, some abundantly obvious and some a little less so. To gain some wider perspective, we’ll be focusing on the fundamentals in an effort to provide a balanced argument and a helpful guide for those still on the fence about which team to throw down their chips on.
Saving costs is always a major concern, particularly for R&D departments that might be putting together multiple rigs on a svelte budget. That’s why AMD has become so popular with everyone, from gamers to researchers, in recent years.
Traditionally, AMD has been seen as the underdog in this story and, as is often the case with underdogs, its products have generally been priced more affordably. This has led some corners of the market to believe that Intel chips are better simply because they’re more expensive, which is certainly not the case. At least, not anymore.
Whilst AMD chips might initially have been priced to match their lower performance when compared to Intel, in recent years AMD has caught up with (and, some might argue, overtaken) its closest competitor in the value-for-money stakes. Currently, when examining the price and performance of high-end CPUs, AMD Threadripper-based chips offer the greatest bang for your buck, at least in terms of sheer power.
Deep learning workstations require bandwidth, and lots of it, so one of the primary concerns when choosing your CPU is the number of PCIe lanes that are on offer. The PCIe lanes on your CPU are primarily assigned to your graphics cards (GPUs), and each graphics card will require 16 PCIe lanes, otherwise known as x16, to run at 'full-speed'.
GPUs still do the majority of the heavy lifting within AI/DL workloads, so deep learning workstations generally require a CPU with at least 40 PCIe lanes, as this will give more than enough bandwidth to run at least 2 GPUs simultaneously.
'Consumer-grade' CPUs, such as Intel's Core range or AMD's Ryzen chips, will only offer you 16 PCIe lanes, so you really need to look at Intel's Xeon lineup, which offers 40-64 lanes, or, if 64 lanes isn’t enough for you, AMD's Threadripper or EPYC ranges, which provide up to 88 and 128 PCIe 4.0 lanes respectively (with PCIe 4.0 offering double the per-lane bandwidth of PCIe 3.0).
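To make the lane arithmetic concrete, here is a rough sketch in Python. The lane counts are the ones quoted above, used as illustrative figures rather than per-model data. The per-lane bandwidth values are the commonly cited approximations for PCIe 3.0 and 4.0 (per direction, after encoding overhead). The function names and the `reserved_lanes` parameter are our own inventions for this example.

```python
# Approximate usable bandwidth per lane, per direction, in GB/s.
# Commonly quoted figures; real-world throughput will be lower.
GBPS_PER_LANE = {"3.0": 0.985, "4.0": 1.969}

LANES_PER_GPU = 16  # a full x16 link, as described above


def max_full_speed_gpus(cpu_lanes: int, reserved_lanes: int = 0) -> int:
    """Number of GPUs that can each get a full x16 link.

    reserved_lanes is a hypothetical allowance for NVMe drives,
    network cards and so on; it defaults to none.
    """
    return max(cpu_lanes - reserved_lanes, 0) // LANES_PER_GPU


def link_bandwidth_gbps(pcie_gen: str, lanes: int = LANES_PER_GPU) -> float:
    """Approximate one-way bandwidth of a link in GB/s."""
    return GBPS_PER_LANE[pcie_gen] * lanes


# Lane counts taken from the text above (illustrative only).
lane_counts = {
    "Consumer-grade": 16,
    "Workstation minimum": 40,
    "Xeon (upper end)": 64,
    "Threadripper": 88,
    "EPYC": 128,
}

for name, lanes in lane_counts.items():
    print(f"{name}: {lanes} lanes -> {max_full_speed_gpus(lanes)} GPU(s) at x16")

for gen in ("3.0", "4.0"):
    print(f"PCIe {gen} x16: ~{link_bandwidth_gbps(gen):.1f} GB/s per direction")
```

In practice some lanes are always taken up by storage and networking, which is why a 40-lane CPU is a comfortable fit for two GPUs rather than a tight one.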
The number of cores is also an important consideration, as AI and DL are incredibly complicated and power-hungry workloads that require enough cores to pre-process datasets and run multiple applications and containers simultaneously.
AMD’s Threadripper packs up to a whopping 64 cores and 128 threads, which is beyond anything offered by Intel outside of the server space - it’s obvious why so many serious R&D divisions are turning to AMD over the more established ‘tentpole’ brand.
AMD’s prevalence in this sector has just been confirmed by NVIDIA, which recently chose AMD (its own major rival in the gaming sector) over Intel to provide the processors for its new DGX A100 deep learning system; specifically, AMD's EPYC server processors. Intel’s competitor (the Xeon) includes a much-heralded DL Boost feature, which drastically speeds up DL inference operations. But in terms of sheer power, AMD’s EPYC has them beat hands down.
Memory channels and disk I/O bandwidth are also considerations here – particularly the former, as RAM remains the cheapest way of maximising your processor's potential.
Of course, in recent weeks, it could be argued that the game has been changed by the introduction of Intel’s new “Cooper Lake” 3rd Gen Intel Xeon Scalable processors, which have been designed from the ground up for AI and analytics workloads running in data centres, network and intelligent-edge environments.
The company boasts that these are the first chips with built-in ‘bfloat16’ support, which is a compact numeric format that uses half the bits of today’s FP32 format. In theory, halving the bits per value should double throughput, as there is half as much data to move and process. But these are early days and these chips have yet to be really put through their paces, so, right now, it’s all speculation and educated guesswork.
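For a feel of what "half the bits" means, the sketch below converts a value to bfloat16 and back using only Python's standard library. bfloat16 keeps FP32's sign bit and 8 exponent bits but only the top 7 mantissa bits, so it covers the same range with less precision. One simplification to flag: this version simply truncates the lower 16 bits, whereas real hardware typically rounds to nearest. The function names are ours, not part of any Intel API.

```python
import struct


def float32_to_bfloat16_bits(x: float) -> int:
    """Return the 16-bit bfloat16 pattern for x.

    Simplification: the lower 16 bits of the float32 encoding are
    truncated; hardware implementations usually round instead.
    """
    (bits32,) = struct.unpack(">I", struct.pack(">f", x))
    return bits32 >> 16


def bfloat16_bits_to_float(bits16: int) -> float:
    """Expand a 16-bit bfloat16 pattern back into a Python float."""
    (value,) = struct.unpack(">f", struct.pack(">I", (bits16 & 0xFFFF) << 16))
    return value


def round_trip(x: float) -> float:
    """Value of x after being squeezed through bfloat16."""
    return bfloat16_bits_to_float(float32_to_bfloat16_bits(x))


for sample in (1.0, 3.14159, 1e38):
    print(f"{sample!r} -> 0x{float32_to_bfloat16_bits(sample):04X} -> {round_trip(sample)!r}")
```

Powers of two such as 1.0 survive unchanged, while a value like 3.14159 loses its finer digits. Very large values still fit, which is the trade-off bfloat16 makes: FP32's range at roughly two to three significant decimal digits.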
In the same breath, however, Intel suffered a major blow, as it was also revealed that Apple would cease utilising Intel processors in its computers and would instead use its own ARM-based chips; the same core designs used by its iPhones and iPads. It would appear this is a move made to consolidate the Apple portfolio, with all systems running on the same basic architecture. From an AI/DL perspective, this is significant, as Apple’s A12 “Bionic” chip includes a so-called “Neural Engine,” which is specifically dedicated to machine learning tasks. For the foreseeable future, however, we imagine most researchers would prefer to stick with the devils they know.
More cores and more power are always better, right? You might think so, and well, you'd be mostly right, as pre-processing, on the whole, is a massively multi-threaded process. And of course, you need to ensure that the CPU has enough threads to feed the GPUs to prevent any bottlenecks.
There are still certain cases, though, where single-core speed is especially important. When calculations must be completed sequentially, for example, a CPU with fewer cores and a higher frequency is capable of out-performing a CPU with many slower cores.
Logic dictates that Intel is your best bet for a higher clock speed, and that you should go with AMD for a higher core/thread count. But there are so many other factors to consider that it would be foolish to reduce the choice to such a binary concept.
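The trade-off between clock speed and core count can be illustrated with Amdahl's law, a textbook model we are bringing in here rather than anything measured for this article. The two CPUs below are hypothetical, and the model assumes both do the same amount of work per clock cycle, which real Intel and AMD chips do not.

```python
def relative_runtime(serial_fraction: float, cores: int, clock_ghz: float) -> float:
    """Runtime in arbitrary units under Amdahl's law; lower is faster.

    serial_fraction is the share of the work that cannot be spread
    across cores. Assumes identical work per clock on every CPU.
    """
    parallel_fraction = 1.0 - serial_fraction
    return (serial_fraction + parallel_fraction / cores) / clock_ghz


# Two hypothetical CPUs, not real products.
FEW_FAST = {"cores": 8, "clock_ghz": 5.0}
MANY_SLOW = {"cores": 64, "clock_ghz": 3.0}

for label, serial in (("mostly sequential", 0.90), ("mostly parallel", 0.05)):
    few = relative_runtime(serial, **FEW_FAST)
    many = relative_runtime(serial, **MANY_SLOW)
    winner = "fewer, faster cores" if few < many else "more, slower cores"
    print(f"{label}: 8 cores = {few:.3f}, 64 cores = {many:.3f} -> {winner}")
```

With 90% of the work stuck on one core, the 8-core chip finishes well ahead. Once the workload is almost entirely parallel, as most data pre-processing is, the 64-core chip takes the lead.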
One area in which Intel has AMD beat on AI/DL is the software front. The proprietary deep learning software optimisations created by Intel only run on Intel CPUs, which is reason enough for some to avoid AMD altogether. Then there’s the Optane factor: Intel CPUs can use Intel Optane memory, which can hit higher performance than even the fastest SSDs. It’s also worth noting that motherboards built for gaming are generally easier to overclock, and this much is true whether you opt for AMD or Intel.
From a performance perspective, the consensus is that, whilst Intel processors generally contain fewer cores than their AMD counterparts, they provide a slightly stronger single-core performance and for many, that might be enough to swing it. With AMD, however, you’re getting more cores for your money and with many deep learning and AI frameworks requiring a heavier workload from our machines, sometimes raw power is really what’s needed.
So, would we say, in the battle of Intel vs. AMD, that the little guy has won for once? Not necessarily. There are still plenty of applications where Intel is the smarter choice and it remains to be seen how their Cooper Lake processors will perform in the coming months.
Currently, AMD’s Threadripper seems to be on track to offer increasingly powerful CPU options to industry professionals, but whilst playing catch-up, Intel might just overtake its rival once again in the near future. Stranger things have happened in the always surprising world of tech, after all.