When I was young, the word ‘supercomputer’ evoked images of powerful, intelligent systems, filling the cavities of mountains with their humming electronic menace.
Science fiction encouraged this view, which is as far from the (still impressive, yet grounded) reality of supercomputing as the Earth is from some distant galaxy. The distance between marketing hype and actually existing machines is like that: vast and unbridgeable, except in dreams.
Which brings me to this Verge story, posted on 24 January 2022:
Social media conglomerate Meta is the latest tech company to build an “AI supercomputer” — a high-speed computer designed specifically to train machine learning systems. The company says its new AI Research SuperCluster, or RSC, is already among the fastest machines of its type and, when complete in mid-2022, will be the world’s fastest.
“Meta has developed what we believe is the world’s fastest AI supercomputer,” said Meta CEO Mark Zuckerberg in a statement. “We’re calling it RSC for AI Research SuperCluster and it’ll be complete later this year.”
Verge: https://www.theverge.com/2022/1/24/22898651/meta-artificial-intelligence-ai-supercomputer-rsc-2022
The phrase “AI supercomputer” is obviously designed to sell the idea that this supercomputer, unlike others, is optimized for AI. And to give the devil his due, the fact that it’s reportedly built from NVIDIA graphics processing units (GPUs), which since the mid-2000s have found extensive use in general-purpose computation, including tasks such as training large language models, lends the claim some credibility.
Some, but not as much as it might seem. Consider this hyperventilating article:
This is clearly the tone Meta (and others) are hoping to cultivate by using ‘AI supercomputer’ as a descriptor. The assumption is that if enough computational power is thrown at the task of building machine learning models, those models will, in some not sharply defined way, reach unprecedented heights of…well, one isn’t sure.
Are ever-larger machine learning models a sure indicator of remarkable progress? Two papers, “On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?” and “No News is Good News: A Critique of the One Billion Word Benchmark”, suggest the answer is no. These papers focus on Natural Language Processing (NLP) models, whereas Meta will reportedly be building models for its ‘Metaverse’ effort, a warmed-over Second Life. Even so, there appears to be a point at which ever-larger models fail to produce the hoped-for results.
Supercomputers: Our Old Drinking Buddies
The category ‘supercomputer’, created to describe a class of tightly integrated, high-performance computational platforms, has existed for over 60 years. The first supercomputers were developed for nuclear research (weapons and energy) at Lawrence Livermore National Laboratory in the US at the height of the Cold War (maybe we should call it Cold War Classic). They have since been applied to other demanding tasks, such as modeling the Earth’s climate. It’s a venerable technology with clearly defined parameters such as the use of symmetric multiprocessing. In all these decades, no supercomputer has managed to exhibit intelligence or plot our demise, except in fiction.
Adding ‘AI’ to the mix doesn’t change that reality, since ever-larger statistical pattern-matching techniques do not cognition make. Oh, and Meta’s claim is that these types of supercomputing data centers will, in addition to serving as development platforms, also host the haunted cartoon castle they call the ‘Metaverse’.
Considering this statement from Intel, we have reason to doubt this too.