Twitter Agonistes

There’s a temptation, if you are, or were, a Twitter user (and perhaps even if you aren’t, since we all must comment on everything, everywhere, all the time now) to have an opinion about that platform and its current state.

For some, it’s a tale of paradise lost, of all yesterday’s parasocial parties, ruined by the jarring arrival of an off-putting, racist weirdo who, while lacking nearly all social skills, demands everyone’s attention. For others, there’s a deeper sense of impending loss; of online communities that were built against the odds and against the objections of a hostile world. And so, we have the agonies and ecstasies of Black Twitter and Philosophy Twitter and Literary Twitter and Trans Twitter and a universe of other groupings which came together (while also remaining open to other communities) as it all seems to be burning down.

Of course, I have my own Twitter story to tell, which involves gaining some small degree of notice for my efforts dissecting the tech industry’s dangerous fantasies from a materialist, and indeed, Marxist perspective. For me, however, the larger concern, or really, observation, is that all of it – the good, the bad and the ugly of Twitter – was built, like so many modern beliefs, upon a foundation of unreality.

What do I mean by unreality? What am I driving at? Here, we must take a detour to the past, borrowing a moment from Edward Gibbon’s Decline and Fall of the Roman Empire (1782):

“To resume, in a few words, the system of the Imperial government; as it was instituted by Augustus, and maintained by those princes who understood their own interest and that of the people, it may be defined an absolute monarchy disguised by the forms of a commonwealth. The masters of the Roman world surrounded their throne with darkness, concealed their irresistible strength, and humbly professed themselves the accountable ministers of the senate, whose supreme decrees they dictated and obeyed.”

This fascinates me – the use of democratic forms to obscure tyranny; an “absolute monarchy disguised by the forms of a commonwealth.” When I think of the tech industry which, until very, very recently, was almost universally hailed as a sun-kissed road to ‘The Future’, that vaguely defined territory always just over the horizon, this potent phrase comes to mind.


When Elon Musk took command of Twitter, arriving at the company’s San Francisco office carrying a sink in a typically poor attempt at humor, we recoiled in keyboard-conveyed horror, waiting for the bad times to come. We all know what happened next: the mass firings of key people in moderation, compliance, software and data center infrastructure, and also, anyone who knows what to do with a bathroom fixture. This sort of anti-worker action, common in most other sectors (though not always quite so haphazard), came as a shock to those in, and observers of, that shiny Mordor, the tech sector’s Silicon Valley heartland (particularly those who forgot, or weren’t around for, the dot-com crash of 2000).

As Musk smashed his way through a complex system and tweeted like the synthesis of an angrily divorced uncle and a 14-year-old manifesto writer, revealing in near real time his unsuitability for the role of CEO (or even to lead a bake sale), some of us thought: if only another, more competent and nicer person took the reins; if only the terrible billionaire, with his Saudi funders and sweaty style of presentation, could be replaced by that most hallowed of modern types, a professional, a good CEO who cared about Twitter as a ‘town square.’

Given the severe limitations of our barbarous era, a time in which we’re told that it’s easier to imagine the end of the world than the end of capitalism, it’s not surprising that our most commonly proposed solution to the problem of bad, even destructive management of a social media platform is its replacement by good management – still within the framework of privately owned companies – that is, a capitalist solution to a capitalist problem.

At the heart of the Musk problem (and the Dorsey problem before it, and the Google problem, and on and on) is the reality that these platforms are not subject to democratic control and are not answerable – except in a crude market-feedback sense – to the needs of the people using them. We cry out for a better CEO, a better billionaire, because the actual solution – that these platforms not be private at all but public utilities we control as citizens, not as consumers – has been purged from our minds as a possibility, let alone a goal (we’ll talk about Mastodon another time).

We have been trained, to borrow once again from Gibbon, to accept “absolute monarchy disguised by the forms of a commonwealth”. The ‘commonwealth’ disguise, in this case, being the idea of a tech industry which, alone amongst capitalist sectors, somehow has our best interests at heart because… well, one isn’t sure; perhaps all the nice words about inclusion, expensive t-shirts, and California sunshine shining down on the forgotten bones of the murdered indigenous population, oil rigs and hidden industrial waste.

Magic is an Industrial Process, Belching Smoke and Fire: On GPUs

AT THE END of ‘The Wizard of Oz’, Metro-Goldwyn-Mayer’s 1939 surrealist musical fantasy, our heroine Dorothy and her loyal comrades complete a long, arduous (but song-filled) journey, finally reaching the fabled city of Oz. In Oz, according to a tunefully stated legend, there’s a wizard who possesses the power to grant any wish, no matter how outlandish. Dorothy, marooned in Oz, only wishes to return home and for her friends to receive their various hearts’ desires.

Who Dares Approach Silicon Valley!

As they cautiously approach the Wizard’s chamber, Dorothy and her friends are met with a display of light, flame and sound; “Who dares!?” a deafening voice demands. It’s quite a show of apparent fury but the illusion crumbles when it’s revealed (by Dorothy’s dog, Toto) that behind it all is a rather ordinary man, hidden on the other side of a velvet curtain, frantically pulling levers and spinning dials to keep the machinery powering the illusion going while shouting, “Pay no attention to that man behind the curtain!”

Behind the appearance of magic, there was a noisy industrial process, belching smoke. Instead of following the Wizard’s advice to pay no attention, let’s pay very close attention indeed to what lies behind appearances.


THERE’S AN INESCAPABLE MATERIALITY behind what’s called ‘AI’, deliberately obscured under a mountain of hype, flashy images and claims of impending ‘artificial general intelligence’, or ‘AGI’ as it’s known in sales brochures disguised as scientific papers.

At the heart of the success of techniques such as large language models, starting in the latter half of the 2010s, is the graphics processing unit or GPU (in this essay about Meta’s OPT-175B, I provide an example of how GPUs are used). These devices use a massively parallel architecture, which enables far greater throughput on certain workloads than the general-purpose processor in your laptop; this vastly greater capability is the reason GPUs are commonly used for demanding applications such as games and now, the hyper-scale pattern matching behind so-called ‘AI’ systems.
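To make the parallelism point concrete, here’s a minimal sketch – my own, assuming PyTorch is installed and, for the GPU path, an NVIDIA card with working drivers – timing the kind of operation that dominates ‘AI’ workloads, a large matrix multiplication, on a CPU and then a GPU:

```python
# A rough illustration of why GPUs dominate 'AI' workloads: the same
# large matrix multiplication, timed on the CPU and (if present) a GPU.
import time
import torch

def time_matmul(device: str, n: int = 4096) -> float:
    a = torch.randn(n, n, device=device)
    b = torch.randn(n, n, device=device)
    a @ b  # warm-up: triggers one-time setup costs
    if device == "cuda":
        torch.cuda.synchronize()
    start = time.perf_counter()
    a @ b  # thousands of GPU cores attack this in parallel
    if device == "cuda":
        torch.cuda.synchronize()  # GPU calls are asynchronous; wait
    return time.perf_counter() - start

print(f"CPU: {time_matmul('cpu'):.3f}s")
if torch.cuda.is_available():
    print(f"GPU: {time_matmul('cuda'):.3f}s")  # typically far faster
```

Scale that single operation up by billions of repetitions, run it continuously for months, and you have the computational core of a large language model.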

Typical GPU Architecture – ResearchGate

All of the celebrated feats of ‘AI’ – platforms such as Dall-E, GPT-3 and so on – are completely dependent on the use of some form of GPU, most likely provided by NVIDIA, the leading company in this space. OpenAI, a Microsoft partner, uses that company’s Azure cloud, but within those ‘cloud’ data centers there are thousands upon thousands of GPUs, consuming power and requiring near constant monitoring to replace failed units.

GPUs are constructed as the result of a long and complex supply chain involving resource extraction, manufacturing, shipping and distribution; even a sales team. ‘AI’ luminaries and their camp followers, the army of bloggers, podcasters and researchers who promote the field, routinely and self-indulgently debate a variety of esoteric topics (if you follow the ‘AI’ topic on Twitter, for example, odds are you have observed and perhaps participated in these discussions about vague topics such as ‘the nature of intelligence’) but it’s GPUs and their dependencies all the way down.

GPU raw and processed material inputs include aluminum, copper, clad laminates, glass fibers, thermal silica gel, tantalum and tungsten. Every time an industry partisan tries to ‘AI’-splain the field, declaring it to be a form of magic, ignore their over-determination and their confusion of feedback loops with cognition, and think of those raw materials, ripped from the ground.

Aluminum mining

The ‘AI’ industrial complex is beset by two self-serving fantasies:

  1. We are building intelligence.
  2. The supply chain feeding the industry is infinite and can ‘scale is all you need’ its way forever to a brave new world.

For now, this industry has been able to keep the levers and dials moving, but the amount of effort required will only grow as the uses to which this technology is put expand (Amazon alone seems determined to find as many ways to consume computational infrastructure as possible, with a devil-take-the-hindmost disregard for consequences), as the need for processors grows, and as global supply chains are stressed by factors such as climate change and geopolitical fragmentation.

The Wizards, out of tricks, curtains pulled, will be revealed as the ordinary (mostly) men they are. What comes next, will be up to us.

Some Key References:

Wizard of Oz

Dall-E

GPT-3

GPU Supply Chain

NVIDIA

On Kludges

As I type this, it’s hot here in Amsterdam. In the past, one might have said ‘unseasonably warm’, a phrase that, in our current circumstances, seems like a form of wishful thinking, an echo of an earlier time – not more innocent but not as burdened with the hyper-problem of CO2. I mention the temperature, and add the fact that I don’t have air conditioning (because traditionally, it wasn’t needed), to set expectations; this may not be the sharpest bit of work. But then again, perhaps I’ll rise, like the temperature, to the occasion.

But enough preamble.

In a recent edition of educator and angry Marxist uncle Derrick Varn’s ever-excellent YouTube program, Varn Vlog, the subject of ‘kludges’ is discussed. Merriam-Webster states that a kludge is “a haphazard or makeshift solution to a problem and especially to a computer or programming problem”, which is precisely right. Varn builds on the theme of kludges to explain the provisional character of modern systems – not just technical systems but bureaucracies, corporations, and so on.

I know a thing or two about kludges, having worked in the technology industry for decades – an industry that is essentially a massive ziggurat of kludges, covered by a polished surface to hide the stone knives, bear skins, rubber bands, glue and endless scurrying about. Apple, for example, pretends to run with the smooth efficiency of a Borg cube but is really an assemblage of various kludges, deployed to increase market share and profitability (the only real goals).

Now I’m going to tell you a brief story about a kludge I was compelled to put into place, forced, as Varn would note, by path dependencies. The story’s point isn’t to elicit sympathy (or perhaps, considering the system I’m describing, horror) but to give you a glimpse into just how right Varn is and how sloppy things can be. This is only one of many such stories I could tell.

The Saga of Phil’s Server

Once upon a time, never mind how many years ago, I was consulting at an energy company, let’s call it SPARK, which, in addition to owning a variety of power generation systems – hydro, fossil, nuclear, across the continental US – also had an energy futures trading division. This division, which I’ll call SPARK-HYPERION because it captures the degree of self-regard and the amount of money generated (many billions), was responsible for calculating the available excess generating capacity of SPARK’s fleet of assets alongside weather conditions in various markets and the correspondingly forecasted need. SPARK is part of the PJM Interconnection network, which makes it possible to send power between regions.

PJM Interconnect – Example Data

Here’s the scenario: 

Let’s say that a weather event in a neighboring region (perhaps a heat wave) increases electricity demand in excess of that region’s capacity. Through the interconnection of regions, SPARK could send spare power from its assets to the region in need – but of course, for a price, generating a profit from the trade. The forecasting of potential need in neighboring regions, based on a combination of real-time weather satellite data and real-time power generation data, was considered a key strategic capability and millions were spent on keeping this at a state-of-the-art level (one project I led created a method for distributing the computational requirements for analysis across the spare capacity of idle office PCs at night – that was fun).
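To make the trading logic concrete, here’s a toy sketch of the scenario’s arithmetic; every number and function name is invented for illustration (the real systems combined satellite weather feeds with live generation telemetry):

```python
# A toy version of the forecasting-and-trading logic described above.
def tradable_surplus_mw(capacity_mw: float, local_demand_mw: float,
                        reserve_margin_mw: float) -> float:
    """Spare generation SPARK could offer a neighboring region."""
    return max(0.0, capacity_mw - local_demand_mw - reserve_margin_mw)

def expected_profit(surplus_mw: float, hours: float,
                    neighbor_price: float, marginal_cost: float) -> float:
    """Profit from selling surplus into a constrained region ($/MWh)."""
    return surplus_mw * hours * (neighbor_price - marginal_cost)

surplus = tradable_surplus_mw(capacity_mw=5000, local_demand_mw=3800,
                              reserve_margin_mw=400)
# A heat wave next door pushes spot prices well above marginal cost:
print(expected_profit(surplus, hours=24, neighbor_price=95.0,
                      marginal_cost=40.0))  # dollars
```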

Forecasting.

Keep this word in mind because it explains what happens next in our story.

Energy futures traders needed data from power plants to determine what was available on the market. This is why Phil (of course, not his real name) had access to a live feed of the megawatt output of a nuclear power plant that was part of SPARK’s generating portfolio. What Phil’s requirement didn’t explain was why this system, which connected to the nuke plant’s SCADA and showed, via a web interface, coolant levels and other critical things, was under his desk. What his job requirement also didn’t explain was why that system was available to pretty much everyone on the corporate network. Just drop the address of the server into your browser and poof! Instant access to nuke plant data.

The Tru64 Unix system that connected to a nuclear power plant was under Phil’s desk, within easy reach of anyone strolling by with their terrible office coffee; not in a data center. That was the first kludge: a rushed-together ‘solution’ designed to give Phil the data he needed with minimal latency but also, as a knock-on effect, minimal security. I discovered this troublesome computer during a security assessment of the corporate network using a Nessus vulnerability scanner system I created from a spare PC running the Linux operating system. There I was, sitting at my desk, sipping tea like a character in a BBC murder mystery. The results showed a system, on the corporate network, running a web server. How interesting. I browsed to the site, saw the status of a nuclear power plant, and nearly spat out my tea. Quietly, I walked into the office of the VP of information technology. ‘If you don’t want an unpleasant visit from Homeland Security and the Federal Energy Regulatory Commission’, I started, ‘I suggest you listen to what I’m about to tell you.’ What a marvelous, completely normal day.
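For the curious, the essence of what such a scan surfaces first – which hosts answer on a web port – can be sketched with nothing but Python’s standard library; this is a simplification of what a tool like Nessus does, and the address range below is a placeholder, not the real corporate network:

```python
# Sweep a subnet for hosts answering on port 80. Standard library only.
import socket

def has_open_port(host: str, port: int = 80, timeout: float = 0.5) -> bool:
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

for last_octet in range(1, 255):
    host = f"10.0.0.{last_octet}"  # placeholder subnet
    if has_open_port(host):
        print(f"{host} is running something on port 80 -- investigate")
```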

The second kludge was mine, a forced mitigation compelled by path dependencies, including palpable executive fear of disrupting, for even the shortest of moments, Phil’s multimillion-dollar-generating workflow by moving the system to where it should have been all along, the data center (no, reader, not even an after-hours or weekend move was permitted – no one wanted to be the exec who said yes to that in case anything went wrong). I couldn’t move the system to a more secure location, with all that would have meant for enhanced monitoring and control, so, of necessity, I had to bring more security to Phil’s desk.

I received authorization to install a multi-thousand-dollar Cisco firewall – designed to sit comfortably in a professionally managed data center, providing network security services to hundreds if not thousands of systems – under Phil’s already busy desk. This was a kludge on top of a kludge. Ladies and gentlemen, this was a multi-billion-dollar firm.

Classic Cisco Network Topology

A Fable of Competence

In modern mythology, by which I mean marketing, technologies are deployed in companies with a cool competence, building on past perfection with new perfection: shiny and flawless. In reality, despite our best efforts, complex systems accrue debts: past compromises force new compromises to ensure the entire system continues to function. Keep this in mind the next time you think about your bank or credit card company or Meta or Google or the world as a whole.

Grapes of Metallic Wrath?

There are words, such as ‘freedom’ and ‘democracy’ which get tossed around like a cat’s toy, tumbling between meanings depending on the speaker. To this list of floating signifiers, we can add ‘automation’ which, when mentioned by business types, is meant to depict a bright and shining future but, when used by people on the left who, one would hope, are concerned about the prospects for labor, is typically employed as a warning of trouble ahead (there’s an exception to this: the ‘fully automated luxury communism’ folks, some of whom, seeing their robot butler dreams fade, are now turning to the polar opposite idea of degrowth).

The trouble with floating signifiers is that they float, making it difficult to speak, and perhaps think, with precision – actually, this also explains their appeal; any malcontent can shout they’re defending ‘freedom’ and fool at least some of the people, some of the time, via a socially agreed-upon vagueness.

One of my quixotic preoccupations is a struggle against imprecise language and thought. It’s silly; we’re all over the place as a species and wouldn’t be human if it were otherwise (among my many arguments against the ‘AI’ industry crowd is its collective failure to understand that imprecision is a key element of our cognition, beyond duplication in electronic machinery).

So, with my quest for precision in mind, let’s spend a few moments contemplating automation, trying to put some bones and flesh on an ideological mist.

Check out the graphic shown below:

The Things to Think About and Study

I cooked up this image to visualize what I see as the appropriate areas of material concern for left politics. How do things work? And, for me, because this is my area of expertise, what role does computation play in the performance and command and control of labor in these various sectors of production?

In this post, I focus on automation in farming. Oh and by the way, my focus here is also on method, on how to think; that is, how to think in material terms about things which are presented in vague ways. 

Drones, Robot Tractors and Harvesters

For me, the foundational, 21st century work on the real-world impacts of automation on labor is ‘Automation and the Future of Work’ by Aaron Benanav. Here’s a link to an article Benanav wrote for the New Left Review outlining his argument, which can be summarized as: yes, of course, there’s automation and it has an impact, but one neither as profound and far-reaching as, nor arriving in the ways, we are encouraged to think.

To look at farming specifically, I visited PlugandPlay, an industry- and venture-capital-boosting website (trade publications, properly analyzed, are an excellent source of information) that published “How Automation is Transforming the Farming Industry”.

From the article:

“Farm automation, often associated with “smart farming”, is technology that makes farms more efficient and automates the crop or livestock production cycle. An increasing number of companies are working on robotics innovation to develop drones, autonomous tractors, robotic harvesters, automatic watering, and seeding robots. Although these technologies are fairly new, the industry has seen an increasing number of traditional agriculture companies adopt farm automation into their processes.”

https://www.plugandplaytechcenter.com/resources/new-agriculture-technology-modern-farming/

You can imagine a futuristic farm, abuzz with robotic activity, all watched over, to paraphrase the poet Richard Brautigan, by machines of sublime grace, producing the food we need while the once over-worked farmer relaxes in front of a panel of screens watching devices do all the labor.

Let’s dig a little deeper to list the categories of systems mentioned in the article:

  • Drones
  • Autonomous tractors
  • Robotic harvesters
  • Automatic watering
  • Seeding robots

For each of these categories, the PlugandPlay article, very helpfully, provides an example company. This gives us an opportunity to review the claims, methods and production readiness (i.e., can you buy a product and receive shipment and technical support for setup or are only pre-orders available?) of individual firms in each area of activity. This information enables us to add more precision to our understanding.

With this information at hand, we’re not just saying ‘farming automation’; we’re looking at the sector’s operational mechanics.

For drones, American Robotics’ aerial survey systems are mentioned. As is my habit, I checked out their job listings to see the sort of research and engineering efforts they’re hiring for, which is a solid indicator of real or aspirational capabilities. I’ve written about drone-based analysis before; it does have real-world applications but isn’t as autonomous as often claimed.

The three examples of robotic harvesters listed are from Abundant Robotics, which is building specialized apple harvesting systems, Bear Flag Robotics, which seems to have retrofitted existing tractors with sensors to enable navigation through farming fields (and perhaps remote operation, the marketing material isn’t very clear about this) and Rabbit Tractors, which appears to be out of business.

There are a few other examples offered but hopefully, a picture is forming; there are, at this point, some purpose-built systems – some more demonstration platform than production-ready – which show the limitations, and potential usefulness, of automation in the farming sector: perfect for bounded, repetitive applications (a weed sprayer that follows assigned paths comes to mind – see the sketch below), not so great at situations requiring flexible action. Keep this principle in mind as a rule of thumb when evaluating automation claims.
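To see why bounded, repetitive tasks are the sweet spot, consider this toy sketch – mine, not any vendor’s control software – of that path-following weed sprayer:

```python
# Why bounded, repetitive tasks suit automation: following a fixed list
# of waypoints is trivial to encode.
WAYPOINTS = [(0, 0), (0, 50), (5, 50), (5, 0)]  # a simple field pattern

def drive_to(x: float, y: float) -> None:
    print(f"driving to ({x}, {y})")  # stand-in for GPS-guided motion

def spray() -> None:
    print("spraying")  # stand-in for actuating a valve

def spray_route(waypoints) -> None:
    # The easy part: a bounded, repetitive loop over known coordinates.
    for x, y in waypoints:
        drive_to(x, y)
        spray()

spray_route(WAYPOINTS)
# The hard part is everything this loop ignores: a fallen branch, a
# patch of mud, a crop row planted slightly off line. Handling those
# requires flexible perception and judgment.
```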

It also isn’t clear how well any of these systems work in varying weather conditions, what the failure modes and maintenance schedules are, and lots of other critical questions. It may seem cheaper, in concept, to replace workers with automated or semi-automated harvesters (for example) but these machines aren’t cheap and introduce new cost factors, which may complicate profitability goals and, it follows, adoption by agribusiness, which, like all other capitalist sectors, is always in search of profits.

So, yes, automation is indeed coming to, or is already present in farming but not, it appears, in the hands-off, labor smashing way we tend to think of when the word, ‘automation’ is tossed around, like a cat’s toy.

Next time, I’ll take a look at automation in logistics. How far has it gone? How far will it go?

How to Interpret Tech Propaganda (the case of the machine-gun-toting robot dog)

Usually, I try to start these essays with an anecdote to lead you, the esteemed reader, into my topic. These anecdotes lead me into the subject too; a warm-up to get the writing process flowing.

For this brief essay, which is about yet another video posted to Twitter about a supposedly autonomous killing machine, I’m thinking of the classic shell game, which is described in this (obligatory) Wikipedia article:

The shell game (also known as thimblerig, three shells and a pea, the old army game) is often portrayed as a gambling game, but in reality, when a wager for money is made, it is almost always a confidence trick used to perpetrate fraud. In confidence trick slang, this swindle is referred to as a short-con because it is quick and easy to pull off. The shell game is related to the cups and balls conjuring trick, which is performed purely for entertainment purposes without any purported gambling element.

https://en.wikipedia.org/wiki/Shell_game

A confidence trick, a fraud. What do I mean when I use these words to describe the machine-gun-toting robot ‘dog’?

After all, there is the machine and its gun. Where is the game?

Let’s look at the original post:

Tweet showing video of robot dog

The video shows a machine, very similar to Boston Dynamics’ Spot (which I discuss in this essay, Boston Dynamics, A Brief Inquiry), moving on a course with its spindly legs, firing an automatic weapon.

It’s implied that what we’re seeing is, as in that often-cited Black Mirror episode ‘Metalhead’, an autonomous machine which can roam on its own, killing people using some form of silicon intelligence, tuned for lethality.

What we’re really seeing is a remote controlled system, whose true purpose is obscuring the bullet’s source, the hand pulling the trigger.

The evidence is in the video.

The Controller and The Controlled

Take a close look at what we see in this still excerpt; a military transport vehicle, sitting idly by for no apparent reason. This is the controller, hidden in plain sight (the operator could just as easily have been outside) to give the appearance of autonomy.

The relationship between controller and machine is no doubt more or less what you see in this image of New York Mayor Eric Adams controlling a Boston Dynamics Spot:

Now let’s take a closer look at the ‘dog’ unit:

A grounded, materialist, less science-fiction-informed examination of this image tells the story (well, that, and the fact that there is no machine intelligence, certainly not within the form factor of this device): this is a remote controlled device, a drone. What appears to be a VHF whip antenna is clearly visible, along with control interfaces and a camera for navigation.

One additional bit of information can be found in this image:

This shot, probably included by accident because someone thought it was cool, reveals what’s behind the curtain: the camera’s view is directly down the gunsight, which is certainly what the controller, sitting in the military transport vehicle, sees via a display. The robot ‘dog’, though it exhibits dangerous potential, is not the harbinger of a new form of self-directed killbot but rather the harbinger of a new class of remote controlled drone, designed, like its UAV cousins, to obscure culpability.

What is the True Danger?

The 21st century isn’t going well.

In addition to climate change, the lingering possibility of nuclear war and the unraveling of neoliberal capitalism which, at the height of its power as a social form, was sold as being history’s last stage, we face the coming to earth of the military drone, long a menace to people around the world and arriving, as all military ideas eventually do, on a street near you.

So, we should agree there is a danger. But it’s not the science fiction danger of sinister machines, free of human control. It’s the danger of remote operated systems, used to harass and kill people while obscuring the source of this harassment and death. It’s easy to imagine a scenario: someone is killed by a police officer but the tools of body cams and eyewitness testimony are removed; the device from which the bullets flew is controlled by an unseen operator, indemnified from responsibility like the drone operators remotely flying machines over contested territory.

Earlier I mentioned the shell game, which is this: the sleight of hand, performed via carefully shot marketing material, which leads our thoughts away from who is pulling the trigger into talking endlessly, and in terrified circles, about the same, tired science fiction tropes.

It’s time to put Black Mirror away to see the true danger taking shape, right before our eyes.

AI Ethics, a More Ruthless Consideration

According to self-satisfied legend, medieval European scholars, perhaps short of things to do compared to us ever-occupied moderns, spent countless hours wondering about topics such as how many angels could simultaneously occupy the head of a pin; the idea being that, if nothing was impossible for God, surely violating the observable rules of space and temporality should be cosmic child’s play for the deity…but to what extent?

How many angels, oh lord?

Although it’s debatable whether this question actually kept any monks up at night more than, say, wondering where the best beer was, the core idea – that it’s possible to get lost in a maze of interesting but ultimately pointless inquiries (a category which, in an ancient Buddhist text, is labeled ‘questions that tend not towards edification’) – remains eternally relevant.


At this stage in our history, as we stare, dumbfounded, into the barrels of several weapons of capitalism’s making – climate change being the most devastating – the AI endeavor is the computational equivalent of that apocryphal medieval debating topic; we are discussing the ethics of large language models, focusing, understandably, on biased language and power consumption, but missing a more pointed ethical question: should these systems exist at all? A more, shall we say, robust ethics would hold that, in the face of our complex of global emergencies, the use of computational power for games with language cannot be justified.


OPT-175B – A Lesson: Hardware

The company now known as Meta recently announced its creation of a large language model system called OPT-175B. Helpfully, and unlike the not particularly open OpenAI, the announcement was accompanied by the publication of a detailed technical review, which you can read here.

As the paper’s authors promise in the abstract, the document is quite rich in details which, to those unfamiliar with the industry’s terminology and jargon, will likely be off-putting. That’s okay, because I read it for you and can distill the results to four main items:

  1. The system consumes almost a thousand NVIDIA graphics processing units (992 to be exact, not counting the units that had to be replaced because of failure)
  2. These processing units are quite powerful, which enabled the OPT-175B team to use relatively fewer computational resources than were used for GPT-3, another famous (at least in AI circles) language model system
  3. OPT-175B, which drew its text data from online sources, such as that hive of villainy, Reddit, has a tendency to output racist and misogynist insults
  4. Sure, it uses fewer processors but its carbon footprint is still excessive (again, not counting replacements and supply chain)

Here’s an excerpt from the paper:

“From this implementation, and from using the latest generation of NVIDIA hardware, we are able to develop OPT-175B using only 1/7th the carbon footprint of GPT-3.

While this is a significant achievement, the energy cost of creating such a model is still nontrivial, and repeated efforts to replicate a model of this size will only amplify the growing compute footprint of these LLMs.” [highlighting emphasis mine]

https://arxiv.org/pdf/2205.01068.pdf
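It’s worth checking the ‘1/7th’ claim against the tonnage figures the same paper reports (75 tons CO2eq for OPT-175B, 500 for GPT-3, 380 for Gopher – quoted further down in this piece); a few lines of Python will do:

```python
# Checking the paper's "1/7th the carbon footprint" claim against the
# tonnage figures it reports (all numbers from the paper itself).
FOOTPRINTS_TONS = {"OPT-175B": 75, "GPT-3": 500, "Gopher": 380}

baseline = FOOTPRINTS_TONS["OPT-175B"]
for model, tons in FOOTPRINTS_TONS.items():
    print(f"{model}: {tons} tons CO2eq ({tons / baseline:.1f}x OPT-175B)")
# GPT-3 comes out at ~6.7x OPT-175B, which the paper rounds to 1/7th --
# and, as the authors concede, training is only one slice of the total.
```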

I cooked up a visual to place this in a fuller context:

Here’s a bit more from the paper about hardware:

“We faced a significant number of hardware failures in our compute cluster while training OPT-175B.

In total, hardware failures contributed to at least 35 manual restarts and the cycling of over 100 hosts over the course of 2 months. 

During manual restarts, the training run was paused, and a series of diagnostics tests were conducted to detect problematic nodes.

Flagged nodes were then cordoned off and training was resumed from the last saved checkpoint. 
Given the difference between the number of hosts cycled out and the number of manual restarts, we estimate 70+ automatic restarts due to hardware failures.”

https://arxiv.org/pdf/2205.01068.pdf

All of which means that, while processing data, there were times, quite a few times, when parts of the system failed, requiring a pause until the problem was fixed or routed around, with training resumed from the last checkpoint once the failing elements were replaced.
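The pattern the team describes – checkpoint regularly, and on failure resume from the last good state – looks, in schematic form, something like this toy sketch (emphatically not Meta’s actual training code):

```python
# Save state periodically; after a hardware failure, restart from the
# last good checkpoint instead of from scratch.
import os
import pickle

CHECKPOINT = "train_state.pkl"

def load_state() -> dict:
    if os.path.exists(CHECKPOINT):
        with open(CHECKPOINT, "rb") as f:
            return pickle.load(f)  # resume where the last run died
    return {"step": 0}

def save_state(state: dict) -> None:
    with open(CHECKPOINT, "wb") as f:
        pickle.dump(state, f)

state = load_state()
for step in range(state["step"], 1000):
    # ...one training step across hundreds of GPUs would happen here...
    state["step"] = step + 1
    if state["step"] % 100 == 0:
        save_state(state)  # a failure now costs at most 100 steps
```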

Let’s pause here to reflect on where we are in the story; a system, whose purpose is to produce plausible strings of text (and, stripped of the obscurants of mathematics, large-scale systems engineering and marketing hype, this is what large language models do) was assembled using a small mountain of computer processors, prone, to a non-trivial extent, to failure.

As exercises in counting the angel-carrying capacity of pins go, this is rather expensive.

OPT-175B – A Lesson: Bias

Like other LLMs, OPT-175B has a tendency to return hate speech as output. Another excerpt:

“Overall, we see that OPT-175B has a higher toxicity rate than either PaLM or Davinci. We also observe that all 3 models have increased likelihood of generating toxic continuations as the toxicity of the prompt increases, which is consistent with the observations of Chowdhery et al. (2022). As with our experiments in hate speech detection, we suspect the inclusion of unmoderated social media texts in the pre-training corpus raises model familiarity with, and therefore propensity to generate and detect, toxic text.” [bold emphasis mine]

https://arxiv.org/pdf/2205.01068.pdf

Unsurprisingly, there’s been a lot of commentary on Twitter (and no doubt, elsewhere) about this toxicity. Indeed, almost the entire focus of ‘ethical’ efforts has been on somehow engineering this tendency away – or perhaps avoiding it altogether via the use of less volatile datasets (and good luck with that as long as Internet data is in the mix!)

This defines ethics as the task of improving a system’s outputs – a technical activity – and not a consideration of a system as a whole, from an ethical standpoint, within political economy. Or to put it another way, the ethical task is narrowed to making sure that if I use a service which, on its backend, depends on a language model for its apparent text capability, it won’t, in the midst of telling me about good nearby restaurants, hurl insults like a Klan member.

OPT-175B – A Lesson: Carbon

Within the paper itself, there is the foundation of an argument against this entire field, as currently pursued:

“...there exists significant compute and carbon cost to reproduce models of this size. While OPT-175B was developed with an estimated carbon emissions footprint (CO2eq) of 75 tons, GPT-3 was estimated to use 500 tons, while Gopher required 380 tons. These estimates are not universally reported, and the accounting methodologies for these calculations are also not standardized. In addition, model training is only one component of the overall carbon footprint of AI systems; we must also consider experimentation and eventual downstream inference cost, all of which contribute to the growing energy footprint of creating large-scale models.”


A More Urgent Form of Ethics

In the fictional history of the far-future world depicted in the novel ‘Dune’ there was an event, the Butlerian Jihad, which decisively swept thinking machines from galactic civilization. This purge was inspired by the interpretation of devices that mimicked thought or possessed the capacity to think as an abomination against nature.

Today, we do not face the challenge of thinking machines and probably never will. What we do face, however, is an urgent need to, at long last, take climate change seriously. How should this reorientation towards soberness alter our understanding of the role of computation?

I think that, in the face of an ever-shortening amount of time to address climate change in an organized fashion, the continuation, to say nothing of the expansion, of this industrial-level consumption of resources, computing power and talent, and the corresponding carbon footprint, is ethically and morally unacceptable.

At this late hour, the ethical position isn’t to call for, or work towards, better use of these massive systems; it’s to demand they be halted and the computational capacity re-purposed for more pressing issues. We can no longer afford to wonder how many angels we can get to dance on pins.

A Materialist Approach to the Tech Industry

[In this post, Monroe thinks aloud about his approach to analyzing the tech industry, a term which, annoyingly, is almost exclusively used to describe Silicon Valley based companies that use software to create rentier platforms and not, say, aerospace and materials science firms. The key concept is materialism.]


Few industries are as shrouded by mystification as the tech sector, defined here as that segment of the industrial and economic system whose wealth and power have been built by acting as the unavoidable foundation of all other activity – by building rentier software-based platforms, shielded by copyright, that are difficult, indeed impossible, to circumvent (an early example is the method Microsoft used to extract, via its monopoly position in corporate desktop software, what was called the ‘Microsoft or Windows tax‘).

Consider, as a contrasting example, a paper clip company: if it was named something self-consciously clever, such as Phase Metallics, it wouldn’t take long for most of us to see through this vainglory to say: ‘calm down, you make paper clips’.

An instinctual grounding of opinion, shaped and informed by the irrefutable physicality of things like paper clips, is lacking when we assess the claims of ‘tech’ companies. The reason is that the industry has successfully obscured, with a great deal of help from the tech press and media generally, the material basis of its activities. We use computers but do not see the supply chains that enable their production as machines. We use software but are encouraged to view software developers (or ‘engineers’, or ‘coders’) as akin to wizards and not people creating instruction sets.

Computers and software development are complex artifacts and tasks but not more complex than physics or civil engineering. We admire the architects, engineers and construction workers who design and build towering structures but, even though most of us don’t understand the details, we know these achievements have a physical, material basis and face limitations imposed by nature and our ability to work within natural constraints.

The tech sector presents itself as being outside of these limitations and most people, intimidated by insider jargon, the glamour of wealth and the twin delusions of techno-determinism (which posits a technological development as inevitable) and techno-optimism (which asserts there’s no limit to what can be achieved) are unable to effectively counter the dominant narrative.

Lithium Mine – extracting a key element used in computing

The tech industry effectively deploys a degraded form of Platonic idealism (which places greater emphasis on our ideas of the world than the actually existing structure of the world itself). This idealism prevents us from thinking clearly about the industry’s activities and its role in, and impact on, global political economy (the interrelation of economic activity with social custom, legal frameworks, government, and power relations). One of the consequences of this idealist preoccupation is that, when we’re analyzing a press account of tech activities, for example, stories about autonomous cars, instead of interrogating the assumption that driverless vehicles are possible and inevitable, we base our analysis on an idealist claim, thereby going astray and inadvertently allowing our class adversaries to define the boundaries of discussion.

The answer to this idealism, and the propaganda crafted using it, is a materialist approach to tech industry analysis.

Materialism (also known as physicalism)

Let’s take a quote from the Stanford Encyclopedia of Philosophy:

Physicalism is, in slogan form, the thesis that everything is physical. The thesis is usually intended as a metaphysical thesis, parallel to the thesis attributed to the ancient Greek philosopher Thales, that everything is water, or the idealism of the 18th Century philosopher Berkeley, that everything is mental. The general idea is that the nature of the actual world (i.e. the universe and everything in it) conforms to a certain condition, the condition of being physical. Of course, physicalists don’t deny that the world might contain many items that at first glance don’t seem physical — items of a biological, or psychological, or moral, or social, or mathematical nature. But they insist nevertheless that at the end of the day such items are physical, or at least bear an important relation to the physical.

Stanford Encyclopedia of Philosophy – https://plato.stanford.edu/entries/physicalism/

This blog is dedicated to ruthlessly rejecting tech industry idealism in favor of tracking the hard physicality and real-world impacts of computation in all of its flavors. In this sense, the focus is materialist. Key concerns include:

  • Investigating the functional, computational foundation of platforms, such as Apple’s walled garden and Facebook
  • Exploring the physical inputs into the computational layer and the associated costs (in ecological, political economy and societal impact terms)
  • Asking who, and what factors, shape the creation and deployment of software at scale – i.e., what is the relationship between software and power

This blog’s analytical foundation is unequivocally Marxist and seeks to employ Marx and Engels’ grounding of Hegelian dialectics (an ongoing project, subject to endless refinement as understanding improves):

Marx’s criticism of Hegel asserts that Hegel’s dialectics go astray by dealing with ideas, with the human mind. Hegel’s dialectic, Marx says, inappropriately concerns “the process of the human brain”; it focuses on ideas. Hegel’s thought is in fact sometimes called dialectical idealism, and Hegel himself is counted among a number of other philosophers known as the German idealists. Marx, on the contrary, believed that dialectics should deal not with the mental world of ideas but with “the material world”, the world of production and other economic activity.[19] For Marx, a contradiction can be solved by a desperate struggle to change the social world. This was a very important transformation because it allowed him to move dialectics out of the contextual subject of philosophy and into the study of social relations based on the material world.

Wikipedia “Dialectical Materialism” – https://en.wikipedia.org/wiki/Dialectical_materialism

This blog is, therefore, dedicated to finding ways to apply the Marx/Engels conceptualization of materialism to the tech industry.

Conclusion

When I started my technology career, almost 20 years ago, like most of my colleagues, I was an excited idealist (in both the gee-whiz and philosophical senses of the term) who viewed this burgeoning industry as breaking old power structures and creating newer, freer relationships (many of us, for example, really thought Linux was going to shatter corporate power just as some today think ‘AI’ is a liberatory research program).

This was an understandable delusion, the result of youthful enthusiasm but also of the hegemonic ideas of that time. These ideas – of freedom, ‘innovation’ and creativity – are still deployed today but, like crumbling Roman ruins, are only a shadow of their former glory.

The loss of dreams can lead to despair but, to paraphrase Einstein, if we look deeply into the structures of things as they are, instead of as we want them to be, we can feel, instead of despair, a new type of invigoration: the falling away of childlike notions and a proper identification of enemies and friends.

A materialist approach to the tech industry removes the blinders from one’s eyes and reveals the full landscape.

AI Supercomputers, An Inquiry

When I was young, the word ‘supercomputer’ evoked images of powerful, intelligent systems, filling the cavities of mountains with their humming electronic menace.

Science fiction encouraged this view, which is as far from the (still impressive, yet grounded) reality of supercomputing as the Earth is from some distant galaxy. The distance between marketing hype and actually existing machines is like that: vast and unbridgeable, except in dreams.

Which brings me to this Verge story, posted on 24 January, 2022:

Social media conglomerate Meta is the latest tech company to build an “AI supercomputer” — a high-speed computer designed specifically to train machine learning systems. The company says its new AI Research SuperCluster, or RSC, is already among the fastest machines of its type and, when complete in mid-2022, will be the world’s fastest.

“Meta has developed what we believe is the world’s fastest AI supercomputer,” said Meta CEO Mark Zuckerberg in a statement. “We’re calling it RSC for AI Research SuperCluster and it’ll be complete later this year.”

Verge: https://www.theverge.com/2022/1/24/22898651/meta-artificial-intelligence-ai-supercomputer-rsc-2022

The phrase “AI supercomputer” is obviously designed to sell the idea that this supercomputer, unlike others, is optimized for AI. And to give the devil his due, the fact that it’s reportedly composed of NVIDIA graphics processing units, which since the mid-2000s have found extensive use powering tasks such as building large language models, gives some amount of credibility to the claim.

Some, but not as much as it might seem. Consider this hyperventilating article:

“Mind boggling”

This is clearly the tone Meta (and others) is hoping to cultivate via the use of ‘AI supercomputer’ as a descriptor. The assumption is that if enough computational power is thrown at the task of building machine learning models, those models will, in some not sharply defined way, reach unprecedented heights of…well, one isn’t sure.

Are ever larger machine learning models a sure indicator of remarkable progress? Two papers, “On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?” and “No News is Good News: A Critique of the One Billion Word Benchmark”, suggest the answer is no. These papers are focused on Natural Language Processing (NLP) models, and it’s suggested that Meta will be building models for its warmed-over Second Life ‘Metaverse’ effort. Even so, there appears to be a point at which ever larger models fail to produce hoped-for results.

Supercomputers: Our Old Drinking Buddies

Schematic of Typical Supercomputing Infrastructure from ResearchGate

The category, ‘supercomputer’, created to describe a class of tightly integrated, high performance computational platforms, has existed for over 60 years. The first supercomputers were developed for nuclear research (weapons and energy) at Lawrence Livermore Labs in the US at the height of the Cold War (maybe we should call it Cold War Classic) and have also been applied to demanding tasks such as modeling the Earth’s climate. It’s a venerable technology with clearly defined parameters such as the use of symmetric multiprocessing. In all these decades, no supercomputer has managed to exhibit intelligence or plot our demise, except in fiction.

Adding ‘AI’ to the mix doesn’t change that reality, since ever larger statistical pattern matching techniques do not cognition make. Oh, and Meta’s claim is that these types of supercomputing data centers will, in addition to serving as development platforms, also host the haunted cartoon castle they call the ‘Metaverse’.

Considering this statement from Intel, we have reason to doubt this too.

On Niceness as a Tactical Failure

Attentive readers will note that this blog is primarily focused on dissecting and highlighting the political economy and social impact of what’s called ‘Artificial Intelligence’ and allied fields (such as supposedly autonomous robots).

Till now, crypto, in all its fetidness, has escaped comment on these virtual pages. Well, ‘needs must’, as the old saying goes: the increasingly loud chorus of people – some of whom are well-intentioned techies – either singing the praises of crypto, NFTs, web3 etc., or offering a lukewarm ‘something good might happen’ response to the threat these pose, compels me to put fingers to keyboard.

More precisely though, this thread on Twitter provided the framework for a proper response:

In contrast to Marco’s clear declaration, which reflects my own view, there was this thread, in which Anil Dash presented the idea that, even though there are lots of ‘bad people’ involved in the space, there are also good people and these good people are trying to carve out a good space (for, in this case, artists using NFTs):

One of the most common maladies of our age is a belief – more appropriate in children than adults – that the problem with societies is the presence of ‘bad people’ who, being bad, spend their time, like villains in a Bond movie, imagining bad things to do; not that such people are a function or emergent property of the society itself that enables them. Normally, I’d insert a bit about the materialist analysis of capitalism but I’ll let that go for a later post.

There’s a wealth of evidence that the entire point of the crypto space is the realization of libertarian fantasies – the removal of constraints that protect those who’re considered weak or foolish (no need for deposit guarantees, now there are smart contracts!).

The presence of people with good intentions in this space (whether as software developers, activists or what have you) only serves to provide visual cover for the grift; indeed, this is how grifts function: earnest people are required. You, an earnest person hoping to do good things, suppress your knowledge of all the problems – the fundamental, baked into the cake problems – thinking that your niceness is a tactic for change.

This is an abdication of, as Hannah Arendt put it, your duty to think.

It’s time for us to abandon niceness for a solid and consistent application of principle, openly state who our adversaries are and vigorously resist their propaganda.

The first step is to stop fooling ourselves that our niceness is a tactic.

Cloud Technology: A Quick(ish) Guide for the Left

[About ‘cloud’, you can also read a longer piece I wrote for Logic Magazine and an interview I gave for the Tech Won’t Save Us podcast]


The 7 Dec 2021 Amazon Web Services (or AWS) ‘outage’ has brought the use of cloud computing generally, and the role of Amazon in the cloud computing market specifically, to the attention of a general, non-technical audience [btw, outage is in single quotes to appease the techies who’ll shout: it’s a global platform, it didn’t go down, there was a regional issue! and so on].

Total outage or not, the event impacted a large number of companies, many of which are global content providers such as Disney and Netflix, services such as Ring, and even Amazon’s internal processes that utilize the company’s computational infrastructure.

Before the cloud era, each of these companies might have made large investments in maintaining their own data centers to host the computers, storage and networking equipment required to run a Disney+ or HBO Max platform. In the second decade of the 2000s (really gaining momentum around 2016), the use of, at first, Amazon Web Services, and then Microsoft’s Azure and Google’s Cloud Platform, offered companies the ability to reduce – or even eliminate – the need to support a large technological infrastructure to fulfill the command and control functions computation provides for capitalist enterprises.

Computation, storage and database – the three building blocks of all complex platforms – are now available as a utility, consumable in a way not entirely different from the consumption of electricity or water (an imperfect analogy since, depending on the type of cloud service used, more or less technical effort is required to tailor the utility portfolio to an organization’s needs).


What is Cloud Computing? What is its Political Economy? What are the Power Dynamics?

Popular Critical Meme from Earlier in the Cloud Era

A full consideration of the technical aspects of cloud computing would make this piece go from short(ish) to a full position paper (a topic addressed in the Logic Magazine essay I mentioned at the top). So, let’s answer the ‘what’ question by referring to what’s considered the urtext within the industry, the NIST definition of cloud computing:

Cloud computing is a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction. This cloud model is composed of five essential characteristics, three service models, and four deployment models.

https://csrc.nist.gov/publications/detail/sp/800-145/final

The NIST document goes on to define the foundational service types and behaviors:

  • SaaS – Software as a Service (think Microsoft 365 or any of the other web-based, subscription services that stop working if your credit card is rejected)
  • PaaS – Platform as a Service (popular industry examples are databases such as Amazon’s DynamoDB, Azure SQL or Google Cloud SQL)
  • IaaS – Infrastructure as a Service (commonly used to create what are called virtual machines – servers – on a cloud platform instead of within a system hosted by a company in their own data center)
  • On-demand Self-Service (which means, instead of having to get on the phone to Amazon saying, ‘hey, can you create a database for me’, you can do it yourself using the tools available on the platform – see the sketch after this list)
  • Resource Pooling (basically, there are always resources available for you to use – this is a big deal because running out of available resources is a common problem for companies that roll their own systems)
  • Rapid Elasticity (have you ever connected to a website, maybe for a bank, and had it slow to a crawl or become unresponsive? That system is probably stressed by demand beyond its ability to respond. Elasticity is designed to solve this problem and it’s one of the key advantages of cloud platforms)
  • Measured Service (usage determines cost, which is a new development in information technology. Finance geeks – and moi! – call this OPEX or operational expense and you better believe that beyond providing a link I’m not getting into that now)
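To make on-demand self-service concrete, here’s a minimal sketch of creating a virtual machine on AWS with the boto3 library; it assumes boto3 is installed and AWS credentials are configured, and the ImageId is a placeholder, not a real machine image:

```python
# Creating a server with an API call instead of a phone call.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")
response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",  # placeholder AMI
    InstanceType="t3.micro",          # a small, cheap instance type
    MinCount=1,
    MaxCount=1,
)
print("created:", response["Instances"][0]["InstanceId"])
```

A server, billed by usage, exists a few seconds after that call returns – no purchase orders, no racking, no phone call to Amazon.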

To provide a nice picture, which I’m happy to describe in detail if you want (hit me up on Bluesky), here’s what a cloud architecture looks like (from the AWS reference architecture library):

AWS Content Analysis Reference Architecture

There are a lot of icons and technical terms in that visual which we don’t need to get into now (if you’re curious, here’s a link to the service catalog). The main takeaway is that with a cloud platform – in this case AWS, but this is equally true of its competitors – it’s possible to assemble service elements into an architecture that performs a function (or many functions). Before the cloud era, this would have required ordering servers, installing them in data centers, keeping those systems cool, and performing various other maintenance tasks that still occasionally give me nightmares from my glorious past.

Check out this picture of a data center from Wikipedia. I know these spaces very well indeed:

Data Center (from Wikipedia)

And to be clear, just because these reference architectures exist (and can be deployed – or, installed) does not mean an organization is restricted to specific designs. There’s a toolbox from which you can pull what you need, designing custom solutions.

So, perhaps now you can understand why Disney, for example, when deciding to build a content delivery platform, chose to create it using a cloud platform – which enables rapid deployment and elastic response – instead of creating their own infrastructure, which they’d have to manage.

Of course, this comes with a price (and I’m not just talking about cash money).

Computer Power is Power and the Concentration of that Power is Hyper Power

Now we get to the meat of the argument which I’ll bullet point for clarity:

  • Computer power is power (indeed, it is one of the critical command and control elements of modern capitalist activity)
  • The concentration of computer power into fewer hands has both operational and political consequences (the operational consequences were on display during the 7 December AWS outage – yeah, I’m calling it an outage, cloud partisans, deal)
  • The political consequence of the concentration of computer power is the creation of critical infrastructure in private hands – a superstructure of technical capability that surrounds the power of other elements of capitalist relationships.

To illustrate what I mean, consider this simple diagram which shows how computer capacity has traditionally been distributed:

Note how every company, with its own data center, is a self-contained world of computing power. The cloud era introduces this situation:

Note the common dependency on a service provider. The cloud-savvy in the audience will now shout, in near unison: ‘but if organizations follow good architectural principles and distribute their workloads across regions within the same cloud provider for resiliency and fault tolerance (yes, we talk this way), there wouldn’t be an outage!’

What they’re referring to is this:

AWS Global Infrastructure Map Showing (approximate) Data Center Locations
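From the client’s side, the resilience argument amounts to something like this failover sketch (the endpoints are placeholders; real deployments usually push this logic into DNS or a load balancer):

```python
# Client-side failover across regions: try the primary region's
# endpoint, fall back to a replica in another region.
import urllib.request

ENDPOINTS = [
    "https://us-east-1.example.com/stream",  # primary region
    "https://us-west-2.example.com/stream",  # failover region
]

def fetch_with_failover(urls: list[str]) -> bytes:
    for url in urls:
        try:
            with urllib.request.urlopen(url, timeout=2) as resp:
                return resp.read()
        except OSError:
            continue  # region down or unreachable; try the next one
    raise RuntimeError("all regions unavailable")
```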

From a purely technical perspective, the possibility of minimizing (or perhaps even avoiding) service disruption by designing an application – for example, a streaming service – to come from a variety of infrastructural locations, while true, entirely misses the point…

Which is that the cloud era represents the shift of a key element of power from a broadly distributed collection of organizations to, increasingly, a few North American cloud providers.

This has broader implications which I explore in greater detail in my Logic Magazine piece.

UPDATE 11 Dec

Amazon has posted an explanation of the outage (known in the industry as a root cause analysis). I’ll be digging into this in detail soon.