Wednesday, June 11, 2025

How jagged AI botches research: an in-depth example of artificial jagged intelligence at work

Annotated screenshot of a Google AI-enhanced search result that is seriously incorrect
Annotated screenshot of Google AI making errors describing the Vienna computer virus

During a recent research project at the intersection of artificial intelligence (AI) and cybersecurity, I had occasion to refresh my memory about a computer virus from the 1980s known as the Vienna virus. So I put the words vienna and virus into Google Search. At first glance the result delivered by Google's AI Overview feature looked quite impressive. This is not surprising because this feature, hereinafter referred to as GAIO, is powered by Google's Gemini large language model, one of the most expensive AI models ever built, with costs rivaling OpenAI's GPT-4.

Sadly, the information about the Vienna virus that GAIO so confidently laid out was both laughably inaccurate and seriously troubling (as I explain in depth below). Whether you call this hallucinating or just plain "getting it wrong," it is important to know that today's AI can tell you things that aren't true, but in ways that make it seem like they are true.

Welcome to the rough and unready world of Artificial Jagged Intelligence

To be clear, millions of people and organizations are, right now, in 2025, using a technology that has been widely praised and promoted as exhibiting intelligence, yet keeps making dumb errors, the kind that in real life would be attributed to a serious lack of intelligence. Some of these errors have been trivialized as hallucinations because they combine pieces of information that are individually real into statements that are false (see my 2024 LinkedIn article: Is your AI lying or just hallucinating?).

I find it both weird and troubling that many billions of dollars are currently being spent to market and deploy this flawed AI technology. You would think persistent errors and hallucinations would give the leading commercial entities behind AI pause. But no, they keep marching onward in the hope of progress. However, they do have a new term for this state of affairs: Artificial Jagged Intelligence.

AI leaders have a new term [jagged] for the fact that their models are not always so intelligent

That's right, Google's billionaire CEO, Sundar Pichai, recently used the term "artificial jagged intelligence or AJI" to describe the current state of AI, saying: "...you can trivially find they make errors or counting R's in strawberry or something, which seems to trip up most models...I feel like we are in the AJI phase where [there's] dramatic progress, some things don't work well, but overall, you're seeing lots of progress."

(I find it weirdly refreshing yet deeply scary that the billionaire CEO of a trillion-dollar company said that about a technology which he and his employer appear to be pushing into homes and businesses as fast as they can.) 

Getting back to the jagged AI response to my simple search query about the Vienna virus, I decided to investigate how it came about. Fortunately, I am my own employer and can afford to treat my interactions with AI as experiments. In this case the experiment became: Determine the extent to which GAIO understands the history and concepts of malicious code, and explore why it gets things wrong.

Here's the short version of what follows: 

  • Google's AI Overview is an example of Artificial Jagged Intelligence or AJI, which sometimes responds to user queries with information that is incorrect.
  • LLMs like ChatGPT and DeepSeek also exhibit this behavior, and I give links to examples.
  • AIs may not check whether the "facts" they present are even feasible, even though they have been trained on data from which such infeasibility could be determined.
  • Some AIs, like GAIO and ChatGPT, don't seem to ingest corrections (errors pointed out by users may be acknowledged by the AI, but nevertheless repeated in the future). 
  • GAIO seems to use sketchy source weighting that gives more credence to content on some websites than others.
  • The same appears to be true of other widely used AIs.

Bottom line: It would be foolish to publish or repeat anything that the current generation of Artificial Jagged Intelligence systems tell you unless you have verified that it is accurate, fair, and true. Such a lopsided risk/reward ratio casts doubt on the value of this technology. (See: Trump administration's MAHA Report AI Fiasco.)

Where's the Intelligence in this jagged AI?

The annotated screenshot at the top of this article shows what Google's AI Overview said about the Vienna virus back in April (n.b. in this article the term virus refers exclusively to viral computer code). If you are familiar with the history of malicious code you may guffaw when you read it. Here's why:

  • If the Vienna virus was found in 1987 it could not have been one of the first macro viruses because in 1987 macros were not capable of being viral.
  • The 1995 Concept virus is generally considered to be the first macro virus. 
  • The Vienna virus did not display a "crude drawing of Michelangelo's David".  
  • I can find no record of any virus creating a "crude drawing of Michelangelo's David."
  • There was a boot sector virus called Michelangelo that appeared in 1991, but it had nothing to do with the artist and got its name from the fact that it activated on March 6, which just happens to be the artist's birthday.

There is more bad news: GAIO's response when asked about the Vienna virus on June 1, nearly two months after the erroneous results in April, was just as erroneous: 

Screenshot of AI output that contains errors

Clearly, GAIO is not getting more knowledgeable over time. This is troubling because Google's Gemini, the AI behind Google AI Overview, does appear to have an accurate understanding of Vienna and knows that it is notable in the history of cybersecurity, as you can see in this exchange on June 1:
Screenshot of accurate AI output

At this point you might be wondering why I asked AI about the Vienna virus. Well, technically, I didn't. I started out just doing a search in Google to refresh my memory of this particular piece of malicious code before I mentioned it in something I was writing (pro tip: don't ever publish anything about malicious code without first doing a fact-check; malware experts can be merciless when they see errors).

In responding to my search query, it was Google's idea to present the AI Overview information, produced with the help of its incredibly expensive and highly resource-intensive Gemini AI. The fact that it was so obviously wrong bothered me and I felt the need to share what I had stumbled upon. Because I tend to see life as a series of experiments, when actions that I take lead to errors, problems, or failures, I try to learn from them.
 

Applied learning and cybersecurity


When Google gave me these problematic errors, I knew right away that I could put this lesson to use in my AI-related cybersecurity classes. (These have become a thing over the past five years as I have researched various aspects of AI from a perspective informed by my cybersecurity knowledge, which has been accumulating gradually since the 1980s.)

In the process of teaching and talking about cybersecurity and cybercrime in the 2020s I have realized that many students don't know a lot about the history of malicious digital technology and this can seriously undermine their efforts to assess the risks posed by any new technology, including AI. 

For example, if you know something about the history of computer viruses, worms, Trojans and other malicious code, you will have an idea of the lengths to which some people will go to deceive, damage, disrupt, and abuse computers and the data they process. Furthermore, you will appreciate how incredibly difficult it is to foil aggressively malicious code and the people who spend time developing it.

Fortunately, I know a thing or two about the history of computer viruses and other forms of malicious code (collectively malware), as well as the antivirus products designed to thwart them. This is not just because I started writing about them back in the 1980s. As it happens, the best corporate job I ever had was working at ESET, one of the oldest antivirus firms and now Europe's largest privately held cybersecurity company. (Disclaimer: I have no financial connections to ESET and no financial incentive to say nice things about the company.)

Working at ESET from 2011 to 2019 I had the privilege of collaborating with a lot of brilliant people, one of whom, Aryeh Goretsky, was the first person that John McAfee hired, way back in 1989. Aryeh has since become a walking encyclopedia of antivirus lore and helped me with some of the details of Vienna here (but any errors in what I've written here are entirely mine). 

Back in the 1980s, there were probably fewer than two dozen computer viruses "in the wild" — the industry term for malicious code seen outside of a contained/managed environment. However, some of these viruses in the wild were very destructive and efforts to create tools to defend computers against them—such as the software that became known as McAfee Antivirus—were only just gearing up.

One such effort had begun in 1987 in the city of Bratislava in what was then the Czechoslovak Socialist Republic, a satellite state of the Soviet Union. That's where two young programmers, Miroslav Trnka and Peter Pasko, encountered a computer virus that was dubbed "Vienna" because that is where people thought it originated.

There is considerable irony in the fact that an AI today can spout nonsense about a virus found back then, because Trnka and Pasko went on to create a company that did important early work with proto-AI technology, for reasons I will now explain. 

The Actual Vienna Virus


What the actual Vienna virus did was infect files on MS-DOS PCs (personal computers which ran the Microsoft Disk Operating System). Specifically, it infected program files that had the .COM filename extension. Here is a technical description from a relatively reliable source, and as you can see it differs considerably from Google's flawed AI Overviews:
Vienna is a non-resident, direct-action .com infector. When a file infected with the virus is run, it searches for .com files on the system and infects one of them. The seconds on the infected file's timestamp will read "62", an impossible value, making them easy to find. One of six to eight of the files will be destroyed when Vienna tries to infect them by overwriting the first five bytes with the hex character string "EAF0FF00F0", instructions that will cause a warm reboot when the program is run. — Virus Encyclopedia
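
For readers who want to see the mechanics, here is a minimal Python sketch of how that impossible 62-second timestamp could serve as an infection-marker check. It is purely illustrative and not the original scanner's code; the 16-bit MS-DOS time layout is standard, but the function names and sample value are mine.

# Minimal sketch: spotting Vienna's "62 seconds" marker in a DOS-style
# 16-bit time field (bits 15-11: hours, 10-5: minutes, 4-0: seconds / 2).

def dos_time_seconds(dos_time: int) -> int:
    """Decode the seconds component of an MS-DOS 16-bit time field."""
    return (dos_time & 0x1F) * 2   # low 5 bits store seconds divided by two

def looks_vienna_marked(dos_time: int) -> bool:
    """Vienna stamped infected .COM files with an impossible 62-second value."""
    return dos_time_seconds(dos_time) == 62

# Hypothetical example: a timestamp of 13:45:62, which no normal file should carry
sample = (13 << 11) | (45 << 5) | 31
print(looks_vienna_marked(sample))   # True
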
When the programmers Trnka and Pasko encountered this very compact yet destructive piece of viral code, they took a stab at writing a program that could detect the code and thus alert users. When they achieved a working version they shared it with friends. They called it NOD, which stands for "Nemocnica na Okraji Disku" ("Hospital at the End of the Disk"), a pun on the Czechoslovak medical drama series Nemocnice na kraji města (Hospital at the End of the City). —Wikipedia

(To me, this name reflects the ethos of many early anti-virus researchers who felt that protecting computer systems was more like healthcare for IT than just another opportunity to make money off human weaknesses.) 

List of Vienna virus variants
When new viruses appeared in the wild, the NOD software was updated, but the effort required to do this kept increasing as more virus code appeared in the wild. Some of that new code consisted of variations of earlier code, and Trnka and Pasko could see that attempting to identify viruses purely by comparing all new executable code to a growing database of known malicious code would not be a sustainable long-term strategy.

Indeed, if Google's AI were really clever, it would have noted that the proliferation of virus variants is one of the most notable facts about the Vienna virus. The list on the right shows some of the dozens of variants of Vienna that were discovered in the years after it first appeared. I think I'm right in saying that there are two main reasons for this: 
  1. The original Vienna virus was a relatively simple piece of code; and
  2. In 1988 that code was made public, notably being published in a book. "Unfortunately the source code to this virus has been published in a book: Computer viruses: A High-Tech Disease which has resulted in multiple variants of the virus." — F-Secure virus directory
Getting back to the birth of the NOD antivirus software, in the late 1980s it was clear that antivirus programs could have significant commercial value, but back then the state of Czechoslovakia was not open to private enterprise because it was a satellite state of the Soviet Union. 

Fortunately, by the end of 1992, the independent Czech and Slovak republics had come into existence and the makers of NOD created a Slovak company called ESET to market their antivirus as a commercial product. (ESET is the Czech name for Isis, the Egyptian goddess of health, marriage and love, reinforcing the idea that antivirus software is intended to keep computers healthy.)

By this time it was clear to the programmers and data scientists at ESET that their heuristic approach to identifying and blocking malware was the way to go, i.e., identifying unknown or previously unseen malware by analyzing code behavior, structure, or patterns.
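
To make the idea of a static heuristic concrete, here is a toy Python sketch of a rule-based scorer. It is my own illustration, not anything resembling NOD's engine; the indicator list and the entropy threshold are invented purely for demonstration.

import math

def entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte; packed or encrypted code scores high."""
    if not data:
        return 0.0
    probs = [data.count(b) / len(data) for b in set(data)]
    return -sum(p * math.log2(p) for p in probs)

# Hypothetical indicators, loosely inspired by Vienna-style behavior
SUSPICIOUS_PATTERNS = {
    b"*.COM": "searches for other .COM programs to modify",
    b"\xEA\xF0\xFF\x00\xF0": "far jump to the BIOS reset vector (warm reboot)",
}

def heuristic_score(code: bytes) -> int:
    """Count crude indicators; a real engine weights and combines many more."""
    score = sum(1 for pattern in SUSPICIOUS_PATTERNS if pattern in code)
    if entropy(code) > 7.0:   # near-random bytes suggest packing or encryption
        score += 1
    return score

print(heuristic_score(b"\xEA\xF0\xFF\x00\xF0" + b"*.COM" + b"\x90" * 64))   # 2
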

As the 1990s rolled on and new forms of computer viruses, worms, and Trojan code appeared — such as the macro viruses mentioned earlier — ESET experimented with machine learning and then deep learning with neural networks to implement this heuristic approach to malware detection and response.
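
As a rough illustration of what machine learning on static code features can look like (and emphatically not a description of ESET's actual models), here is a toy Python sketch that trains a scikit-learn classifier on byte-frequency histograms. The tiny labelled "corpus" is fabricated purely for demonstration.

import numpy as np
from sklearn.ensemble import RandomForestClassifier

def byte_histogram(data: bytes) -> np.ndarray:
    """256-bin, normalized byte-frequency feature vector for a code sample."""
    counts = np.bincount(np.frombuffer(data, dtype=np.uint8), minlength=256)
    return counts / max(len(data), 1)

# Hypothetical training samples: (raw bytes, label) with 1 = malicious
corpus = [
    (b"\xEA\xF0\xFF\x00\xF0" * 40, 1),
    (b"*.COM" * 30 + b"\xEA\xF0\xFF\x00\xF0", 1),
    (b"\x90" * 200, 0),
    (b"Hello, world!" * 20, 0),
]
X = np.array([byte_histogram(code) for code, _ in corpus])
y = np.array([label for _, label in corpus])

model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

# Score a previously unseen sample; columns are [P(benign), P(malicious)]
unseen = byte_histogram(b"\xEA\xF0\xFF\x00\xF0" + b"\x00" * 100)
print(model.predict_proba([unseen]))
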

What's worse than being wrong? Not knowing why

Naturally, I learned a lot about the benefits and pitfalls of these foundational elements of artificial intelligence during my time as a researcher at ESET. I was fortunate to interact on a regular basis with some brilliant minds working on these AI-versus-malware experiments. I recall one particular presentation about seven or eight years ago that described a neural network achieving an almost perfect result when tasked with finding instances of malicious code hidden within a massive collection of mainly legitimate code.

I say 'almost perfect' because even though 100% of the malware was successfully identified — a very impressive result — there was one very troubling false positive, a piece of legitimate code falsely flagged as malicious. Bear in mind that 100% detection with zero false positives is the holy grail of malware detection, and this test came tantalizingly close. However, the data scientist presenting these results described them as disappointing and deeply troubling because nobody could figure out why the system deemed that particular piece of good code to be bad.
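
To put numbers on why that single false positive mattered, here is a quick worked example with hypothetical test-set sizes (the real figures from that presentation are not reproduced here):

malware_total, benign_total = 10_000, 1_000_000   # hypothetical sample counts
true_positives = 10_000                            # every malware sample caught
false_positives = 1                                # one clean file wrongly flagged

detection_rate = true_positives / malware_total        # 1.0 -> "100% detection"
false_positive_rate = false_positives / benign_total   # one in a million
print(f"detection={detection_rate:.2%}, FPR={false_positive_rate:.4%}")

On paper those numbers look excellent, which is exactly why the unexplained false positive was so unsettling.
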

That was my first exposure to the twin problems that have been called Interpretability and Explainability: the ability to understand how an AI model makes decisions (interpretability), and the capacity to provide human-understandable explanations for a model's output, even if the model's inner workings are not transparent (explainability). 

Eight years on from that memorable talk, the sorry saga of the Vienna virus proves that these two problems — together with a third: reproducibility — still plague some of the most widely used AI models, systems that cost hundreds of millions of dollars to build and maintain. The reality is that today's most widely used form of AI is seriously flawed.

Guessing the Root of an LLM GPT Problem

My best guess as to why the AI feature integrated into Google Search (GAIO) jaggedly spouted nonsense about the Vienna virus goes like this (a toy code sketch of this failure mode follows the list):

  1. It is optimized for speed so it responds with the first 'hit' that it gets on the search topic IF that hit is confirmed by a second source.
  2. It uses a constrained list of ranked sources that leans on platform reputation.
  3. It doesn't refer to past interactions about the search topic.
  4. It doesn't perform adequate logic checks on its response.
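
Here is a speculative, toy Python reconstruction of that guessed failure mode. To be clear, this is not Google's actual pipeline; the data structures, helper names, and "confirmation" rule are all hypothetical, chosen only to show how a copied error can look like corroboration.

from dataclasses import dataclass

@dataclass
class Hit:
    source: str
    claim: str

def agrees(a: Hit, b: Hit) -> bool:
    """Naive 'confirmation': a second source repeating the same claim counts."""
    return a.claim == b.claim

def ai_overview(hits: list[Hit]) -> str | None:
    first = hits[0]   # optimized for speed: take the top-ranked hit
    confirmed = any(agrees(first, other) for other in hits[1:])
    # No feasibility check (could a macro virus exist in 1987?) and no memory
    # of corrections offered by users in earlier sessions.
    return first.claim if confirmed else None

hits = [
    Hit("linkedin.com", "Vienna (1987) was one of the first macro viruses"),
    Hit("university.edu", "Vienna (1987) was one of the first macro viruses"),
]
print(ai_overview(hits))   # the error is 'confirmed' because source #2 copied source #1
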

In the case of the Vienna virus, I think the first thing GAIO found was an error-filled article on LinkedIn. I am not going to name the person who wrote the article but here is what it said: 

"The Vienna virus, was a computer worm that originated in Vienna, Austria and is considered one of the first macro viruses. It was spread via Microsoft Word documents via floppy disk. The virus would infect the document template, then replicate itself by creating new copies of infected documents on any floppy disks inserted into an infected machine."

Sounds familiar, right? And the source looks very credible, as you can see here:

Screenshot of a LinkedIn article that contains errors

As for a second source to confirm the first source, that was easy to find because much of the incorrect information from the LinkedIn article was repeated in an article titled "Viruses of the 80s" on a university website in July of 2024 (perdue.edu). Again, I'm not going to name the author, but they wrote, in part: "Originating in Vienna, Austria this virus spread by way of Microsoft Word documents via floppy disks." In other words, this is the Word macro error all over again. 

Was this plagiarism? Hard to say. But given the date, it is possible that the 2024 article is based on AI-generated output that parrots the 2022 LinkedIn article. And because GAIO assumes factual validity without topic-based reasoning, errors that are obvious to humans can get compounded.

All of which raises serious questions about any consequential use of AI, since the large, publicly available models are clearly not to be trusted. Relying on them in any aspect of business or service delivery is asking for trouble unless it is done within a comprehensive risk management framework that includes humans in the loop.

We saw this writ large in the Trump administration's Make America Healthy Again report, which appears to have relied heavily on AI without adequate human-in-the-loop risk management (see RFK Jr.’s Disastrous MAHA Report Seems to Have Been Written Using AI). This hugely embarrassing — and very public — AI-riddled publication exposed the issue of "hallucinated" references for the whole world to see. 

(As noted earlier, when I encountered the citation issue in my own work in 2024 I documented it on LinkedIn, where it was seen by a significantly smaller audience than the whole world.)

I have also documented examples of popular AIs getting facts wrong even after they have been corrected. You can see the video version of this on YouTube.

Hopefully, these examples will help people better understand the limitations of current AIs and why they must only be used with great care.