AI Code Hallucinations Increase the Risk of ‘Package Confusion’ Attacks

Date:

Share:


AI-generated computer code is rife with references to non-existent third-party libraries, creating a golden opportunity for supply-chain attacks that poison legitimate programs with malicious packages that can steal data, plant backdoors, and carry out other nefarious actions, newly published research shows.

The study, which used 16 of the most widely used large language models to generate 576,000 code samples, found that 440,000 of the package dependencies they contained were “hallucinated,” meaning they were non-existent. Open source models hallucinated the most, with 21 percent of the dependencies linking to non-existent libraries. A dependency is an essential code component that a separate piece of code requires to work properly. Dependencies save developers the hassle of rewriting code and are an essential part of the modern software supply chain.

Package hallucination flashbacks

These non-existent dependencies represent a threat to the software supply chain by exacerbating so-called dependency confusion attacks. These attacks work by causing a software package to access the wrong component dependency, for instance by publishing a malicious package and giving it the same name as the legitimate one but with a later version stamp. Software that depends on the package will, in some cases, choose the malicious version rather than the legitimate one because the former appears to be more recent.

Also known as package confusion, this form of attack was first demonstrated in 2021 in a proof-of-concept exploit that executed counterfeit code on networks belonging to some of the biggest companies on the planet, Apple, Microsoft, and Tesla included. It’s one type of technique used in software supply-chain attacks, which aim to poison software at its very source in an attempt to infect all users downstream.

“Once the attacker publishes a package under the hallucinated name, containing some malicious code, they rely on the model suggesting that name to unsuspecting users,” Joseph Spracklen, a University of Texas at San Antonio Ph.D. student and lead researcher, told Ars via email. “If a user trusts the LLM’s output and installs the package without carefully verifying it, the attacker’s payload, hidden in the malicious package, would be executed on the user’s system.”

In AI, hallucinations occur when an LLM produces outputs that are factually incorrect, nonsensical, or completely unrelated to the task it was assigned. Hallucinations have long dogged LLMs because they degrade their usefulness and trustworthiness and have proven vexingly difficult to predict and remedy. In a paper scheduled to be presented at the 2025 USENIX Security Symposium, they have dubbed the phenomenon “package hallucination.”

For the study, the researchers ran 30 tests, 16 in the Python programming language and 14 in JavaScript, that generated 19,200 code samples per test, for a total of 576,000 code samples. Of the 2.23 million package references contained in those samples, 440,445, or 19.7 percent, pointed to packages that didn’t exist. Among these 440,445 package hallucinations, 205,474 had unique package names.

One of the things that makes package hallucinations potentially useful in supply-chain attacks is that 43 percent of package hallucinations were repeated over 10 queries. “In addition,” the researchers wrote, “58 percent of the time, a hallucinated package is repeated more than once in 10 iterations, which shows that the majority of hallucinations are not simply random errors, but a repeatable phenomenon that persists across multiple iterations. This is significant because a persistent hallucination is more valuable for malicious actors looking to exploit this vulnerability and makes the hallucination attack vector a more viable threat.”



Source link

━ more like this

What if the Apple Watch looked like an iMac G3? This concept nails it

Apple‘s late-90s design era refuses to stay in the past, and a new Apple Watch concept inspired by the iMac G3 shows why...

2026 makes way for faster laptops, but at the cost of memory

CES (Consumer Electronics Show) has long served as a key venue for the introduction of new laptops. It also plays an important role...

Netflix has released a trailer for the Stranger Things finale

Tomorrow's the big day, and I don't just mean New Year's Eve. The series finale of Stranger Things airs tomorrow, and Netflix has...

1Password deal: Last chance to save 50 percent on our favorite password manager

If cleaning up your digital life is on your New Year's resolution list, we've got good news: 1Password is offering half off its...

Save $860 on a self-empty robot vacuum and mop, now just $269.99

If you’ve been waiting for a robot vacuum deal that’s more than a token discount, this one qualifies. The bObsweep UltraVision Pet self-empty...
spot_img