In September 2022, hackers used prompt injection to attack recruitment company Remoteli.io’s GPT-3-powered Twitter chatbot.

The attackers fed the bot malicious inputs, leading it to reveal the original instructions it had been given and to generate inappropriate replies about “remote work,” damaging the start-up’s reputation and exposing it to a boatload of legal risk. The episode is a stark reminder of how easily AI systems can be manipulated, and it points to a related, more insidious threat: data poisoning, also known as AI poisoning.

It’s a kind of cyberattack in which hackers corrupt the training datasets of AI (artificial intelligence) and ML (machine learning) models by introducing misleading information, modifying existing data, or deleting important data points. The goal? To mislead the model into making incorrect decisions or predictions, resulting in poor generalisation. With October being Cybersecurity Awareness Month, let’s delve into the world of data poisoning and explore what makes it both dangerous and relevant in today’s threat landscape.

How Do Data Poisoning Attacks Work?

As mentioned earlier, the goal of data poisoning is to overtly or subtly undermine the reliability and accuracy of AI models. It isn’t limited to traditional AI systems; it also applies to RAG (retrieval-augmented generation) models, where the model draws on retrieved real-world data to improve its responses. These adversarial attacks can wreak havoc in industries that depend on AI-driven decisions, such as healthcare, finance, and even autonomous systems, by causing models to misbehave.

Broadly speaking, data poisoning attacks fall into two categories: direct and indirect. Counterintuitively, direct (or targeted) attacks preserve the model’s overall performance and capabilities, manipulating it to behave in a particular way only for specific inputs. For instance, hackers could inject subtly altered images of specific people (adding accessories, changing hair colour) into the training datasets of facial recognition systems trained to identify individuals from their images.
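The essence of a targeted attack can be sketched in a few lines of Python. This is a minimal, hypothetical illustration (the function and dataset names are invented): labels are flipped only for one target identity, so accuracy on everyone else is untouched — the hallmark of a direct attack.

```python
import random

def poison_targeted(dataset, target_label, new_label, fraction=1.0, seed=0):
    """Flip labels only for samples of one target class.

    dataset: list of (features, label) pairs, where features stand in
    for image embeddings. All other classes are left intact, so overall
    model performance is preserved -- only the target is affected.
    """
    rng = random.Random(seed)
    poisoned = []
    for features, label in dataset:
        if label == target_label and rng.random() < fraction:
            poisoned.append((features, new_label))   # mislabel the target
        else:
            poisoned.append((features, label))       # leave the rest alone
    return poisoned

# Toy "face dataset": tuples stand in for image embeddings.
clean = [((0.1, 0.2), "alice"), ((0.9, 0.8), "bob"), ((0.15, 0.25), "alice")]
bad = poison_targeted(clean, target_label="alice", new_label="mallory")
```

A model trained on `bad` would still recognise everyone except “alice”, whose images now carry an attacker-chosen identity.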

On the other hand, the goal of indirect, a.k.a. non-targeted, attacks is to degrade the AI model’s overall performance by injecting irrelevant data or random noise into the training set, thus impairing the model’s ability to draw inferences from its training data. For instance, hackers could introduce large volumes of irrelevant emails into spam detection systems trained on email datasets labelled spam/not spam. This influx could confuse the model, raise its rate of false positives and negatives, and ultimately reduce its effectiveness.
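By contrast, a non-targeted attack can be sketched as simple dilution. In this hypothetical example (names are invented for illustration), random junk samples with random labels are mixed into a spam/ham training set, blurring the decision boundary across all classes rather than one.

```python
import random

def inject_noise(dataset, n_noise, labels, n_features=2, seed=0):
    """Non-targeted attack: dilute training data with random junk.

    Each noise sample gets random features and a random label, so a
    model trained on the result degrades across ALL classes.
    """
    rng = random.Random(seed)
    noisy = list(dataset)
    for _ in range(n_noise):
        features = tuple(rng.random() for _ in range(n_features))
        noisy.append((features, rng.choice(labels)))
    return noisy

# Toy spam corpus: tuples stand in for email feature vectors.
emails = [((1.0, 0.0), "spam"), ((0.0, 1.0), "ham")]
polluted = inject_noise(emails, n_noise=8, labels=["spam", "ham"])
```

Here four fifths of the poisoned set is noise; real attacks often need far smaller fractions, introduced gradually, to do damage.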

There are numerous ways in which data poisoning attacks can take place. Hackers can use backdoor attacks to embed hidden triggers, imperceptible to the human eye, in the training dataset. The model behaves in an attacker-chosen way when it encounters these embedded triggers, which can result in manipulated outputs and security breaches. Then there are data injection attacks, where the goal is to manipulate the model’s behaviour during deployment by adding malicious samples to the dataset.
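A backdoor attack can be sketched the same way. In this hypothetical example (the `TRIGGER` pattern and all names are invented), a small trigger is stamped onto a handful of samples, which are then relabelled with the attacker’s chosen class. A model trained on this data behaves normally on clean inputs but misclassifies anything carrying the trigger.

```python
import random

# Hypothetical trigger pattern -- in a real image this might be a few
# pixels in one corner, imperceptible to the human eye.
TRIGGER = (9.9, 9.9)

def add_backdoor(dataset, attacker_label, n_poison, seed=0):
    """Stamp a hidden trigger onto a few samples and relabel them.

    Clean samples are kept unchanged, so the model trains normally on
    them; the stamped copies teach it the trigger -> attacker_label rule.
    """
    rng = random.Random(seed)
    poisoned = list(dataset)
    for features, _ in rng.sample(dataset, n_poison):
        poisoned.append((features + TRIGGER, attacker_label))  # stamped copy
    return poisoned

clean = [((0.1, 0.2), "benign"), ((0.9, 0.8), "benign"), ((0.5, 0.5), "benign")]
backdoored = add_backdoor(clean, attacker_label="attacker_choice", n_poison=2)
```

Because the clean samples remain in the set, ordinary validation metrics stay high, which is what makes backdoors so hard to spot.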

Imagine attackers poisoning a bank’s loan-approval model so that it discriminates against specific demographics when processing loans, leading to reputational loss and legal trouble.

Mislabelling attacks, on the other hand, see hackers modifying datasets by assigning incorrect labels to part of the training data. Because the model learns from this corrupted data, it becomes less accurate, and ultimately unreliable. In data manipulation attacks, hackers alter existing data within the training dataset, either by injecting adversarial samples or by adding incorrect data to skew results. Such attacks can severely degrade the performance of AI/ML models.

Why Data Poisoning Is Concerning For Enterprises

With an increasing number of industries and enterprises adopting LLMs (large language models) and GenAI (generative AI) tools, cybercriminals are having a field day exploiting the open-source nature of the datasets these systems are trained on. They develop innovative attack methods using tools from the dark web, and they are increasingly able to scale and automate their attacks.

What’s surprising is that hackers only need to alter a tiny fraction of the data to render these algorithms ineffective. Because data poisoning can occur subtly over time, as we’ve seen above, attacks are increasingly difficult to identify until significant damage has already been done. They can also be ongoing, with hackers gradually introducing noise or altering datasets without their actions ever becoming immediately visible.

Such attacks could be severely detrimental to sectors such as healthcare and finance, with far-reaching consequences for individuals. Data poisoning could skew diagnostic models, potentially leading to inappropriate treatment recommendations, misdiagnosis, and even life-threatening decisions. Likewise, in the financial sector, the approval of fraudulent transactions and the creation of false profiles could severely undermine the integrity of financial systems, and even destroy lives.

What Lies Ahead?

According to Infosecurity Magazine, nearly a quarter of UK and US organisations had already faced AI data poisoning attacks and intrusions by September 2025. Clearly, the problem is more widespread than most expected. With data poisoning attacks set not just to undermine technical systems but to threaten the integrity of the services the public relies on, we need stronger governance to protect both the public and businesses.

A multifaceted approach of fostering cybersecurity awareness among employees, employing robust model training techniques, continuously monitoring data inputs for anomalies, and ensuring data integrity via strict governance practices is how we build resilience and defend AI systems in the future.


Malavika Madgula is a writer and coffee lover from Mumbai, India, with a post-graduate degree in finance and an interest in the world. She can usually be found reading dystopian fiction cover to cover. Currently, she works as a travel content writer and hopes to write her own dystopian novel one day.

© Copyright Sify Technologies Ltd, 1998-2022. All rights reserved