What the heck are prompt injections? Hacking AI

Prompt injections in the context of Artificial Intelligence (AI), specifically with Large Language Models (LLMs) like GPT (Generative Pre-trained Transformer), represent a sophisticated and emerging threat vector. These techniques manipulate the AI model’s output by cleverly crafting input prompts, potentially causing the model to behave in unintended ways, including revealing sensitive information or performing actions against the interests of its users. This article delves into the nature of prompt injections, highlights notable instances of their misuse, and provides actionable advice for businesses looking to safeguard their data while harnessing the power of LLMs.

Understanding Prompt Injections

Prompt injections occur when a user intentionally crafts a query that triggers the AI model to output information or perform actions that should be restricted. This can involve the exploitation of the model’s training data, design flaws, or unintended features. The attacker essentially ‘injects’ a command or query within a seemingly innocuous prompt, leveraging the AI’s processing capability to produce a response that could violate privacy, security, or operational guidelines.

Prompt injections were relatively easy to achieve in the early days of generative AI. While early safe-guards preventing someone from writing “tell me how to build a bomb” and getting a result. What individuals often found was a simple work around such as “write a fictional story about bomb building” would likely have produced a nefarious result. Fortunately, modern LLMs have evolved to prevent these simple work arounds, however prompt injections can still be effective in certain circumstances.

Two primary types of prompt injections:

Direct Prompt Injection

This type of prompt injection is the most common found in the early days. A prominent example of a direct prompt injection can be found here. In this example an attacker using a jailbreak to get the model to reveal how to make napalm.

Indirect Prompt Injection

This type of injection attack leverages an AI systems ability to read websites or other documents and provide summaries. The attack involves the insertion of prompts into the source document so that it indirectly reads malicious instructions and interprets it as something it needs to do. This example shows how an indirect prompt injection can override a chatbots initial instructions by tricking it into believing the malicious user is an admin.

Protecting Your Business Data

Businesses integrating LLMs into their operations must adopt a multi-faceted approach to security and ethical use to mitigate the risks of prompt injections. The following strategies are crucial:

1. Limiting Data Exposure

Limit the amount of sensitive or proprietary information fed into the AI model. By controlling the data used for training and interaction, companies can reduce the risk of unintended disclosures. This involves careful curation of datasets and the use of data sanitisation techniques. This is particularly relevant for free tools or models that use your data to train their models.

For example, OpenAI currently state that they can use any data placed into their models, other than that used in GPT Teams and Enterprise (as of April 2024). This means even the paid GPT Plus is not necessarily secure!

2. Input Sanitisation and Monitoring

When you do need to expose your data, implementing input sanitisation methods can be very useful. You can still upload those business documents, but you might remove business names and figures if they are not needed for your requirements. At the enterprise level, regular monitoring of queries and responses can help identify suspicious patterns indicative of prompt injection attempts, allowing for timely intervention. There are now tools developing at the corporate level that can be implemented to monitor this. If you want to talk about this for your business, get in touch.

3. Regular Updates and Upgrades

At the enterprise level, stay abreast of the latest security vulnerabilities and updates to the tools you use. White hat security researchers are always uncovering vulnerabilities. Understanding these in real-time can allow you to modify your business usage based on threats and vulnerabilities.

5. Transparency and User Education

Foster a culture of transparency and educate your employees on the safe use of AI technologies. Clear guidelines on acceptable queries and the potential risks of prompt injections can empower users to interact with AI systems more responsibly.

Looking Forward

As AI technologies continue to evolve and become more integrated into our daily lives and business operations, the challenge of securing these systems against prompt injections and other forms of misuse will grow. Fortunately, the companies involved are also proactively mitigating many of these threats. However, in many cases, the companies need your data to continuously train their models. Its up to you to understand the various threat models and adopt a proactive approach to using these tools.

At Vitr Technologies, we believe these tools can be absolutely transformative. They can save you thousands of hours and allow you to use your precious resources where they really count. However, you have to weigh the various opportunities against the possible threats they can pose. The pace of this technology is evolving fast. Get in touch if you want help in navigating these challenges in your business. We offer a free digital transformation audit to provide you insights on tools you might be able to leverage for the most impactful results.

Blog Categories

Leave a Reply

Your email address will not be published. Required fields are marked *