ChatGPT and other LLMs have taken the world by storm – ChatGPT reached 100 million users faster than any software product before it.
The capabilities of this technology can be astonishing – creating poetry, writing code, generating images – there are clearly huge benefits.
But at the same time, we’re seeing an online backlash, particularly about privacy, with many scaremongering articles suggesting that sharing data with AI tools is very high risk.
For any business leader, these are not just technical questions; they are critical business questions. The good news is that you have far more control over your data than the scaremongering articles suggest.
Let’s take a deeper look.
The Fundamentals
What happens when we type some information (a prompt) into ChatGPT?
The prompt, along with any attachments, is sent to OpenAI's servers, where networks of GPUs (specialised processors designed to run AI models) take the prompt as input.
Internally, the model performs a vast number of mathematical calculations – billions for every word it generates – which transform the input data into the output that appears on our screens.
By default, this process doesn’t store the prompt data at all – it just uses it.
But OpenAI can modify this process. First, in chat applications, it will store the prompt and include it in subsequent prompts so that the model ‘remembers’ what has previously been said. Second, OpenAI can store the prompt and output and use this to help improve the model, updating the internal “weights” that define the model configuration.
This second step could, potentially, lead to a privacy issue – if the original prompt says "Fred Bloggs has cancer", that fact could be absorbed into the model during training, and if someone else later asks "Does Fred Bloggs have an illness?" it could potentially appear in the output.
By default, consumer versions of tools like ChatGPT are set up to use both steps. However, for business use, the rules are very different, and the controls are much stronger.
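The "memory" in the first step is simpler than it sounds: the model itself is stateless, so the chat application just resends the accumulated conversation with every new prompt. A minimal sketch of this mechanism (function and variable names are illustrative, not OpenAI's actual code):

```python
# Sketch of how a chat application maintains "memory": the model is
# stateless, so the full conversation history is resent each turn.

def build_request(history, new_prompt):
    """Return the message list that would be sent to the model."""
    return history + [{"role": "user", "content": new_prompt}]

history = [{"role": "system", "content": "You are a helpful assistant."}]

# Turn 1: the request contains the system message plus the new prompt.
request_1 = build_request(history, "What is the capital of France?")
history = request_1 + [{"role": "assistant", "content": "Paris."}]

# Turn 2: the earlier question and answer are included again, which is
# why the model appears to "remember" them.
request_2 = build_request(history, "What is its population?")
print(len(request_2))  # 4 messages: system, user, assistant, user
```

This is also why each prompt in a long conversation costs more to process: the whole history travels with it.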
OpenAI Protections
It is not in the interest of major AI providers for their models to leak confidential data; their reputation and business model depend on trust. For example, if someone found that OpenAI was routinely capturing and leaking information from prompts, its market would be severely damaged.
Looking at the OpenAI privacy policy - https://openai.com/en-GB/policies/row-privacy-policy/ - we see that they explicitly say they will use personal information to improve the service, and they reference a second article on how prompts can be used to enhance the model - https://help.openai.com/en/articles/5722486-how-your-data-is-used-to-improve-model-performance.
This sounds worrying. But remember that OpenAI doesn't want user information disclosed – so it has most likely designed the training process to exclude such information. OpenAI can still get huge value from using millions of daily AI conversations to train other aspects of the model – its grasp of natural language, for example.
As a result, they provide clear and robust controls for businesses.
- For Individuals: Most services now offer a simple privacy setting to "opt-out" of having your content used for training.
- For Businesses (The Important Part): For professional users, the protection is built-in. Major providers explicitly state that they do not train on data submitted via their business products. This includes:
- API Access: When we build a custom application for you that calls an AI model via its API, your data is not used for training by default.
- Business Tiers: Services like ChatGPT Team and ChatGPT Enterprise are designed for organisational use and have data training turned off by default.
This single fact cuts through most of the FUD. Using the commercial API or a business-tier account is the first and most effective layer of protection.
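In practice, "API access" just means your application sends an HTTPS request to the provider's endpoint, and data submitted this way falls under the business terms rather than the consumer ones. A sketch of what such a request looks like for OpenAI's chat completions API (no network call is made here, and the API key is a placeholder):

```python
import json

# Sketch of the request an application sends to OpenAI's chat
# completions endpoint. Data submitted via the API is excluded from
# training by default under OpenAI's business terms.
API_URL = "https://api.openai.com/v1/chat/completions"

def build_chat_request(api_key, prompt, model="gpt-4o"):
    """Return the headers and JSON body for a chat completion call."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    })
    return headers, body

headers, body = build_chat_request("sk-...", "Summarise our Q3 report.")
print(json.loads(body)["model"])  # gpt-4o
```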
Choosing Your Deployment Model
Beyond simply using a business account, you can choose how your AI model is deployed. Think of it as a spectrum offering increasing levels of control, security, and cost.
| Deployment Model | How It Works | Trust Factor |
|---|---|---|
| Public APIs | Access provider’s model via secure API | Trust provider’s privacy policy |
| Hosted Open-Source | Public model (e.g., Llama) on cloud platform (AWS, Azure) | Trust hosting provider, not model creator |
| Private Cloud Deployment | Dedicated server instance in cloud for your organisation | High security; trust cloud provider |
| Local LLMs | Model runs entirely on your own hardware | Complete control; highest security and cost |
Level 1: Public APIs (e.g., OpenAI, Anthropic, Google Gemini):
- How it works: You access the provider's model via their secure, commercial API.
- Trust factor: You trust the provider's privacy policy that they won't train on your business data. This is the standard for most applications.
Level 2: Hosted Open-Source Models (e.g., Llama on AWS or Azure):
- How it works: You use a public, open-source model (like Meta's Llama) but run it on a major cloud platform like Amazon Web Services (AWS), Google Cloud, or Microsoft Azure.
- Trust factor: This creates a powerful separation of concerns. The hosting provider (Amazon) has no incentive to send your prompts back to the model's creator (Meta). You are trusting a different contractual agreement, often one your business already has in place.
Level 3: Private Cloud Deployment:
- How it works: We configure a dedicated server instance within a cloud platform (like AWS or Azure) that runs an AI model exclusively for your organisation.
- Trust factor: This eliminates the risk of sharing processing resources with other companies. The main remaining risks are misconfiguration on your own side and, in theory, a malicious act by the cloud provider itself – the latter being extremely unlikely and a major breach of trust and law. This is a very secure but expensive option.
Level 4: Local LLMs:
- How it works: For ultimate security, the model runs entirely on your own hardware. Your data never leaves your network.
- Trust factor: You are in complete control. This is suitable for organisations with extreme security requirements, but it comes with significant hardware and maintenance overhead.
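One practical consequence of this spectrum: many hosted and local runtimes (Ollama and vLLM, for example) expose OpenAI-compatible endpoints, so moving between levels can often be as simple as changing a base URL in configuration. A sketch of that idea – all URLs except the public OpenAI one and Ollama's default local port are purely illustrative:

```python
# Sketch: moving between deployment levels often means changing only
# the base URL, because many runtimes expose OpenAI-compatible APIs.
# The hosted and private-cloud URLs below are illustrative examples.
ENDPOINTS = {
    "public_api": "https://api.openai.com/v1",                # Level 1
    "hosted_oss": "https://llm-host.example.com/v1",          # Level 2
    "private_cloud": "https://llm.internal.example.com/v1",   # Level 3
    "local": "http://localhost:11434/v1",  # Level 4 (Ollama default port)
}

def chat_url(level):
    """Return the chat completions URL for the chosen deployment level."""
    return ENDPOINTS[level] + "/chat/completions"

print(chat_url("local"))  # http://localhost:11434/v1/chat/completions
```

Because the application code stays the same, you can start at Level 1 and migrate towards Level 4 as your security requirements tighten.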
A Practical Comparison of Major AI Providers
To help you assess the landscape, here is a breakdown of the major providers. As a business, you should always refer to the specific terms for the API or Enterprise service, not the free consumer tool.
| Company | Headquarters | Default Policy for API/Business Data | Opt-Out Available? |
|---|---|---|---|
| OpenAI | United States | Not used for training. | N/A (Opt-out is for consumers; business data is excluded by default). |
| Microsoft Azure AI | United States | Not used for training. Your data is processed only for the response. | N/A (Business data is excluded by default). |
| Google Cloud AI | United States | Not used for training without explicit permission. | N/A (Business data is excluded by default). |
| Anthropic | United States | Not used for training. | N/A (Opt-out is for consumers; API/business data is excluded by default). |
| Amazon Bedrock | United States | Not used for training by Amazon or third-party model providers on the platform. | N/A (Data is excluded by default). |
| DeepSeek | China | Policy may be less clear and subject to different legal frameworks. Requires careful review. | Requires careful review. |
| Cursor | United States | Data is used for training, but they offer a "Privacy Mode" to prevent this. | Yes, via "Privacy Mode." |
Don't Confuse Data Privacy with AI Accuracy
It's important not to confuse two separate trust issues. The first, which we've covered, is data privacy. The second is model accuracy.
LLMs can "hallucinate" – confidently state incorrect information. This erodes a user's general trust in the technology. While providers are working hard to fix this, it's a separate technical challenge from data security. It is, however, another key reason why AI implementation requires expert handling. A well-designed system doesn't just call an AI; it has checks and balances, such as using Retrieval-Augmented Generation (RAG) to ground answers in your company's real documents, reducing hallucinations and building user trust.
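The RAG idea is simple to illustrate: retrieve the most relevant passages from your own documents first, then build a prompt that instructs the model to answer only from them. Real systems rank documents with vector embeddings; in this sketch, simple word overlap stands in for the retrieval step, and all documents and function names are made up for illustration:

```python
# Minimal sketch of Retrieval-Augmented Generation (RAG).
# Production systems use vector embeddings for retrieval; here,
# word overlap stands in so the example is self-contained.
DOCUMENTS = [
    "Our refund policy allows returns within 30 days of purchase.",
    "Support is available Monday to Friday, 9am to 5pm UK time.",
    "All customer data is stored in UK data centres.",
]

def tokens(text):
    """Lower-case words with surrounding punctuation stripped."""
    return {w.strip(".,?!").lower() for w in text.split()}

def retrieve(question, documents, top_k=1):
    """Rank documents by how many words they share with the question."""
    q = tokens(question)
    scored = sorted(documents,
                    key=lambda d: len(q & tokens(d)),
                    reverse=True)
    return scored[:top_k]

def grounded_prompt(question):
    """Build a prompt that restricts the model to the retrieved context."""
    context = "\n".join(retrieve(question, DOCUMENTS))
    return ("Answer using ONLY the context below. "
            "If the answer is not there, say so.\n\n"
            f"Context:\n{context}\n\nQuestion: {question}")

print("30 days" in grounded_prompt("What is the refund policy?"))  # True
```

Because the model is told to answer only from retrieved company documents, it has far less room to invent facts, and its answers can be traced back to a source.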
Your Practical Risk-Mitigation Checklist
The risks associated with AI data privacy are real, but they are manageable and, we believe, often overstated. To protect your business, follow these steps:
- Always Use Business Tiers: Never use free, consumer-grade AI tools for confidential company work. Mandate the use of API access or Enterprise accounts.
- Use a Recognised Provider: Stick to established providers from jurisdictions with strong data protection laws, like the US, UK, and EU. Avoid providers with unclear policies or those based in regions with weaker legal safeguards.
- Consider Data Residency (GDPR): If using personal data, be aware of where your provider is located. Using a provider in an approved country (like the USA, under the UK-US Data Bridge framework) or hosting a model in a UK/EU data centre is essential.
- Partner with an Expert: The AI landscape is complex and changes weekly. Working with a specialist partner like Blueberry AI ensures you are making informed, secure, and commercially sound decisions.
We don't just build AI systems; we build trusted, robust, and practical solutions designed for the real world.