
External AI, its underlying risks, and how businesses can prepare

By Vaidotas Sedys, Head of Risk, Oxylabs

As AI’s presence in business operations continues to grow, many companies remain unaware of the risks of using open third-party AI models. The most obvious risk is data leakage, whereby private internal data is fed into a public AI model. A related risk is the corruption of a company’s systems and datasets by malicious data flowing into its network.

A failure to address such risks leaves companies exposed to financial losses and regulatory penalties, with likely reputational damage and wavering customer loyalty. By following a clear two-part internal AI strategy, businesses can utilize generative AI whilst remaining cautious and protecting the company’s data and reputation.

The rising wave of AI use by employees 

75% of workers are using generative AI tools daily. With venture capital investment in AI companies up by over 70% in Q1 of 2025, there are no signs that this is going to taper off. Further investment will intensify competition among AI companies, resulting in enhanced functionality and lower costs.

While businesses benefit from rising efficiency, risks loom. 80% of employees using generative AI for work purposes are introducing their preferred AI tools into the workplace themselves. This means most are relying on openly accessible third-party models and tools like ChatGPT. IT teams may not be aware of which tools are being used, how they are being used, or the risks at play.

The dangers  

Third-party AI models pose many risks. Take intellectual property, for example. By using external models that may have been trained on unlicensed data, companies could expose themselves to legal challenges.  

Reliance on the biggest players in generative AI is another factor to consider. If a company’s IT structure wholly relies on external AI software, there is a chance it could be forced to accept whatever pricing is offered.   

In the same vein, the business will need to reconfigure its structure when these AI providers change their APIs, and it will always be dependent on the quality of their incident response and technical support. 

Whilst the above risks are significant, two particular dangers warrant deeper investigation: the leaking of sensitive data to open third-party AI models, and the infiltration of internal networks with inaccurate or malicious data from these models.

Internal Data Leaks: A hidden cost of AI adoption 

When AI is used in professional environments, there is a risk of internal data leakage. This could include private client information.  

Many open AI models train on the information users provide. In effect, data submitted by one user can later resurface in a response to another user’s question.

Thus, sensitive internal data used in prompts for these models could end up being shared publicly.  
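As a purely illustrative sketch (the client identifiers and redaction rules here are hypothetical, not Oxylabs practice), a simple pre-prompt filter shows the kind of guardrail that keeps obvious identifiers out of text sent to an external model:

```python
import re

# Hypothetical patterns for data that should never leave the network.
# A real deployment would rely on a proper DLP or PII-detection service.
REDACTION_PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "client_id": re.compile(r"\bCL-\d{6}\b"),  # assumed internal ID format
}

def redact(prompt: str) -> str:
    """Replace sensitive substrings before a prompt is sent to a third-party model."""
    for label, pattern in REDACTION_PATTERNS.items():
        prompt = pattern.sub(f"[{label.upper()} REDACTED]", prompt)
    return prompt

if __name__ == "__main__":
    raw = "Summarise the complaint from jane.doe@client.com (account CL-204518)."
    print(redact(raw))
    # -> "Summarise the complaint from [EMAIL REDACTED] (account [CLIENT_ID REDACTED])."
```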

Moreover, attackers are using a technique called model inversion to retrieve sensitive information by reverse engineering AI models. By probing a model with crafted input patterns and analyzing its responses, they can work out what data it was trained on and reconstruct portions of the original training data.

Third-party models and polluting data 

The damage third-party models can do to a business’s own data streams can prove significant. The first form of this risk is a fairly obvious one: poor-quality or inaccurate data. By some estimates, more than 80% of AI projects fail due to the tools used, twice the failure rate of information technology projects that do not involve AI.

AI hallucination – the invention of incorrect or fake data – is a continual risk and is believed to be an inherent part of how large language models work. So, if employees rely on poor-quality third-party models and fail to double-check their output, inaccurate or fabricated data may be incorporated into datasets and models within a company’s network. This can lead to incorrect outputs and, in turn, poor-quality decision-making.
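As a rough sketch of the “double-check before it enters your data” principle (the field names and business rule are assumptions, not a real schema), model-generated records can be validated against basic expectations before being merged into internal datasets:

```python
def validate_record(record: dict) -> bool:
    """Reject model-generated records that fail basic sanity checks."""
    required = {"customer_id", "churn_risk"}
    if not required.issubset(record):
        return False
    # Assumed business rule: churn_risk must be a probability in [0, 1].
    risk = record.get("churn_risk")
    if not isinstance(risk, (int, float)) or not 0.0 <= risk <= 1.0:
        return False
    return True

# Example: a hallucinated value (a "probability" of 7.4) is filtered out before ingestion.
model_output = [
    {"customer_id": "A-102", "churn_risk": 0.31},
    {"customer_id": "A-103", "churn_risk": 7.4},
]
clean = [r for r in model_output if validate_record(r)]
print(f"Kept {len(clean)} of {len(model_output)} generated records")
```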

Malicious or manipulated data is yet another risk businesses face. ‘AI jacking’ is a form of AI supply chain attack in which bad actors register models or datasets on AI platforms such as Hugging Face and then insert dangerous data into them.

Hackers exploit renamed models and datasets on AI platforms, frequently swapping in their own corrupted versions. When an organization unknowingly relies on these altered assets, its projects and algorithms end up pulling malicious data controlled by the attacker instead.
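One common defence is to pin downloads to a known snapshot rather than to whatever currently sits behind a repository name. The sketch below uses the Hugging Face `huggingface_hub` library; the repository name and commit hash are placeholders for illustration only:

```python
from huggingface_hub import snapshot_download

# Hypothetical repository and commit hash, for illustration only.
# Pinning a revision means the download resolves to a known snapshot of the
# model files, not "whatever is currently published under this name".
MODEL_REPO = "example-org/sentiment-model"
PINNED_REVISION = "9c1f0a2d3b4e5f60718293a4b5c6d7e8f90a1b2c"  # assumed commit SHA

local_path = snapshot_download(
    repo_id=MODEL_REPO,
    revision=PINNED_REVISION,
)

print(f"Model files downloaded to {local_path}")
```

Pairing pinned revisions with an internal mirror or checksum verification narrows the window in which a renamed or swapped repository can slip malicious files into a build.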

Businesses also face ‘backdoor attacks’, in which hidden vulnerabilities lie dormant inside a model, potentially exposing personal data. These are hard to detect because the model continues to behave normally until the backdoor is triggered, usually at the attacker’s instigation.

Another hurdle to be aware of is ‘data poisoning’ – a way to corrupt AI behaviour and elicit specific outputs. It involves injecting corrupted or malicious data into a model’s training datasets, for example by mislabeling specific portions of a dataset or planting specially crafted data samples.
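As a simplified, self-contained illustration of the mislabeling variant (the toy dataset below is synthetic), flipping even a small fraction of labels before training is enough to skew what a model learns:

```python
import random

# Synthetic toy dataset: (text, label) pairs, purely for illustration.
dataset = [("invoice overdue", "finance"), ("server outage", "it")] * 50

def poison_labels(data, fraction=0.1, seed=0):
    """Flip a small fraction of labels to simulate a poisoning attack."""
    rng = random.Random(seed)
    poisoned = list(data)
    flip = {"finance": "it", "it": "finance"}
    for i in rng.sample(range(len(poisoned)), int(len(poisoned) * fraction)):
        text, label = poisoned[i]
        poisoned[i] = (text, flip[label])
    return poisoned

poisoned = poison_labels(dataset)
flipped = sum(1 for a, b in zip(dataset, poisoned) if a != b)
print(f"{flipped} of {len(dataset)} labels silently altered")
```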

A dual mindset: Employee education and implementing internal tools  

There are two clear and actionable steps businesses should take to mitigate the risks of external AI on company processes.  

Firstly, senior leadership teams and company directors must ensure that all employees use corporate AI tools for work-related tasks. Because these tools are confined within strict network boundaries rather than being widely accessible, sensitive internal data stays in-house and the potential for malign data to enter the network is limited.

Moreover, corporate-specific generative AI tools will be delivered by an AI provider that secures the back-end. However, companies still need to ensure good data governance, as they remain responsible for the front-end – handling data and overseeing how the AI tool is used. 

Companies need to remain perceptive to employee behaviour when implementing generative AI. Employees are likely to form habits quickly, developing preferences for particular tools and ways of working. As this happens, education must keep pace, supporting employees and fostering the safe adoption of generative AI.

AI providers can assist in this process, as they are likely to build controls into corporate AI systems. Businesses can then train their employees on how to use these controls and ensure they are applied consistently.

Similarly, companies should consider deploying internet moderation systems that redirect employees away from third-party gen AI platforms, instead encouraging them to interact with internal tools. 
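A minimal sketch of that moderation idea, assuming a proxy or browser hook that can inspect outbound hostnames (the domain list and internal redirect target are placeholders, not recommendations):

```python
# Placeholder list of public generative AI domains to intercept.
BLOCKED_GENAI_DOMAINS = {"chat.openai.com", "gemini.google.com", "claude.ai"}

# Placeholder address of the company's internal AI assistant.
INTERNAL_AI_URL = "https://ai.internal.example.com"

def route_request(hostname: str) -> str:
    """Redirect requests for public gen-AI tools to the internal alternative."""
    if hostname.lower() in BLOCKED_GENAI_DOMAINS:
        return INTERNAL_AI_URL
    return f"https://{hostname}"

print(route_request("chat.openai.com"))  # -> https://ai.internal.example.com
print(route_request("example.com"))      # -> https://example.com
```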

Moreover, education is the key to mitigating risk and fostering informed AI use. Realistically, many of the risks associated with third-party AI models are rarely discussed, and we cannot expect employees to be fully aware of them.

Businesses must explain the faults and limitations of external AI in depth to all employees, providing real examples of how these dangers could impact business processes. Once these risks have been highlighted, employees will likely be more aware and more willing to use in-house AI systems.
