
External AI, its underlying risks, and how businesses can prepare

By Vaidotas Sedys, Head of Risk, Oxylabs

As AI’s presence in business operations continues to grow, many companies remain unaware of the risks of using open third-party AI models. The most obvious risk is data leakage, whereby private internal data is fed into a public AI model. Another is the corruption of systems and datasets by malicious data flowing into a company’s network.

Failing to address these risks leaves companies exposed to financial losses and regulatory penalties, with a likelihood of reputational damage and wavering customer loyalty. By following a clear two-part internal AI strategy, businesses can utilize generative AI while remaining cautious and protecting the company’s data and reputation.

The rising wave of AI use by employees

75% of workers are using gen AI tools daily. With venture capital investment into AI companies rising by over 70% in Q1 2025, there are no signs of this tapering off. Further investment will intensify competition among AI companies, resulting in enhanced functionality and lower costs.

While businesses benefit from rising efficiency, risks loom. 80% of employees using gen AI for work purposes have introduced their preferred AI tool into their work themselves. This means most are relying on publicly available third-party models and tools like ChatGPT. IT teams may not be aware of which tools are being used, how they are used, or the risks at play.

The dangers

Third-party AI models pose many risks. Take intellectual property, for example. By using external models that may have been trained on unlicensed data, companies could expose themselves to legal challenges.

Reliance on the biggest players in generative AI is another factor to consider. If a company’s IT structure wholly relies on external AI software, there is a chance it could be forced to accept whatever pricing is offered.

In the same vein, the business will need to reconfigure its systems whenever these AI providers change their APIs, and it will always be dependent on the quality of their incident response and technical support.

Whilst the above risks are significant, two particular dangers warrant deeper investigation: the leaking of sensitive data to public AI models, and the infiltration of internal networks with inaccurate or malicious data from these models.

Internal data leaks: A hidden cost of AI adoption

When AI is used in professional environments, there is a risk of internal data leakage. This could include private client information.

Many public AI models train on the information provided by users. Effectively, data submitted by one user can later surface in a response given to another user’s question.

Thus, sensitive internal data used in prompts for these models could end up being shared publicly.
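
One practical safeguard is to scrub obvious secrets from prompts before they ever leave the network. The snippet below is a minimal, hypothetical sketch in Python: the patterns, the redact helper, and the example prompt are illustrative assumptions rather than an exhaustive filter, and a real deployment would need far broader rules.

```python
import re

# Illustrative patterns only -- a production filter would also cover client
# names, contract IDs, internal hostnames, and other organisation-specific data.
PATTERNS = {
    "EMAIL":   re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "API_KEY": re.compile(r"\b(?:sk|key)-[A-Za-z0-9]{16,}\b"),
    "CARD":    re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}

def redact(prompt: str) -> str:
    """Replace anything that looks like sensitive data with a placeholder
    before the prompt is forwarded to an external model."""
    for label, pattern in PATTERNS.items():
        prompt = pattern.sub(f"[REDACTED {label}]", prompt)
    return prompt

print(redact("Summarise the complaint from jane.doe@client.com, card 4111 1111 1111 1111."))
# -> Summarise the complaint from [REDACTED EMAIL], card [REDACTED CARD].
```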

Moreover, hackers are using a technique called model inversion to retrieve sensitive information by reverse-engineering AI models. Attackers probe a model with crafted inputs to work out what data it was trained on; by analyzing the model’s responses, they can reconstruct portions of the original training data.
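
The principle can be illustrated with a deliberately simplified sketch. The toy below is white-box (real attacks typically work black-box, purely from a model’s responses): after training a small classifier on a public digits dataset, gradient ascent on the input recovers a rough template of the training examples for one class. The dataset, target class, and step size are arbitrary illustrative choices.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression

# Train a small classifier -- this stands in for the "victim" model.
X, y = load_digits(return_X_y=True)
X = X / 16.0                               # scale pixel values to [0, 1]
model = LogisticRegression(max_iter=2000).fit(X, y)

# Invert the model: start from a blank input and climb the gradient of the
# score for one class, so the input drifts toward that class's training pattern.
target = 3
w = model.coef_[target]                    # gradient of the class score w.r.t. the input
x = np.zeros(X.shape[1])
for _ in range(200):
    x = np.clip(x + 0.05 * w, 0.0, 1.0)

# 'x' now highlights the pixel pattern the model associates with the digit 3,
# a rough reconstruction of training data the attacker never saw directly.
print(np.round(x.reshape(8, 8), 1))
```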

Third-party models and polluting data

The damage third-party models can do to a business’s own data stream can prove highly significant. The first form of this risk is a fairly obvious one: poor-quality or inaccurate data. By some estimates, more than 80% of AI projects fail due to the tools used, twice the failure rate of information technology projects that do not involve AI.

AI hallucination, the invention of incorrect or fake data, is a continual risk and is believed to be an inevitable part of the way large language models work. If employees rely on poor-quality third-party models and fail to double-check their output, inaccurate or fake data can be incorporated into datasets and models within a company’s network. This can lead to incorrect outputs and thus poor-quality decision-making.

Malicious or manipulated data is yet another risk businesses face. ‘AI jacking’ is a form of AI supply chain attack in which bad actors register models or datasets on AI platforms such as Hugging Face and then insert dangerous data into them.

Hackers exploit renamed models and datasets on these platforms, frequently swapping in their own corrupted versions. When an organization unknowingly relies on the altered assets, its projects and pipelines end up pulling malicious data controlled by the attacker instead.
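
A simple defence is to never pull an asset by name alone. Assuming the Hugging Face transformers library is in use, the sketch below pins a model to a specific commit hash; the repository name and revision here are placeholders to be replaced with values recorded when the asset was vetted.

```python
from transformers import AutoModel, AutoTokenizer

# Placeholder repository and commit hash -- substitute your own vetted values.
REPO = "example-org/sentiment-model"
PINNED_REVISION = "replace-with-full-commit-sha"

# Loading by explicit revision means a renamed or silently re-uploaded
# repository cannot change which weights the pipeline actually downloads.
tokenizer = AutoTokenizer.from_pretrained(REPO, revision=PINNED_REVISION)
model = AutoModel.from_pretrained(REPO, revision=PINNED_REVISION)
```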

Businesses also risk ‘backdoor attacks’, where hidden vulnerabilities lie dormant until triggered, potentially exposing personal data. The danger is that the model continues to behave normally until the trigger, usually planted by the attacker, is activated.

Another hurdle to be aware of is ‘data poisoning’, a way to corrupt AI behaviour and elicit specific outputs. This involves injecting corrupted or malicious data into a model’s training datasets, for example by mislabeling specific portions of a dataset or planting specially crafted data samples.
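
How little poison it can take is easy to demonstrate with a toy experiment; the dataset, the model, and the 20% flip rate below are arbitrary choices made purely for illustration.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Toy setup: a public dataset stands in for an externally sourced training set.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
scaler = StandardScaler().fit(X_train)
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

# Baseline: a model trained on clean labels.
clean = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Poisoned copy: flip the labels on 20% of the training rows, mimicking an
# attacker mislabeling part of a shared dataset.
rng = np.random.default_rng(0)
poisoned = y_train.copy()
flip = rng.choice(len(poisoned), size=len(poisoned) // 5, replace=False)
poisoned[flip] = 1 - poisoned[flip]
dirty = LogisticRegression(max_iter=1000).fit(X_train, poisoned)

# The poisoned model typically scores lower on the same held-out test set.
print("clean accuracy:   ", round(clean.score(X_test, y_test), 3))
print("poisoned accuracy:", round(dirty.score(X_test, y_test), 3))
```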

A dual mindset: Employee education and implementing internal tools

There are two clear and actionable steps businesses should take to mitigate the risks of external AI on company processes.

Firstly, senior leadership teams and company directors must ensure that all employees use corporate AI tools for work-related tasks. This is largely because these tools are confined within strict network boundaries (rather than being widely accessible). This ensures sensitive internal data stays in-house and limits the potential for malign data to enter a network.

Moreover, corporate-specific generative AI tools will be delivered by an AI provider that secures the back end. However, companies still need to ensure good data governance, as they remain responsible for the front end: handling data and overseeing how the AI tool is used.

Companies need to stay attentive to employee behaviour when implementing generative AI. Employees are likely to form habits quickly, developing preferences for particular tools and ways of working. As this happens, ongoing education can support employees and foster safe adoption of gen AI.

AI providers can assist in this process, as they are likely to build controls into corporate AI systems. Businesses can then train their employees on how to use such controls and ensure they are applied consistently.

Similarly, companies should consider deploying internet moderation systems that redirect employees away from third-party gen AI platforms, instead encouraging them to interact with internal tools.

Moreover, education is key to mitigating risk and fostering informed AI use. Realistically, many risks associated with third-party AI models are rarely discussed, and we cannot expect employees to be wholly aware of them.

Businesses must explain the faults and limitations of external AI in depth to all employees, providing real examples of how these dangers could impact business processes. Once these risks have been highlighted, employees will likely be more aware and more willing to use in-house AI systems.
