What role do generative AI techniques play in accelerating the development of multimodal ecosystems?

By Dr Baqar Rizvi, Data Scientist at Aquis Exchange

Multimodal systems embed diverse data types in a shared framework that enables more comprehensive, context-aware decision-making. For financial market surveillance, which requires the oversight of millions of daily trades, a multimodal ecosystem is essential for combining trade data with the contextual information that allows us to identify manipulation or anomalies.

In recent years, stock exchanges and market makers have increasingly focused on developing proprietary surveillance tools that use machine learning to provide the industry with comprehensive monitoring and anomaly detection capabilities. The University of Derby has played a pivotal role in this. These tools require a multimodal ecosystem to assemble the contextual pieces of information that may indicate market abuse, and generative AI techniques have contributed significantly to the development of an ecosystem suitable for rapid and comprehensive detection.

Generative AI techniques play a transformative role in accelerating the development of multimodal ecosystems by enabling seamless integration of diverse data types, improving interpretability, and automating complex analytical tasks. Multimodal ecosystems involve multiple data streams such as text (NLP, LLMs), time-series (linear/non-linear trends), images, audio, and structured tabular data.

Unified representation learning leverages the power of LLMs like GPT-4, Gemini, and Claude to seamlessly integrate structured (e.g., financial metrics, time-series stock prices) and unstructured (e.g., earnings call transcripts, social media sentiment) data into a shared embedding space. These models use advanced vector representations to encode different data modalities, ensuring that numerical and textual information can be interpreted in a consistent manner.

For instance, a financial report containing revenue figures and qualitative management commentary can be processed together, allowing the model to detect correlations between financial performance and sentiment-driven market reactions. This unified approach is particularly valuable in stock market prediction, risk assessment, and fraud detection, as it allows AI to weigh quantitative indicators such as P/E ratios and volatility metrics alongside qualitative cues before making inferences. Textual and tabular data (e.g., financial reports, news sentiment, and time-series market data) can thus be processed together in a single framework.
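As a toy illustration of the shared-embedding idea, the sketch below maps a snippet of commentary and a handful of numeric indicators into vectors of the same dimension and fuses them. Everything here is an assumption made for illustration: the function names, the dimension, the hashing-trick text encoder, and the fixed random projection, which stands in for the learned projections a real model would use.

```python
import hashlib
import math
import random

DIM = 8  # size of the shared space (arbitrary for this sketch)


def l2_normalise(vec):
    """Scale a vector to unit length; an all-zero vector is returned unchanged."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def embed_text(text, dim=DIM):
    """Hashing-trick bag of words: each token increments one of `dim` buckets."""
    vec = [0.0] * dim
    for token in text.lower().split():
        bucket = int(hashlib.md5(token.encode()).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    return l2_normalise(vec)


def embed_numeric(features, dim=DIM, seed=0):
    """Project pre-scaled numeric features into `dim` dimensions.

    The weights are fixed random numbers, a stand-in for a learned projection.
    """
    rng = random.Random(seed)
    weights = [[rng.gauss(0.0, 1.0) for _ in features] for _ in range(dim)]
    vec = [sum(w * x for w, x in zip(row, features)) for row in weights]
    return l2_normalise(vec)


def fuse(text_vec, num_vec):
    """Average the two unit vectors into one joint representation."""
    return l2_normalise([(a + b) / 2 for a, b in zip(text_vec, num_vec)])


# Hypothetical inputs: management commentary plus three standardised indicators
# (for example P/E ratio, volatility, and revenue growth as z-scores).
commentary_vec = embed_text("revenue grew strongly but margins remain under pressure")
indicator_vec = embed_numeric([0.5, -1.2, 0.3])
joint_vec = fuse(commentary_vec, indicator_vec)
```

Because both modalities end up as vectors of the same length, a downstream model can consume `joint_vec` without needing to know which part came from text and which from numbers.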

Techniques such as counterfactual generative models help assess how interventions, such as policy changes and interest rate hikes, impact time-series trends. We can use such models to augment training data by establishing context between financial reports, news headlines, and market commentary, improving the training of sentiment analysis or fraud detection models. Such models can also be used to create synthetic data sets of all the above types for model training.
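The counterfactual idea can be sketched with a deliberately simple simulator. This is not a generative model: it is a hand-written AR(1) process for daily returns, and the parameters (`phi`, `sigma`, the shock size, and the shock day) are hypothetical. What it does show is the core mechanic: hold the random noise fixed, change only the intervention, and read off the difference.

```python
import random


def simulate_returns(n_days, shock_day=None, shock_size=0.0,
                     phi=0.3, sigma=0.01, seed=42):
    """Simulate daily returns from an AR(1) process with an optional one-day shock.

    Using the same seed for two runs fixes the noise, so any difference
    between the runs is caused by the intervention alone.
    """
    rng = random.Random(seed)
    returns, prev = [], 0.0
    for day in range(n_days):
        noise = rng.gauss(0.0, sigma)
        shock = shock_size if day == shock_day else 0.0
        prev = phi * prev + noise + shock
        returns.append(prev)
    return returns


# Baseline world: no intervention.
baseline = simulate_returns(30)
# Counterfactual world: identical noise, plus a hypothetical rate-hike shock on day 10.
with_hike = simulate_returns(30, shock_day=10, shock_size=-0.03)

# Day-by-day effect of the intervention; zero before the shock, decaying after it.
effect = [c - b for c, b in zip(with_hike, baseline)]
```

The same pairing of baseline and intervened series could serve as synthetic training examples, with the caveat that a hand-tuned process like this one captures far less structure than a model fitted to real market data.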

Multimodal AI models like CLIP and GPT-4V play a crucial role in integrating diverse financial data types, such as charts, technical indicators, and earnings transcripts, to provide a more holistic analysis of market behaviour. Traditional financial models often rely on structured numerical data, but multimodal AI enables richer insights by combining visual, textual, and quantitative signals.

Cross-attention mechanisms in multimodal AI architectures significantly improve market analysis by allowing models to focus on meaningful interactions between different data modalities. These mechanisms dynamically allocate attention to relevant input features, helping models understand how financial news impacts stock price fluctuations. For instance, if a central bank announces an unexpected interest rate hike, cross-attention mechanisms can correlate this event with historical market reactions, ensuring the model prioritises this news when analysing stock volatility. This capability extends to real-time event tracking, where AI can weigh the influence of different data sources, such as news sentiment, earnings reports, and historical price movements.
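A minimal sketch of scaled dot-product cross-attention may make this concrete. Queries come from one modality (here, a price-window feature vector) and keys and values from another (here, news items). The vectors are made up for illustration, and the learned query, key, and value projection matrices used in real architectures are omitted.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V.

    Returns the attended outputs and the attention weights per query.
    """
    d = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        out = [sum(w * v[i] for w, v in zip(weights, values))
               for i in range(len(values[0]))]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


# Hypothetical 2-d features: one query for a volatile price window,
# and three news items (rate hike, product launch, general commentary).
price_queries = [[1.0, 0.0]]
news_keys = [[4.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
news_values = [[-0.8, 0.9], [0.3, 0.1], [0.0, 0.2]]

attended, weights = cross_attention(price_queries, news_keys, news_values)
# The first news item (the rate hike) aligns most closely with the query,
# so it receives the largest share of attention.
```

The attention weights are what make the mechanism inspectable: for each price window they show which news items the model leaned on.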

Generative AI can play a pivotal role in detecting specific forms of market manipulation, such as pump and dump schemes and insider trading, by establishing the relationship between posts (news articles, social media, bulletin boards, etc.) and Aquis’ machine learning analysis.
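One simple way to picture that relationship is a time-window join between surveillance alerts and posts about the same instrument. The sketch below is purely illustrative and does not describe Aquis' system: the field names, the 24-hour window, and the sample records are all assumptions, and it presumes that each post has already been tagged with a ticker (the step where a language model would typically help).

```python
from datetime import datetime, timedelta


def link_posts_to_alerts(alerts, posts, window=timedelta(hours=24)):
    """For each alert, collect posts on the same ticker published within
    `window` before the alert time."""
    linked = {}
    for alert in alerts:
        start = alert["time"] - window
        linked[alert["id"]] = [
            post["id"] for post in posts
            if post["ticker"] == alert["ticker"]
            and start <= post["time"] <= alert["time"]
        ]
    return linked


# Hypothetical records.
alerts = [{"id": "A1", "ticker": "XYZ", "time": datetime(2024, 3, 5, 14, 0)}]
posts = [
    {"id": "P1", "ticker": "XYZ", "time": datetime(2024, 3, 5, 9, 30)},  # inside window
    {"id": "P2", "ticker": "XYZ", "time": datetime(2024, 3, 1, 9, 0)},   # too early
    {"id": "P3", "ticker": "ABC", "time": datetime(2024, 3, 5, 10, 0)},  # other ticker
    {"id": "P4", "ticker": "XYZ", "time": datetime(2024, 3, 5, 15, 0)},  # after the alert
]

linked = link_posts_to_alerts(alerts, posts)
```

An analyst reviewing alert `A1` would then see only the posts that could plausibly have preceded and influenced the flagged trading activity.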

Without generative AI techniques, the development of market-manipulation detection models based on the dendritic cell algorithm (DCA) would have taken significantly longer and would have lacked the breadth of contextual information and relevant data collection that these techniques have allowed the industry to integrate. Further information on the DCA detection model has been published in Intelligent Systems Design and Applications and is available here.

Researchers at the University of Derby are continuing to develop cutting-edge machine-learning techniques to aid in market surveillance. Their upcoming work on contextual market manipulation studies the impact of overlapping data patterns, both normal and anomalous, and how they affect detection accuracy and lead to false positives. It also proposes a detection model capable of distinguishing the two, irrespective of observed behaviour, with a significant improvement in detection rates.

Generative AI techniques have been significantly helpful in collating and contextualising a vast set of varied data to feed into our work on stock market surveillance, and will continue to contribute to our development of a comprehensive multimodal ecosystem within which our anomaly-detection work operates.
