Generative AI has presented financial firms with the opportunity to revolutionize surveillance workflows.
When it comes to open-source Large Language Models (LLMs), one of the first questions we’re often asked – besides “How do they work?” – is “How are they trained?” AI’s abilities are unprecedented; however, its complexity raises concerns from both the industry and regulators – namely around explainability, operability, data security, and transparency.
Our State of AI in Surveillance 2025 Report found that 38% of firms are watching the surveillance space to determine if they will adopt AI in surveillance. Compared to our Compliant Communications 2024 Report conducted a year prior, which found that 57% of firms did not plan to implement AI, there has been a 19-percentage-point reduction in the number of firms reluctant to adopt generative models.
Financial services seem to be warming up to the idea of AI within surveillance, but it’s clear that there are still questions about how generative technology works to enhance risk monitoring and detection. We address these questions to elucidate compliance concerns, and to explain how open-source LLMs level up surveillance compared to lexicon-based and industry-specific models.
Busting the top 5 AI model myths
1. Open-source LLMs are not accurate since they don’t need to be trained to detect financial misconduct within business communications.
MYTH!
LLMs are 200 to 300 times larger than deep learning, industry-specific models and are trained on data drawn from across the internet. As a result, they do not require additional training in the same way that small-scale, industry-specific models do.
As Global Relay’s AI Product Director Yifan Xia explained in our State of AI in Surveillance Report, LLMs come with the built-in knowledge needed to capture risk, while industry-specific models require building before they’re ready to use:
“A deep learning [industry-specific] foundation model is like buying a car chassis, which you would have to assemble before you can drive it. You need to put the engine, seats, and wheels on the car. With an open LLM, you can buy a car from the dealer that’s ready to drive directly.”
Open-source LLMs are already equipped with all the reference points needed to recognize financial risks. They only need to be aligned with main risk categories and indicators specific to a firm’s risk profile to capture relevant information.
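Alignment here is largely a configuration exercise: the firm supplies its risk taxonomy, and the model is steered toward it. A minimal sketch of what such a mapping might look like – the category names and indicator phrases below are illustrative assumptions, not a real taxonomy:

```python
# Hypothetical risk taxonomy a firm might supply to align an open-source LLM
# with its own risk profile. Categories and indicators are illustrative only.
RISK_TAXONOMY = {
    "market_abuse": [
        "trading ahead of a public announcement",
        "coordinating orders to move a price",
    ],
    "information_sharing": [
        "disclosing client positions to a third party",
        "passing non-public deal terms outside the wall",
    ],
}

def taxonomy_to_instructions(taxonomy: dict[str, list[str]]) -> str:
    """Render the taxonomy as plain-text instructions for a model prompt."""
    lines = []
    for category, indicators in taxonomy.items():
        lines.append(f"Category: {category}")
        lines.extend(f"  - indicator: {ind}" for ind in indicators)
    return "\n".join(lines)

print(taxonomy_to_instructions(RISK_TAXONOMY))
```

The rendered text can then be placed in the model’s system prompt, so the same pre-trained LLM is focused on the firm’s own risk categories without any retraining.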
2. Open-source LLMs aren’t as easily able to detect financial misconduct since they are not industry-specific.
MYTH!
Compared to industry-specific models, which are heavily trained with information relating to a specific domain – such as finance – open-source LLMs are not trained against one dataset. Therefore, it is commonly thought that LLMs are not as adept at recognizing financial risks as industry-specific models.
On the contrary, open-source LLMs are more accurately able to identify risk because they are not restricted to a small subset of information. Due to their size, open-source LLMs have a vast range of reference points to pull from when analyzing conversations. A wider scope of knowledge reduces bias and tunnel vision, enabling LLMs to draw more well-informed conclusions.
Since LLMs are trained on data from across the internet, they develop a deep understanding of multiple domains. This understanding makes transfer learning possible, where knowledge in one domain can be applied to another to enhance performance.
3. LLMs cannot provide comprehensive risk justification when flagging communications.
MYTH!
LLMs analyze entire messages to classify risk. If you tell an LLM to think like a compliance officer when evaluating a conversation to detect if market abuse is present, it will make informed decisions by extracting context and key phrases, testing against risk categories, understanding implications, and assessing sentiment.
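In practice, that instruction is typically delivered as a prompt. The sketch below mirrors the evaluation steps described above as a hypothetical prompt builder – the role framing and rubric wording are illustrative assumptions, and the actual model call is omitted:

```python
# Hypothetical prompt builder mirroring the evaluation steps described above.
# The wording is an illustrative assumption, not a production prompt.
EVALUATION_STEPS = [
    "Extract the context and key phrases from the conversation.",
    "Test the conversation against each defined risk category.",
    "Consider the implications of what is being discussed.",
    "Assess the overall sentiment and intent of the participants.",
]

def build_review_prompt(conversation: str, risk_categories: list[str]) -> str:
    """Assemble a compliance-officer-style review prompt for an LLM."""
    steps = "\n".join(f"{i}. {s}" for i, s in enumerate(EVALUATION_STEPS, 1))
    cats = ", ".join(risk_categories)
    return (
        "You are acting as a compliance officer.\n"
        f"Risk categories in scope: {cats}\n"
        f"Follow these steps:\n{steps}\n"
        "Then state whether the conversation should be flagged, and justify "
        "your decision with reference to the steps above.\n\n"
        f"Conversation:\n{conversation}"
    )

prompt = build_review_prompt(
    "Trader A: let's hold the order until after the call.",
    ["market abuse", "information sharing"],
)
print(prompt)
```

Because the prompt asks for a justified decision rather than a bare match, the model’s output arrives with the reasoning a reviewer needs to trace the alert.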
This justification means that LLMs can more effectively present reviewers with risks that have been vetted. By contrast, both industry-specific and lexicon-based models must be fed information to define risk areas. Unlike open-source LLMs, they are not analyzing conversations but flagging prescribed keywords and phrases for compliance teams to review.
A persistent struggle that firms run into when using lexicon-based systems is the high volume of false positives. Teams may be able to more easily trace risk alerts with lexicon banks, but conversations are falsely flagged because the net is cast wide.
At the same time, any risks that are not explicitly prescribed within keyword lists can go undetected, such as when words are spelled incorrectly or bad actors speak in code to discuss misconduct.
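This trade-off is easy to demonstrate. A toy lexicon matcher – the keyword list is invented for illustration, and real lexicon banks are far larger – flags an innocent message that happens to contain a listed word, while missing a deliberately misspelled variant of another:

```python
# Toy lexicon-based matcher. The keyword list is invented for illustration;
# production lexicon banks contain far more terms, widening the net further.
LEXICON = {"insider", "guarantee", "off the record"}

def lexicon_flag(message: str) -> bool:
    """Flag a message if any lexicon term appears as a substring."""
    text = message.lower()
    return any(term in text for term in LEXICON)

# False positive: an innocent message containing a listed keyword.
false_positive = lexicon_flag("I can't guarantee I'll make the 3pm meeting.")

# Missed risk: a misspelled, coded variant slips past the exact-match list.
missed = lexicon_flag("Keep this off the rec0rd, we move before the news.")

print(false_positive, missed)  # the innocent message is flagged; the coded one is not
```

A model that reads the whole conversation for meaning, rather than matching strings, can avoid both failure modes.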
Our State of AI in Surveillance Report found that reducing false positives and improving risk identification were the top two use cases teams hope to achieve by utilizing AI in surveillance processes. The spotlight is on surveillance more than ever before, and it’s clear that teams are beginning to see AI’s abilities to remedy long-existing pain points.
4. LLMs pose a risk to data security and compliance by covertly storing client communications.
MYTH!
Data security is a key concern for financial firms, as it’s imperative to protect sensitive client information and maintain market integrity. Our State of AI in Surveillance Report found that data security was the main barrier for firms hoping to adopt AI into their surveillance workflows.
While LLMs require large volumes of data to accurately analyze conversations, they are trained using synthetic data that is modeled after real-world risk cases. This ensures that client data is never compromised.
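Synthetic training data of this kind can be produced from templates that imitate the shape of real risk cases without containing any real identifiers. A minimal sketch, in which every template, name, and ticker is an invented placeholder:

```python
import random

# Hypothetical templates modeled on the *shape* of risky conversations.
# All names and tickers are invented placeholders; no real client data is used.
TEMPLATES = [
    "Keep this between us, {name}: buy {ticker} before the announcement.",
    "{name}, delete this after reading. The {ticker} deal closes Friday.",
]
FAKE_NAMES = ["Alex", "Jordan", "Sam"]
FAKE_TICKERS = ["ABCD", "WXYZ"]

def synthetic_example(rng: random.Random) -> str:
    """Fill a template with randomly chosen placeholder values."""
    template = rng.choice(TEMPLATES)
    return template.format(name=rng.choice(FAKE_NAMES),
                           ticker=rng.choice(FAKE_TICKERS))

rng = random.Random(0)  # seeded for reproducibility
examples = [synthetic_example(rng) for _ in range(3)]
print(examples)
```

The resulting messages carry the linguistic patterns a model needs to learn from, while the sensitive specifics of any real case never enter the training set.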
5. There is not enough transparency around how AI models are trained and tested within surveillance.
MYTH!
Our State of AI in Surveillance Report found that AI governance frameworks are a top priority for teams when utilizing surveillance solutions. On a scale of 1 to 10, with 1 being not important and 10 being very important, respondents’ average score on the importance of third-party governance frameworks was 9.4.
Hannah Bowery, senior manager and surveillance and market abuse SME at PwC, stated that firms should be evaluating vendor documentation when choosing a solution:
“The key is having the vendor documentation and vendor information up front when you’re choosing to implement these tools. That’s the key.”
Open-source LLMs are released by major developers like Meta or OpenAI, who document how the LLM is built. Once these models are implemented within our data centers, we explain how we train, manage, and validate them to align with global AI governance regulations. Our AI governance framework also includes information about how data is documented for full clarity.
With regulators cracking down on third-party oversight and transparency, such as the Financial Industry Regulatory Authority in its 2025 Annual Oversight Report and the Securities and Exchange Commission in its 2025 Examination Priorities, firms must ensure they’re enlisting the help of vendors who can provide comprehensive details around how their processes work.