
What is RAG in fintech and how financial services are using it with LLMs to power AI innovation

By John Adam


    To operate, organisations in the financial services sector require hundreds of thousands of documents of rich, contextualised data. And to organise, analyse and then use that data, they are increasingly turning to AI.  

    One AI-driven approach to unifying, understanding, structuring and then accessing stores of internal data is the LLM (Large Language Model). LLMs can be trained on vast amounts of information—up to 300 trillion data points—but are nonetheless limited to
    the data they are trained on and, if made available to the model, information sourced online.  

    This can be an issue for financial organisations, whose processes often need to take constantly changing data and documentation, like regulatory changes, into account. An LLM cannot access new documents or fresh data without either modifying its parameters through fine-tuning or retraining it outright, both of which are labor-intensive and can take months, leaving results outdated or contextually incorrect in the meantime. 

    The solution to this limitation is Retrieval Augmented Generation, or RAG. Fintechs use RAG extensions with underlying LLMs to add and contextualise new documentation, data, and history.  

    And like all worthwhile projects, it will take time and effort to build, integrate, and get it right. 

    For organisations with limited resources, like small and medium-sized fintechs, the most pressing question is: Which AI tools can move the needle without taking a huge risk? 

    RAG is the straightforward answer here. It provides a clear opportunity to organise and source factual information for AI tools, like LLMs, and provides the data and information architecture required for future AI tools to run well. 

    What is retrieval augmented generation and how RAG models are used in finance 

    RAG offers a workaround to continuously retraining LLMs by directly connecting the system to all approved documents made available within an organisation, and citing them in generated responses. 

    It searches internal document repositories, including the most recent documents that an underlying LLM wasn’t trained on, to find the most up-to-date and contextually relevant information, improving the accuracy of an LLM’s outputs. 
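
    To make that concrete, here is a minimal, runnable sketch of the retrieve-then-generate loop. The embed and generate functions are hypothetical stand-ins for an embedding model and an LLM call, and the word-overlap scoring is a deliberately crude substitute for the vector similarity a real system would use.

```python
# A minimal sketch of the retrieve-then-generate loop, not a production system.
# `embed` and `generate` are hypothetical stand-ins: a real deployment would
# call an embedding model and an LLM API, and rank by vector similarity.
from dataclasses import dataclass

@dataclass
class Document:
    doc_id: str
    text: str

def embed(text: str) -> set[str]:
    # Stand-in for a vector embedding: a bag of lowercase words.
    return set(text.lower().split())

def retrieve(query: str, corpus: list[Document], k: int = 2) -> list[Document]:
    # Rank documents by crude word overlap with the query; real systems
    # rank by vector similarity over an index of the document repository.
    q = embed(query)
    ranked = sorted(corpus, key=lambda d: len(q & embed(d.text)), reverse=True)
    return ranked[:k]

def generate(query: str, context: list[Document]) -> str:
    # Stand-in for the LLM call: retrieved passages are injected into the
    # prompt so the answer is grounded in, and can cite, the sources.
    sources = "; ".join(d.doc_id for d in context)
    return f"Answer to {query!r}, grounded in: {sources}"

corpus = [
    Document("reg-report-2025-q2", "Key points of the latest regulatory report ..."),
    Document("client-interviews-2024", "Summary of client interviews from 2024 ..."),
]
question = "What are the key points in our latest regulatory report?"
print(generate(question, retrieve(question, corpus)))
```

    Swap the stand-ins for a real embedding model, a vector index, and an LLM API and the shape stays the same: retrieval first, then generation grounded in what was retrieved.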

    For example, let’s say your organisation invested in building a private LLM and trained it on internal data at the end of 2024. Without RAG or retraining, it is impossible for the system to access and be updated on documentation from the first two quarters
    of 2025. 

    This is a weakness to overcome for any team exploring how to effectively drive value from large language models in finance – a sector with a high rate of change and a constant influx of fresh documentation. Left unaddressed, this flaw quickly makes responses to the most urgent and time-sensitive queries, which often depend on recent data, useless and obsolete.  

    But by adding a RAG layer, your system is able to access internal and external documents the LLM wasn't trained on (like client interviews, portfolio performance data from this quarter, or regulatory changes), retrieve relevant information, and relay it to the LLM to support generating a contextually accurate result. 

    Prompting a RAG-powered LLM, you could ask “What are the key points in our latest regulatory report?” and get a contextually accurate response, even if the report was written years after the LLM was trained. 

    This flexibility lets financial services organisations build out easily accessible internal knowledge bases for their teams, to be used to service clients or answer internal queries.  

    Last year, Morgan Stanley, for example, worked with OpenAI to build a RAG-driven AI assistant. 98% of its advisory teams use it to search more than 100,000 internal documents and research reports, quickly find specific information, summarise research, and answer client queries.  

    How to leverage RAG in fintech  

    Retrieval augmented generation lets fintechs unify hundreds of thousands of documents and turn their LLMs into a single knowledge hub with factual retrieval across all of their documentation: current and past clients, internal policy, the current market, and more. 

    Previously, the struggle was always putting that information together and organising it in a way that's easy and relatively quick for you and your team to access. RAG provides exactly that: unified access to your organisation's entire history, usable without devoting days or even hours to searching. 

    That information can also be made available to custom agents designed to automate different tasks across business processes, from customer support to research and regulatory processes like KYC.  

    RAG makes AI tools like LLMs immensely powerful – the number of potential applications of the combination is spectacular. 

    There are already successful examples of financial services companies running RAG systems over their own private LLMs, building company-specific models for a particular group of tasks. One example is Bloomberg’s BloombergGPT, a 50-billion parameter model
    trained on financial data that outperforms general models for specific tasks in the finance space. 

    But most, if not all, small and medium-sized fintechs lack the resources required to build and train an LLM of BloombergGPT's caliber. 

    That is why some organisations fine-tune popular general models rather than building their own private LLM. The Enterprise version of ChatGPT, for example, is making its use in fintech increasingly popular: organisations can run RAG over ChatGPT Enterprise to inform responses with proprietary and market data and improve the accuracy of its outputs. 

    One example is JPMorgan's Quest IndexGPT, which uses OpenAI's GPT-4 to broaden the selection of stocks and build stronger thematic portfolios based on live insights from the current market. 

    For organisations that compete directly with enterprise-level businesses but can’t build their own private LLMs, RAG represents an opportunity to amass information and documentation in an organised way as they grow.  

    Well-organised teams are at an advantage. If you can quickly source insights from tightly organised documentation, cite policy, and access client reports, efficiency gains can propel you ahead of competitors. Even more significant are the automations that
    can be powered by proprietary data and documentation held within a RAG. 

    Efficiency gains are multiplied with RAG compared to a general-purpose LLM: insights are contextually relevant and sourced from the most recent documentation, and citations are transparent because the system provides exact sources.  

    As RAG-driven systems become the new norm, organisations that lag behind in adoption risk losing ground to more innovative competitors in the near future. 

    Especially considering that unifying a complete knowledge base is a long-term effort that can take several years, so the sooner the groundwork starts, the better. 

    Benefits of RAG models in fintech 

    RAG improves security and compliance, accuracy, and transparency when used alongside AI tools. 

    Security and compliance 

    Like all AI tools, RAG requires comprehensive governance. Role-based permissions and access controls can be added to prevent sensitive information from being accessed or changed by users who lack the correct permissions. 
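
    As a rough illustration, the sketch below (with invented roles and document labels) shows the key design choice: permissions are applied to the retrieval candidate set itself, not to the finished answer.

```python
# A sketch of role-based access control enforced at retrieval time, so the
# model never sees documents a user isn't cleared for. The roles, labels,
# and documents here are illustrative assumptions, not a specific product.
ACCESS = {
    "analyst": {"research"},
    "compliance": {"research", "kyc", "audit"},
}

DOCS = [
    {"id": "kyc-case-114", "label": "kyc", "text": "..."},
    {"id": "sector-note-7", "label": "research", "text": "..."},
]

def retrievable_for(role: str, docs: list[dict]) -> list[dict]:
    # Filter BEFORE ranking: unpermitted documents never enter the candidate
    # set, rather than being hidden from the final answer after the fact.
    allowed = ACCESS.get(role, set())
    return [d for d in docs if d["label"] in allowed]

print([d["id"] for d in retrievable_for("analyst", DOCS)])  # ['sector-note-7']
```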

    Private, cloud-based LLMs with RAG are less vulnerable to security risks. As RAG is an information retrieval tool, it makes it possible to simply source sensitive information as it’s needed without giving an LLM unfettered access or training it on high-risk
    documentation. 

    But for systems that use a general or Enterprise-version LLM, governance requiring the encryption of all documents in transit and at rest keeps any private information RAG models retrieve secure.  
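
    For the at-rest half of that requirement, a minimal sketch using the widely used `cryptography` package might look like this; the key handling is deliberately simplified and would live in a key-management service in practice.

```python
# A sketch of keeping documents encrypted at rest in the RAG store, using the
# `cryptography` package's Fernet symmetric tokens. Key management is elided;
# a real deployment would fetch keys from a KMS/HSM, never generate them inline.
from cryptography.fernet import Fernet

key = Fernet.generate_key()  # in production: retrieved from a key-management service
vault = Fernet(key)

ciphertext = vault.encrypt(b"Client KYC file: ...")  # what sits in storage
plaintext = vault.decrypt(ciphertext)                # decrypted only at retrieval time
assert plaintext == b"Client KYC file: ..."
```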

    Accuracy

    RAG systems force LLMs to pull information directly from relevant documents to ground responses in fact.  

    Traditional LLMs don't offer this benefit; they're prone to hallucinating when prompted with a question they don't know the answer to. But a RAG system can be configured to simply respond “I’m not sure” if it can't find an answer, and then cite relevant information that may be useful even if it isn't a perfect contextual match. 
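
    One common way to implement that fallback, sketched below with an assumed prompt template and relevance cutoff, is to check retrieval confidence before the LLM is ever called.

```python
# A sketch of the "answer only from retrieved context, otherwise say so"
# behaviour. The prompt template and the 0.5 relevance cutoff are assumptions;
# in practice this is enforced via the system prompt plus a retrieval threshold.
PROMPT = (
    "Answer using ONLY the context below. If the context does not contain "
    "the answer, reply \"I'm not sure\" and list the closest sources.\n\n"
    "Context:\n{context}\n\nQuestion: {question}"
)

def build_answer(question: str, passages: list[tuple[str, float]], cutoff: float = 0.5) -> str:
    # `passages` pairs a source id with a retrieval relevance score.
    relevant = [(src, score) for src, score in passages if score >= cutoff]
    if not relevant:
        closest = ", ".join(src for src, _ in passages[:2])
        return f"I'm not sure. Possibly relevant sources: {closest}"
    context = "\n".join(src for src, _ in relevant)
    return PROMPT.format(context=context, question=question)  # sent to the LLM

print(build_answer("What changed in Q2?", [("reg-digest-2023", 0.21)]))
```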

    An important note here is that, to produce the most accurate responses, AI tools and RAG models should be set up well: fed the specific keywords and phrases an organisation uses, its common processes and preferences, and the internal documentation RAG can use to contextualise responses.  

    Transparency

    The nature of RAG systems provides transparency that is otherwise unavailable in traditional ’black box’ AI tools and LLMs, as they ground findings directly in documentation and link to and cite the source in their responses.  

    This clarity simplifies and speeds up the secondary fact-checking teams might do.  

    RAG can also log all interactions, including prompts, the exact data sourced, and the response it generated, making use and performance auditable. 
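
    A minimal version of such a log, assuming a simple JSONL file as the audit store, could look like this:

```python
# A sketch of an auditable interaction log: each prompt, the exact sources
# retrieved, and the generated response are appended as one JSON line.
import json
import time

def log_interaction(path: str, prompt: str, sources: list[str], response: str) -> None:
    entry = {
        "ts": time.time(),     # when the query ran
        "prompt": prompt,      # what the user asked
        "sources": sources,    # the exact documents the answer drew on
        "response": response,  # what the system returned
    }
    with open(path, "a") as f:
        f.write(json.dumps(entry) + "\n")  # append-only JSONL audit trail

log_interaction("rag_audit.jsonl", "Latest AML thresholds?",
                ["aml-policy-v12"], "The current reporting threshold is ...")
```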

    Each of these benefits makes a sizable impact on the risk and explainability issues that plague many AI tools, and makes RAG models a more persuasive option: compliance can be built in to improve privacy, and transparent interaction logs make AI tools understandable for regulatory institutions and the teams that use them. 

    Fintech applications of RAG 

    RAG systems are able to leverage data and information hidden in complex written documents, charts, graphs, and tables, often split between siloed repositories. This allows them to offer the kind of data-driven insights that would previously have required extensive manual research and analysis.  

    In action, nearly immediate access to every single document in an organisation and a fast analysis of pertinent information make way for a wide variety of use cases. 

    For example, JPMorgan's Quest IndexGPT utilises OpenAI's GPT-4 via API for internal data and use, while keeping the product (the index's intellectual property) owned by the bank. 

    RAG runs on Quest IndexGPT to securely source relevant news, brainstorm keywords (e.g. “renewable energy in emerging markets”), generate a list for the relevant investment theme, and find and suggest stocks and thematic indexes. 

    But young fintechs aren't just treading water alongside deep-pocketed enterprises; innovative scale-ups are unveiling some of the most creative RAG use cases: 

    bunq launched its AI assistant, “Finn”, which runs in a chatbot-like interface in its app and lets users interact with their bank accounts via conversation.  

    Clients can ask questions like “how much money did I spend eating in restaurants in 2024?” or even “what was that Indian restaurant I went to with a friend in London?” and get an immediate, accurate answer based on RAG's retrieval of transaction history and geolocation data.  
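
    Without knowing bunq's internals, a simplified sketch of the grounding step might look like this: the question is translated into a structured filter over transaction records, and the LLM only phrases the result.

```python
# A sketch of how a Finn-style assistant could ground "how much did I spend
# eating in restaurants in 2024?" in structured transaction history. The
# schema and records are invented for illustration, not bunq's actual data model.
from datetime import date

transactions = [
    {"merchant": "Dishoom London", "category": "restaurant",
     "amount": 42.50, "date": date(2024, 3, 9), "city": "London"},
    {"merchant": "Grocer & Co", "category": "groceries",
     "amount": 18.20, "date": date(2024, 3, 10), "city": "Amsterdam"},
]

# The retrieval step is a structured filter; the LLM only phrases the answer.
spent = sum(t["amount"] for t in transactions
            if t["category"] == "restaurant" and t["date"].year == 2024)
print(f"Restaurant spend in 2024: {spent:.2f}")  # 42.50
```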

    bunq recently hit the 11-million-user mark, which leadership credits to its innovative approach to AI-augmented personal banking.  

    A few other use cases for RAG in fintech include:  

    • Automating KYC/AML document reviews
    • Quickly summarising regulations and contracts
    • Drafting customer/client communications and financial reports
    • Supporting internal helpdesks and compliance teams with fast search and access to up-to-date regulatory documents

    The near and far future of RAG in financial services 

    If you take RAG to its logical conclusion, you could have every internal and external conversation and piece of data in the entire company inside a structured internal AI system: a source of internal knowledge no company has ever had before. 

    According to a UK Finance report in January, 75% of execs at large financial organisations and 85% at small and medium-sized financial organisations are satisfied with generative AI’s impact on ROI. 

    What RAG is doing for AI tools in fintech is the missing puzzle piece that could push those percentages closer to 100. 

    Soon, RAG will be widely used for real-time streaming data and multi-modal inputs, directly accessing live news. A trader's bot assistant will be able to get an accurate answer to questions like “What's the price and news on stock XYZ right now?” immediately. More importantly, that information will inform follow-up actions, both at the trader's initiative and by linking it to other data to suggest next steps. 

    As far as fintech AI tools in 2025 go, investment into RAG-based systems and generative AI is creating real value for businesses, and access to precise, real-time insights will soon become a standard expectation for users, as we’ve seen in cases like bunq’s.

    By starting to invest in RAG now, choosing what to feed systems, and how to architect data access and internal governance policy, small and medium-sized businesses can begin to strategically build up knowledge bases. 

    In the near term, these knowledge stores can be used for ROI-driven applications to improve current standing—like documenting and organising client queries to streamline responses. 

    In the long term, organisations that have built structured internal data and knowledge stores using RAG systems will be able to decide how to combine them, carve them up, or grant tiered access to them, and to build out various kinds of agentic services on top, gaining a strategic advantage in adopting new AI tools and systems. They'll be the ones to spot and seize opportunities in the market more quickly than less organised competitors.  

    Retrieval augmented generation in fintech presents the opportunity for small and medium-sized businesses to bridge the gap to larger competitors and, why not, even take the lead. 


