Financial Services
Revolut is rebuilding its AI stack around a single foundation model. By training PRAGMA, a family of encoder‑style models running on NVIDIA accelerated computing, Revolut is replacing multiple siloed systems across fraud, credit, engagement, and recommendations with one shared behavioral intelligence layer.
Revolut
Nebius
Accelerated Computing Tools & Techniques
3–5× faster model development
2–5× higher training throughput
Major gains over production baselines
Founded in London in 2015 to give people a fairer deal on foreign exchange (FX), Revolut today helps more than 70 million customers in 40 markets manage their money more easily and cheaply, with products spanning banking, payments, FX, credit, and wealth management.
Revolut’s goal with PRAGMA was to learn rich behavioral representations directly from raw event streams — then reuse those embeddings across risk, growth, and product workflows. To do this at foundation‑model scale on financial data, the team needed high‑performance accelerated computing and a mature deep learning software platform.
PRAGMA is a family of transformer-based behavioral models that interprets each customer’s financial journey as a rich temporal signal, similar to how advanced language models interpret sequences of text. It is available in configurations from tens of millions to one billion parameters, enabling deployments that range from ultra‑efficient, low‑latency inference to large‑scale models optimized for maximum predictive accuracy.
The architecture combines three specialized encoders — for user attributes, individual events, and long‑term history — to transform profile data and time‑ordered transaction streams into a unified behavioral representation. To preserve both sequence structure and numerical fidelity, PRAGMA uses a structured tokenization approach that indexes categorical fields, quantizes continuous values such as transaction amounts, and decomposes timestamps into interpretable temporal components.
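The three tokenization steps described above can be illustrated with a minimal sketch. The vocabulary, bin edges, field names, and chosen time components below are hypothetical and picked only for illustration; they are not PRAGMA's actual schema.

```python
import bisect
from datetime import datetime, timezone

# Hypothetical vocabulary and bin edges, chosen only for illustration.
CATEGORY_VOCAB = {"<unk>": 0, "card_payment": 1, "transfer": 2, "atm": 3}
AMOUNT_BIN_EDGES = [1.0, 10.0, 100.0, 1000.0]  # 5 buckets: <1, 1-10, 10-100, ...


def index_category(value: str) -> int:
    """Map a categorical field to an integer ID, falling back to <unk>."""
    return CATEGORY_VOCAB.get(value, CATEGORY_VOCAB["<unk>"])


def quantize_amount(amount: float) -> int:
    """Return the index of the bucket that contains the absolute amount."""
    return bisect.bisect_right(AMOUNT_BIN_EDGES, abs(amount))


def decompose_timestamp(ts: datetime) -> dict:
    """Split a timestamp into separate calendar components."""
    return {"month": ts.month, "day_of_week": ts.weekday(), "hour": ts.hour}


def tokenize_event(event: dict) -> dict:
    """Turn one raw event into a dictionary of discrete token fields."""
    return {
        "type_id": index_category(event["type"]),
        "amount_bin": quantize_amount(event["amount"]),
        **decompose_timestamp(event["timestamp"]),
    }


event = {
    "type": "card_payment",
    "amount": 42.5,
    "timestamp": datetime(2024, 3, 15, 9, 30, tzinfo=timezone.utc),
}
tokens = tokenize_event(event)
```

Keeping each field as its own discrete token, rather than flattening the event into text, is one way to preserve numerical ordering (adjacent amount buckets stay adjacent) while still giving the model a fixed vocabulary.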
Inter‑event timing is modeled with a smooth logarithmic transformation, allowing the system to capture short‑term behaviors and long‑horizon life events within a single temporal framework. NVIDIA Llama‑Nemotron‑Embed‑1B‑v2 is used to embed unstructured text fields such as merchant descriptions, enriching the behavioral signal and delivering a measured 16.1% improvement in credit risk prediction performance.
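The source does not give the exact form of the logarithmic transformation. One common choice, assumed here purely for illustration, is log(1 + Δt / scale), which is smooth at zero and compresses gaps that differ by orders of magnitude into a narrow range. The function name and the 60-second scale are hypothetical.

```python
import math


def log_time_gap(seconds: float, scale: float = 60.0) -> float:
    """Compress a non-negative time gap with log(1 + gap / scale).

    The functional form and the default scale are assumptions, not
    PRAGMA's published configuration.
    """
    if seconds < 0:
        raise ValueError("time gap must be non-negative")
    return math.log1p(seconds / scale)


# Gaps spanning seconds to a year, expressed in seconds.
gaps = {
    "10 seconds": 10,
    "1 hour": 3600,
    "1 day": 86400,
    "1 year": 365 * 86400,
}
features = {label: log_time_gap(s) for label, s in gaps.items()}
```

Under these assumptions a ten-second gap maps to roughly 0.15 and a one-year gap to roughly 13.2, so both rapid bursts of activity and long pauses fall on one compact scale that a single model can consume.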
Source: PRAGMA: Revolut Foundation Model, arXiv:2604.08649, CC BY 4.0.
Before PRAGMA, Revolut followed the same pattern as much of the banking industry: dedicated machine learning pipelines for each task, from fraud detection and credit scoring to marketing response and lifetime value prediction. Each model depended on its own hand‑crafted features, bespoke SQL (structured query language) queries, and ETL (extract, transform, load) jobs, so launching a new use case or entering a new market meant repeating months of feature engineering and validation.
The result was slow experimentation, fragmented decisions, and scaling limits as user growth accelerated. Fraud and credit systems could be working from different representations of the same customer history, making it difficult to optimize risk and growth holistically. At an industry level, there was no general‑purpose architecture that could simultaneously handle heterogeneous banking data, long‑range temporal patterns, and strict privacy constraints while remaining efficient to train.
Revolut pre‑trained PRAGMA on about 26 million user records across 111 countries, spanning roughly 24–40 billion events and 207 billion tokens over around 28 months of history. Training ran on clusters of NVIDIA H100 GPUs on Nebius, with PRAGMA‑S (a 10‑million‑parameter model) converging in about two days on 16 GPUs and larger variants taking roughly two weeks on 16–32 GPUs.
Revolut’s AI engineering team built shard‑based dynamic batching with fixed GPU memory token budgets and leveraged variable‑length attention kernels to reduce padding. These optimizations drove 2–5× higher throughput than padded baselines, maximizing utilization of H100 GPUs and keeping pre‑training within tight time and cost budgets.
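The idea behind a fixed token budget can be sketched in a few lines. This is a simplified, single-shard illustration under assumed names and sequence lengths, not Revolut's implementation: sequences are sorted by length and grouped greedily so that each batch holds at most `max_tokens` real tokens, and the result is compared with fixed-size batches padded to their longest member.

```python
from typing import List


def batch_by_token_budget(lengths: List[int], max_tokens: int) -> List[List[int]]:
    """Greedily group sequence indices so each batch stays within max_tokens."""
    order = sorted(range(len(lengths)), key=lambda i: lengths[i])
    batches: List[List[int]] = []
    current: List[int] = []
    current_tokens = 0
    for i in order:
        n = lengths[i]
        if n > max_tokens:
            raise ValueError(f"sequence {i} exceeds the token budget")
        # Close the current batch when adding this sequence would overflow it.
        if current and current_tokens + n > max_tokens:
            batches.append(current)
            current, current_tokens = [], 0
        current.append(i)
        current_tokens += n
    if current:
        batches.append(current)
    return batches


def padded_tokens(lengths: List[int], batch_size: int) -> int:
    """Tokens processed when fixed-size batches are padded to their longest sequence."""
    total = 0
    for start in range(0, len(lengths), batch_size):
        chunk = lengths[start:start + batch_size]
        total += max(chunk) * len(chunk)
    return total


# Hypothetical per-user sequence lengths, in tokens.
lengths = [12, 480, 35, 900, 60, 7, 250, 1024]
batches = batch_by_token_budget(lengths, max_tokens=1024)
real_tokens = sum(lengths)                      # work done with packed batches
padded = padded_tokens(lengths, batch_size=4)   # work done with naive padding
```

With variable-length attention kernels, a packed batch costs roughly its real token count, whereas the padded baseline pays for every padding position. In this toy example the padded layout processes 7,696 token slots for 2,768 real tokens; the actual gain on production data depends on the length distribution.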
Once pre‑trained, PRAGMA serves as a shared backbone for multiple financial tasks. Teams can either freeze the model and train lightweight linear heads on top of embeddings for rapid experiments or apply LoRA fine‑tuning, updating only 2%–4% of parameters.
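To see why LoRA touches only a small share of parameters, consider the standard formulation: a frozen weight matrix of shape d_out × d_in is augmented with two trainable low-rank matrices, A (rank × d_in) and B (d_out × rank). The sketch below counts those parameters for a hypothetical layer shape and rank; these values are illustrative and are not PRAGMA's configuration.

```python
def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in the low-rank pair A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)


def lora_fraction(d_in: int, d_out: int, rank: int) -> float:
    """Trainable LoRA parameters relative to the frozen d_out x d_in weight."""
    return lora_trainable_params(d_in, d_out, rank) / (d_in * d_out)


# Hypothetical layer shape and rank, chosen only for illustration.
trainable = lora_trainable_params(d_in=1024, d_out=1024, rank=16)
fraction = lora_fraction(d_in=1024, d_out=1024, rank=16)
```

For this assumed layer, the adapter adds 32,768 trainable parameters against 1,048,576 frozen ones, or about 3.1%. The overall share for a full model also depends on which layers receive adapters.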
This design lets fraud, credit, marketing, and product teams spin up new models in days instead of months, often without building new feature pipelines. PRAGMA‑S delivers sub‑second latency for real‑time fraud screening at the point of transaction, while larger variants handle accuracy‑sensitive workloads where latency is less critical.
By consolidating on a PRAGMA backbone running on NVIDIA H100 GPUs, Revolut reports both significant efficiency gains and large performance lifts over strong baselines.
Model development cycles are now roughly 3–5× faster, because teams reuse shared embeddings and adapters rather than hand‑crafting features for each new market or product. On the training side, sequence packing and dynamic batching on H100s deliver 2–5× higher throughput than padded baselines, supporting regular refreshes of the backbone as data and markets evolve.
On Revolut’s internal benchmarks, PRAGMA delivers a 64.7% improvement in fraud recall and a 16.7% lift in fraud precision over the prior production model.
Since all of these improvements happen at the backbone level, every update to PRAGMA flows through to multiple business units — from fraud and credit to marketing and product — compounding AI quality across the company. Data science teams benefit from less duplicated code, simpler governance, and unified monitoring, while the business can enter new markets and launch new features without rebuilding its machine learning stack from scratch.
Revolut plans to expand PRAGMA with more multimodal inputs, enable continual pre‑training so the model keeps learning from new events as they arrive, and broaden downstream use cases to include lifetime value prediction, churn forecasting, and anomaly detection.
On the app side, Revolut is developing a model‑driven interface that adapts in real time to each user’s behavior. On the infrastructure side, the team is exploring AutoML integration, embedding versioning, and a full inference pipeline.
Backed by NVIDIA accelerated computing and research collaboration, Revolut is turning its growing behavioral dataset into a durable competitive advantage, redefining what an AI‑powered global bank can look like.