Private AI infrastructure for a regulated financial institution.
Self-hosted large language models deployed inside the bank's own network. No customer data leaves the environment. Full regulatory governance from day one.
"Most consultancies told us to plug in OpenAI. Stratus built us a private AI platform that keeps our data inside our walls and gives our regulators exactly what they need."
Chief Technology Officer, UK Financial Institution
One platform. Four business lines. Zero data leaving the building.
A UK financial institution wanted to deploy AI across fraud detection, credit risk assessment, regulatory reporting, and customer intelligence. But every option on the market had the same problem. Third-party AI services like OpenAI require customer data to leave the bank's network. Regulators will not accept that. The cost of API calls was already running into six figures monthly with no visibility into usage. And the FCA requires full explainability and audit trails for any AI involved in financial decisions. Off-the-shelf tools could not provide this. The bank needed its own private AI platform, purpose-built for a regulated environment.
| Capability | Before (Third-Party AI APIs) | After (Private AI Platform) |
|---|---|---|
| Data Residency | Customer data sent to external AI providers. No control over where it is processed. | All inference inside the bank's private VPC. Zero data leaves the network boundary. |
| Cost Visibility | Six-figure monthly API spend. No breakdown by team or use case. No cost controls. | Token-level usage tracking per team. FinOps dashboards. Automated chargeback per business line. |
| Model Governance | Black-box AI. No visibility into model behaviour. No audit trail for regulators. | Every prompt and response logged. Model versioning, bias testing, and drift detection automated. |
| Scalability | Rate-limited by third-party provider. Latency spikes during peak periods. | Private GPU fleet with auto-scaling. Consistent sub-200ms inference latency. |
| Regulatory Compliance | No FCA-compliant governance. No explainability. No model cards. | Full model governance framework. FCA-ready audit trail. Explainability built into every response. |
Private AI Platform Architecture
Business applications connect to a centralised model gateway. The gateway routes requests to the optimal model based on task complexity. All inference runs on private GPU instances inside the bank's VPC.
flowchart TB
subgraph VPC["Bank's Private VPC"]
subgraph Apps["Business Applications"]
FD["Fraud\nDetection"]
CR["Credit\nRisk"]
CO["Compliance"]
CI["Customer\nIntel"]
end
FD --> GW["Model Gateway\n(Central Router)"]
CR --> GW
CO --> GW
CI --> GW
subgraph GPU["Private GPU Fleet"]
SM["Small Model\n(Mistral 7B)\nFast Tasks"]
LM["Large Model\n(Llama 70B)\nComplex Tasks"]
FT["Fine-Tuned Models\nBank-Specific"]
end
GW --> SM
GW --> LM
GW --> FT
subgraph GOV["Governance Layer"]
AL["Audit\nLogger"]
MR["Model\nRegistry"]
DM["Drift\nMonitor"]
end
GW --> AL
GW --> MR
GW --> DM
end
style VPC fill:#0B0C10,stroke:#7c3aed,color:#fff
style Apps fill:#1a1a2e,stroke:#7c3aed,color:#fff
style GPU fill:#1a1a2e,stroke:#7c3aed,color:#fff
style GOV fill:#1a1a2e,stroke:#7c3aed,color:#fff
style FD fill:#6b21a8,stroke:#7c3aed,color:#fff
style CR fill:#6b21a8,stroke:#7c3aed,color:#fff
style CO fill:#6b21a8,stroke:#7c3aed,color:#fff
style CI fill:#6b21a8,stroke:#7c3aed,color:#fff
style GW fill:#4c1d95,stroke:#7c3aed,color:#fff
style SM fill:#1a1a2e,stroke:#7c3aed,color:#fff
style LM fill:#1a1a2e,stroke:#7c3aed,color:#fff
style FT fill:#6b21a8,stroke:#7c3aed,color:#fff
style AL fill:#1a1a2e,stroke:#7c3aed,color:#fff
style MR fill:#1a1a2e,stroke:#7c3aed,color:#fff
style DM fill:#1a1a2e,stroke:#7c3aed,color:#fff
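The routing step in the diagram above can be sketched in a few lines of Python. This is a minimal illustration, not the bank's actual gateway logic: the `Request` type, the `route` function, the `needs_bank_context` flag, and the word-count threshold are all hypothetical, and a production router would likely use richer signals than prompt length.

```python
from dataclasses import dataclass

# Hypothetical tier labels mirroring the diagram; a real gateway would map
# these to concrete serving endpoints.
SMALL, LARGE, FINE_TUNED = "small-model", "large-model", "fine-tuned-model"


@dataclass
class Request:
    business_line: str                # e.g. "fraud", "credit-risk"
    prompt: str
    needs_bank_context: bool = False  # assumption: the caller flags bank-specific tasks


def route(req: Request, long_prompt_words: int = 200) -> str:
    """Pick a model tier using a deliberately simple complexity heuristic."""
    if req.needs_bank_context:
        return FINE_TUNED             # bank-specific tasks go to fine-tuned models
    if len(req.prompt.split()) > long_prompt_words:
        return LARGE                  # long prompts are treated as complex
    return SMALL                      # everything else takes the fast path


if __name__ == "__main__":
    print(route(Request("fraud", "Classify this card transaction.")))
```

Keeping the routing decision in one function means each business line calls the same entry point, and the heuristic can be tuned without touching the applications.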
The Private AI Architecture Stack
Every component earns its place by solving a specific infrastructure, cost, or governance challenge. Nothing generic. Nothing unnecessary.
- Amazon EC2 GPU Instances (p4d/g5): Dedicated GPU compute for model inference. Auto-scaling fleet that scales up during business hours and scales down overnight. No shared tenancy.
- Amazon SageMaker Endpoints: Model serving with A/B testing and canary deployments. New model versions rolled out gradually with automatic rollback if accuracy drops.
- Amazon VPC + PrivateLink: Complete network isolation. All traffic stays inside the bank's VPC. No internet egress. Private endpoints for every service.
- Model Gateway (ECS Fargate): Centralised routing layer. Analyses each request and sends it to the best model for the job. Tracks token usage and cost per request.
- Fine-Tuning Pipeline (SageMaker): Automated retraining on the bank's own data. Models improve continuously as new transaction patterns and regulatory guidance emerge.
- Governance Dashboard (OpenSearch + S3): Real-time visibility into model performance, cost per business line, prompt/response logs, and regulatory compliance status.
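To make the gateway's token tracking and per-business-line cost reporting concrete, here is a small Python sketch of a usage ledger. Everything in it is an assumption for illustration: the `UsageLedger` class, the model labels, and the internal rates per 1,000 tokens are hypothetical and do not come from the platform itself.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical internal rates per 1,000 tokens, for illustration only.
RATE_PER_1K = {
    "small-model": Decimal("0.02"),
    "large-model": Decimal("0.20"),
}


class UsageLedger:
    """Accumulates token counts per (team, model) and prices them per team."""

    def __init__(self):
        self._tokens = defaultdict(int)  # (team, model) -> total tokens

    def record(self, team, model, prompt_tokens, completion_tokens):
        self._tokens[(team, model)] += prompt_tokens + completion_tokens

    def chargeback(self):
        """Return the cost attributed to each team as a dict of Decimals."""
        totals = defaultdict(Decimal)
        for (team, model), tokens in self._tokens.items():
            totals[team] += RATE_PER_1K[model] * tokens / 1000
        return dict(totals)


if __name__ == "__main__":
    ledger = UsageLedger()
    ledger.record("fraud", "small-model", prompt_tokens=1000, completion_tokens=500)
    ledger.record("fraud", "large-model", prompt_tokens=2000, completion_tokens=0)
    print(ledger.chargeback())
```

`Decimal` is used rather than floats so that amounts add up exactly when they feed a finance report.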
What the AI actually does, every day
Six capabilities running on one platform. Each one replaces a manual process, reduces risk, or surfaces intelligence that was previously invisible to the business.
- Transaction Monitoring
- Credit Risk Assessment
- Regulatory Document Analysis
- Customer Intelligence
- Network Analysis for Money Laundering
- Predictive Model Drift Detection
From scattered API calls to one governed platform
Four business lines now access AI through a single gateway. Every request is routed, every token is tracked, and every model is versioned for regulatory traceability.
- AI as a Governed Internal Service: Four business lines access AI through one central gateway. No team runs their own models. No shadow AI. No ungoverned experiments. One platform, fully controlled.
- Cost Under Control: Token-level tracking per team and use case. Monthly FinOps reports. AI spend dropped 70% compared to external API calls.
- Continuous Improvement: Models retrained weekly on new data. Performance monitored for drift. The platform gets smarter without manual intervention.
- Private GPU infrastructure on EC2 with auto-scaling and spot instance optimisation
- Model gateway with intelligent routing, token tracking, and cost allocation
- Fine-tuning pipeline for bank-specific model training on internal data
- Governance framework with audit logging, model cards, bias testing, and drift detection
- FinOps dashboard with per-team cost visibility and chargeback reporting
- Onboarding playbook for new business lines to connect in under one week
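Drift detection is named above without a specific method. One common choice in credit risk is the Population Stability Index (PSI), sketched below in plain Python. This is an assumption about what such a monitor could look like, not a description of the platform's own detector; the `psi` function, the bin count, and the sample data are hypothetical.

```python
import math


def psi(expected, actual, bins=10, eps=1e-6):
    """Population Stability Index between a baseline sample and a new sample.

    Bins are equal-width over the baseline's range; new values outside that
    range are clamped into the first or last bin.
    """
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins
    if width == 0:
        width = 1.0  # degenerate baseline: every value lands in the first bin

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # eps avoids log(0) and division by zero for empty bins
        return [max(c / len(values), eps) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


if __name__ == "__main__":
    baseline = [i / 100 for i in range(100)]       # scores at validation time
    shifted = [0.5 + i / 200 for i in range(100)]  # scores now skew high
    print(psi(baseline, baseline), psi(baseline, shifted))
```

A widely quoted rule of thumb treats a PSI above roughly 0.25 as a significant shift, though the alerting threshold is a policy decision for the model risk team.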
Built for the regulator, not retrofitted
The FCA expects institutions using AI in financial decisions to demonstrate explainability, fairness, and accountability. The PRA requires operational resilience of critical AI systems. This platform was designed to satisfy both from the ground up.
- FCA Explainability Requirements: Every AI-assisted decision includes a human-readable explanation. Credit decisions, fraud alerts, and compliance flags all come with clear rationale.
- PRA Operational Resilience: Private GPU fleet with multi-AZ deployment. No single point of failure. If one GPU instance fails, traffic routes automatically to healthy instances.
- Model Risk Management: Full model lifecycle governance. Version control, performance baselines, bias testing, and automated drift detection. Model cards maintained for every model in production.
- FCA audit-ready from day one. Full prompt/response logs, model versioning, and decision traceability available on demand.
- Zero data sovereignty incidents. No customer data has left the bank's network boundary since go-live.
- 70% cost reduction. Consolidated four separate AI vendor contracts into one internal platform with full cost visibility.
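As an illustration of what a traceable prompt/response log entry might contain, the Python sketch below builds one audit record per request. The field names, the `build_audit_record` and `verify` helpers, and the SHA-256 checksum are assumptions added for this example; the source does not describe the actual log schema.

```python
import hashlib
import json
from datetime import datetime, timezone


def _digest(body):
    """Checksum over a canonical (sorted-key) JSON encoding of the record."""
    payload = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def build_audit_record(team, model_name, model_version, prompt, response, explanation):
    """Assemble one log entry tying a decision to a model version and rationale."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "team": team,
        "model": model_name,
        "model_version": model_version,
        "prompt": prompt,
        "response": response,
        "explanation": explanation,  # human-readable rationale for the decision
    }
    record["sha256"] = _digest(record)  # assumption: checksum to flag later edits
    return record


def verify(record):
    """Recompute the checksum and compare it with the stored value."""
    body = {k: v for k, v in record.items() if k != "sha256"}
    return _digest(body) == record["sha256"]


if __name__ == "__main__":
    rec = build_audit_record(
        "fraud", "small-model", "1.4.0",
        "Is this transaction suspicious?", "Flagged for review.",
        "Amount is far above the customer's usual spending range.",
    )
    print(verify(rec))
```

A checksum stored alongside the record only catches accidental or careless changes; stronger tamper evidence would need write-once storage or signing, which is outside this sketch.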
Ready to Own Your AI Infrastructure?
Third-party AI APIs create data sovereignty risk and uncontrolled costs. We build private AI platforms that keep your data inside your walls and give your regulators exactly what they need.