April 10, 2026 4 min read

Multi-Region Kafka for Global Financial Services

Architecting Apache Kafka across financial data centres. Geo-replication, compliance boundaries, and disaster recovery for global trading and risk systems.

A global investment bank running trading operations across London, New York, Singapore, and Tokyo needs a messaging infrastructure that treats each region as both an independent operational domain and a participant in a global data mesh. Kafka geo-replication across financial data centres requires solving challenges that most Kafka documentation does not address.

We have deployed Kafka across multi-region architectures for tier-one banks. Here is what we learned about keeping trades flowing between London and Singapore while satisfying data residency requirements in each jurisdiction.

Who Is This Guide For?

This guide is for data platform engineers and architects at global financial institutions designing Kafka infrastructure across multiple regions. If you need streaming data to cross regulatory boundaries while maintaining operational reliability, this is for you.

By the End of This, You’ll Know…

The difference between active-passive (active-standby) and active-active Kafka architectures
How to implement geo-replication that satisfies data residency requirements
How to handle disaster recovery across regions without message loss

Kafka Architecture Patterns

Active-Passive (Most Common in Regulated Environments)

One region (e.g., London) is the primary producer and consumer of critical trading topics. Other regions consume replicated copies for risk aggregation, reporting, and disaster recovery.

[London] → Primary Kafka → MirrorMaker 2
                             ↓
[Singapore] → Replicated Kafka → Risk aggregation, DR

The trade-off: London is a single point of failure for trade submission, but data residency compliance is straightforward because all writes originate in one jurisdiction and only topics approved for replication leave it.

Active-Active (Higher Complexity, Lower Latency)

Each region independently produces and consumes local trades. Replicated topics carry a global view for risk aggregation and compliance reporting.

[London] → Local trades → MM2 → Global risk topics ← MM2 ← [Singapore]
[New York] → Local trades → MM2 → Global risk topics ← MM2 ← [Singapore]

The challenge: conflict resolution for trades that span regions. We use globally unique IDs (UUIDs based on exchange, timestamp, and venue) to ensure no two regions produce the same trade ID.
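One way to derive such IDs is a name-based UUID (version 5) computed over the fields that identify a trade. The sketch below is illustrative only: the namespace string, the `region` and `sequence` fields, and the `trade_id` helper are all hypothetical names. The per-venue sequence number is an assumption added here, because exchange, venue, and timestamp alone could collide for two trades in the same instant.

```python
import uuid

# Hypothetical namespace; any fixed UUID shared by all regions would do.
TRADE_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "trades.example.com")


def trade_id(region: str, exchange: str, venue: str,
             timestamp_ns: int, sequence: int) -> uuid.UUID:
    """Derive a deterministic UUIDv5 from the fields that identify a trade."""
    # "|" is assumed not to appear inside any field value.
    name = "|".join([region, exchange, venue, str(timestamp_ns), str(sequence)])
    return uuid.uuid5(TRADE_NAMESPACE, name)


london = trade_id("LDN", "LSE", "XLON", 1775815200000000000, 42)
singapore = trade_id("SGP", "SGX", "XSES", 1775815200000000000, 42)

# Same inputs always give the same ID, so a retried publish maps to one trade.
assert london == trade_id("LDN", "LSE", "XLON", 1775815200000000000, 42)
# Different regions give different IDs even at the same timestamp and sequence.
assert london != singapore
```

Because the ID is a pure function of its inputs, consumers of the global risk topics can deduplicate on it without coordinating across regions.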

MirrorMaker 2 Configuration

MirrorMaker 2 is the standard tool for Kafka geo-replication. Key configuration parameters for financial workloads:

# Replication policy: DefaultReplicationPolicy prefixes each replicated topic
# with the source cluster alias (e.g. london-trades). To keep topic names
# unchanged, use IdentityReplicationPolicy instead.
replication.policy.separator=-
replication.policy.class=org.apache.kafka.connect.mirror.DefaultReplicationPolicy

# Sync topic configurations and ACLs from source to target
sync.topic.configs.enabled=true
sync.topic.acls.enabled=true

# Heartbeat monitoring between clusters
emit.heartbeats.interval.seconds=5

# Checkpoints for consumer group offset sync
# (a replication factor of 1 would lose checkpoints if a single broker fails)
checkpoints.topic.replication.factor=3
emit.checkpoints.interval.seconds=60
sync.group.offsets.enabled=true
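The choice of separator matters more than it looks. The Python sketch below is a simplified model of prefix-based topic naming as we understand it, not MM2's own code; the function names `format_remote_topic` and `topic_source` and the topic `fx-trades` are our own illustrations. It shows how a `-` separator can make a local topic whose name contains a hyphen look like a replicated one.

```python
from typing import Optional


def format_remote_topic(source_alias: str, topic: str, separator: str = ".") -> str:
    """Name a replicated topic gets on the target cluster."""
    return f"{source_alias}{separator}{topic}"


def topic_source(topic: str, separator: str = ".") -> Optional[str]:
    """Guess the source cluster alias from a topic name, or None if it looks local."""
    if separator not in topic:
        return None
    return topic.split(separator)[0]


# With "-" as separator, London's "trades" topic appears in Singapore as:
assert format_remote_topic("london", "trades", "-") == "london-trades"
assert topic_source("london-trades", "-") == "london"

# A purely local topic that happens to contain "-" is misread as remote:
assert topic_source("fx-trades", "-") == "fx"
# With "." as separator, the same local topic is recognised as local:
assert topic_source("fx-trades", ".") is None
```

If your topic names already contain hyphens, a separator that never appears in local topic names avoids this ambiguity.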

Data Residency Controls

Regulatory compliance requires that certain data never leaves its jurisdiction. Implement a rules engine at the MirrorMaker level:

Permit list: Topics that can be replicated (risk, research, backtesting)
Deny list: Topics that cannot be replicated (client trades, settlement data, PII)
Transform: Topics where PII must be anonymised before replication
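The three rule types above can be sketched as a small classifier. This is a minimal illustration under stated assumptions, not a production rules engine: the topic-name patterns, the `positions.` prefix, the PII field names, and the `classify` and `anonymise` helpers are all hypothetical. Unknown topics are blocked by default, which we assume is the safer failure mode in a regulated environment.

```python
import re
from enum import Enum


class Action(Enum):
    REPLICATE = "replicate"
    ANONYMISE = "anonymise"
    BLOCK = "block"


# Hypothetical topic-naming patterns; first match wins, deny rules come first.
RULES = [
    (re.compile(r"^(client-trades|settlement|pii)\."), Action.BLOCK),
    (re.compile(r"^positions\."), Action.ANONYMISE),
    (re.compile(r"^(risk|research|backtesting)\."), Action.REPLICATE),
]

PII_FIELDS = {"client_name", "account_id"}  # assumed field names


def classify(topic: str) -> Action:
    """Return the replication action for a topic; unknown topics are blocked."""
    for pattern, action in RULES:
        if pattern.search(topic):
            return action
    return Action.BLOCK


def anonymise(record: dict) -> dict:
    """Drop PII fields from a record before it crosses the boundary."""
    return {k: v for k, v in record.items() if k not in PII_FIELDS}


assert classify("risk.var.daily") is Action.REPLICATE
assert classify("client-trades.emea") is Action.BLOCK
assert classify("unlisted.topic") is Action.BLOCK
assert anonymise({"client_name": "ACME", "notional": 1000000}) == {"notional": 1000000}
```

In MM2 itself, the permit and deny lists correspond to the `topics` and `topics.exclude` connector properties. Field-level masking could be done with a Kafka Connect single message transform, though that depends on the record format being deserialisable in the replication path.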

Disaster Recovery Patterns

Regional Cluster Failure

When the primary region’s Kafka cluster fails:

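Promote the secondary region’s cluster to primary: redirect producers to it and resume each consumer group from its translated offsets (MM2 checkpoints map source-cluster offsets to offsets on the replica)
Risk consumers fail over to the secondary cluster via DNS-based routing
On recovery, the original primary catches up from the promoted cluster via MM2, which requires a replication flow in the reverse direction, before it takes traffic back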
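Promotion depends on mapping each consumer group's committed offset on the failed cluster to an offset on the replica, because offsets are not identical across clusters. The sketch below is a simplified model of that translation, not MM2's actual implementation: the `translate_offset` helper and the sample offset pairs are hypothetical. It resumes from the latest known sync point at or before the committed offset, so some records may be re-delivered but none are skipped.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple

# (upstream_offset, downstream_offset) pairs for one partition,
# sorted by upstream offset.
OffsetSync = Tuple[int, int]


def translate_offset(syncs: List[OffsetSync],
                     committed_upstream: int) -> Optional[int]:
    """Map a committed offset on the failed cluster to a resume point on the replica.

    Returns the downstream offset of the latest sync at or before the
    committed offset, or None if no such sync exists.
    """
    upstream_offsets = [up for up, _ in syncs]
    idx = bisect_right(upstream_offsets, committed_upstream) - 1
    if idx < 0:
        return None
    return syncs[idx][1]


syncs = [(0, 0), (100, 95), (200, 190)]
assert translate_offset(syncs, 150) == 95    # rewinds to the sync at upstream 100
assert translate_offset(syncs, 200) == 190   # exact match on a sync point
assert translate_offset([(100, 95)], 50) is None  # nothing old enough to use
```

Because the resume point can be earlier than the true position, downstream risk consumers should be idempotent, for example by deduplicating on trade ID.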

Multi-Region Data Loss

In the worst case (region data loss), recovery relies on:

Topic-level replication: MM2 maintains copies in the secondary cluster
S3/object store: Cold storage backup with configurable retention (typically 90 days)
Replay: Restore from object store using Kafka Connect S3 Source Connector

What You Can Actually Use Today

Tool	Purpose	Licence
MirrorMaker 2	Kafka geo-replication	Apache 2.0
Confluent Cluster Linking	Managed geo-replication	Commercial (Confluent)
Kafka Connect	Data integration	Apache 2.0

FAQ

How much latency does geo-replication add? MirrorMaker 2 typically adds 100-500 ms of end-to-end latency between regions, depending on physical distance and bandwidth. For risk aggregation at minute-level granularity, this is acceptable. For real-time order routing across regions, have clients connect directly to the remote cluster rather than going through MM2.

Can I use a single Kafka cluster across multiple regions? You can, but you should not. A single cluster stretched across an ocean adds an inter-region round trip to every produce request that waits on in-sync replicas, and to every consume request served by a remote broker. Controller election also becomes unreliable across geographic distances. Multi-cluster with MM2 is the standard pattern for financial services.

How do I handle schema evolution across regions? Use a central Schema Registry in a primary region. Secondary regions access the registry via read-only replicas. Schema changes are approved through a governance process and deployed first to the primary region, then propagated to downstream consumers.
