FIX Protocol Best Practices for Institutional Trading

FIX protocol implementation patterns for capital markets. Session management, certification, and production operations from institutional experience.

If you have ever operated a FIX engine in production, you know that the standard is not standard. Every venue speaks a slightly different dialect, even when it nominally implements FIX 4.4. The same tag can carry different meanings on different venues. Session disconnect recovery differs between CME and LSEG. Drop copy behaviour varies between brokers.

We have certified and deployed FIX engines across ICE, CME, LSEG, Eurex, and crypto venues. Here are the patterns that work in production and the ones that cause the most incidents.

Who Is This Guide For?

This guide is for trading system engineers, integration leads, and operations teams who build or maintain FIX connectivity. If you are preparing for an exchange certification or troubleshooting production FIX issues, this is for you.

By the End of This, You’ll Know…

  • How to design a FIX session management layer that survives network partitions
  • The certification workflow we use across multiple venues
  • Common production failure modes and how to detect them before they cause incidents
  • How to architect a drop copy and allocation system for multi-broker workflows

Session Management

A FIX session runs over a TCP connection, with the FIX session-layer protocol on top. A single session can outlive any one TCP connection, which is why recovery matters. The key parameters are:

  • Heartbeat interval (HeartBtInt, tag 108): Typically 30 seconds for institutional connections. Lower intervals use more bandwidth but detect disconnections faster.
  • Resend requests: When the receiving party detects a sequence number gap, it sends a ResendRequest. The sending party replays the missing messages from its persistent store.
  • Sequence number recovery: On reconnect, both sides must resume from the correct next inbound and outbound sequence numbers. This is where most session management bugs appear.

The pattern we use:

[Session Manager] → Heartbeat monitor (HeartBtInt × 3)
                   → Gap detector (seq num tracker)
                   → Resend request handler
                   → Disconnect/reconnect state machine
                   → Session state persistence
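
As a rough illustration of the first two components, the sketch below combines a gap detector with a heartbeat staleness check. The class name `InboundTracker` and its methods are hypothetical, not part of any FIX engine, and the handling of sequence numbers lower than expected (which depends on PossDupFlag) is deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class InboundTracker:
    """Hypothetical helper: tracks the next expected inbound MsgSeqNum (tag 34)."""
    heartbeat_interval: float = 30.0  # HeartBtInt (tag 108), in seconds
    next_expected: int = 1
    last_received_at: float = 0.0

    def on_message(self, seq_num, now):
        """Return the (begin, end) range to put in a ResendRequest, or None."""
        self.last_received_at = now
        if seq_num == self.next_expected:
            self.next_expected += 1
            return None
        if seq_num > self.next_expected:
            # Gap: messages next_expected..seq_num-1 were missed.
            # next_expected stays put until the replay fills the gap.
            return (self.next_expected, seq_num - 1)
        # Lower than expected: only legitimate with PossDupFlag (tag 43) set.
        # That branch is out of scope for this sketch.
        return None

    def is_stale(self, now):
        """True when nothing has arrived for more than HeartBtInt x 3."""
        return now - self.last_received_at > 3 * self.heartbeat_interval


tracker = InboundTracker()
tracker.on_message(1, now=0.0)          # in sequence, no action
gap = tracker.on_message(4, now=1.0)    # messages 2 and 3 are missing
print(gap)
```

Passing `now` in explicitly, rather than reading the clock inside the class, keeps the logic easy to test against recorded session logs.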

Disconnect Recovery State Machine

                      ┌─────────────┐
                      │  ACTIVE     │
                      └──────┬──────┘
                             │ HeartBtInt × 3 expired
                      ┌─────────────┐
                      │ RECOVERY    │──→ Attempt reconnect, resend gap
                      └──────┬──────┘
                             │ Reconnect succeeds
                      ┌─────────────┐
                      │ SYNCING     │──→ ResendRequest for missed messages
                      └──────┬──────┘
                             │ Sequence numbers match
                      ┌─────────────┐
                      │  ACTIVE     │
                      └─────────────┘

If recovery takes longer than the venue’s session timeout, the entire session must be re-established. Some venues require a new session ID after prolonged disconnection.
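
The recovery diagram can be expressed as a small transition table. This is a minimal sketch under stated assumptions: the state names come from the diagram, while the event names and the `CONNECTION_LOST` transition (a drop while still syncing) are illustrative additions that the diagram does not show.

```python
from enum import Enum


class State(Enum):
    ACTIVE = "ACTIVE"
    RECOVERY = "RECOVERY"
    SYNCING = "SYNCING"


class Event(Enum):
    HEARTBEAT_TIMEOUT = "heartbeat_timeout"  # HeartBtInt x 3 expired
    RECONNECTED = "reconnected"              # reconnect succeeds
    SEQ_NUMS_MATCH = "seq_nums_match"        # resend completed
    CONNECTION_LOST = "connection_lost"      # assumption: drop while syncing


# (current state, event) -> next state
TRANSITIONS = {
    (State.ACTIVE, Event.HEARTBEAT_TIMEOUT): State.RECOVERY,
    (State.RECOVERY, Event.RECONNECTED): State.SYNCING,
    (State.SYNCING, Event.SEQ_NUMS_MATCH): State.ACTIVE,
    (State.SYNCING, Event.CONNECTION_LOST): State.RECOVERY,
}


def next_state(state, event):
    """Return the new state, or raise if the event is invalid in this state."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"{event.name} is not valid in state {state.name}") from None


state = State.ACTIVE
for event in (Event.HEARTBEAT_TIMEOUT, Event.RECONNECTED, Event.SEQ_NUMS_MATCH):
    state = next_state(state, event)
print(state.name)
```

Rejecting unexpected events loudly, instead of ignoring them, tends to surface session bugs during certification rather than in production.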


Exchange Certification

Each venue has its own certification process and test environment. The common elements are:

  1. Connection test: Verify logon with the correct credentials, heartbeat interval, and protocol version.
  2. Order flow test: Submit a sequence of order types (market, limit, stop, and algorithmic orders) and verify that the execution reports match expectations.
  3. Recovery test: Disconnect and reconnect mid-sequence. Verify the exchange replays the correct messages on reconnect.
  4. Drop copy test: Verify fills are correctly reported to the drop copy session (for brokers).

Common Certification Failures

Failure                       | Root Cause                                  | Fix
Logon rejected                | Wrong HeartBtInt or protocol version        | Align with exchange specification
Seq num mismatch on reconnect | Client reset seq numbers on disconnect      | Persist seq numbers across restarts
Missing execution reports     | Wrong tag 150 (ExecType) for order fills    | Check FIX dictionary for venue-specific values
Drop copy not arriving        | Drop copy session configured on wrong port  | Verify venue’s drop copy session parameters
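
The sequence number fix deserves a concrete shape. Below is a minimal sketch of a file-backed store that survives process restarts. The class name `SeqNumStore` and the JSON layout are assumptions for illustration. Production engines such as QuickFIX/J ship their own message stores, which you would normally use instead.

```python
import json
import os
import tempfile


class SeqNumStore:
    """Hypothetical store for the next outbound and inbound sequence numbers."""

    def __init__(self, path):
        self.path = path

    def load(self):
        """Return (next_out, next_in); a missing file means a fresh session."""
        try:
            with open(self.path, "r", encoding="utf-8") as f:
                data = json.load(f)
            return data["next_out"], data["next_in"]
        except FileNotFoundError:
            return 1, 1

    def save(self, next_out, next_in):
        """Write to a temp file, then rename, so a crash never leaves a torn file."""
        directory = os.path.dirname(os.path.abspath(self.path))
        fd, tmp_path = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump({"next_out": next_out, "next_in": next_in}, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, self.path)  # atomic rename on the same filesystem


with tempfile.TemporaryDirectory() as workdir:
    store = SeqNumStore(os.path.join(workdir, "session.seq"))
    store.save(next_out=42, next_in=17)
    print(store.load())
```

Syncing on every message is costly on a busy session, so how often to call `save` is a trade-off between throughput and how much replay you accept after a crash.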

Production Operations

Monitoring

What to monitor on every FIX session:

  • Session state: Is the session ACTIVE, RECOVERING, or DISCONNECTED? Alert on any state that is not ACTIVE for more than 60 seconds.
  • Sequence number gap: A growing gap between the expected and the received sequence number means messages are being missed and not yet replayed. Investigate before the gap exceeds the venue’s resend request limits.
  • Latency: Monitor round-trip time from order submission to execution report, broken down by venue.
  • Message counts: Sudden drops in message volume may indicate exchange-side issues or connectivity problems.
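
The session state rule is simple enough to sketch directly. The class below is a hypothetical illustration, not tied to any monitoring product: it records when a session last changed state and flags any session that has been outside ACTIVE for more than 60 seconds.

```python
from dataclasses import dataclass

ALERT_AFTER_SECONDS = 60.0  # threshold from the session state rule above


@dataclass
class SessionHealth:
    """Hypothetical per-session record fed by the session manager."""
    session_id: str
    state: str = "ACTIVE"
    state_since: float = 0.0

    def set_state(self, state, now):
        # Only reset the timer on a real change, so repeated reports of the
        # same non-ACTIVE state keep accumulating time.
        if state != self.state:
            self.state = state
            self.state_since = now

    def should_alert(self, now):
        return (self.state != "ACTIVE"
                and now - self.state_since > ALERT_AFTER_SECONDS)


health = SessionHealth("venue-session-1")
health.set_state("RECOVERING", now=100.0)
print(health.should_alert(now=130.0), health.should_alert(now=161.0))
```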

Incident Pattern: Orphaned Orders

A common production issue: a FIX session disconnects after an order is submitted but before the execution report is received. The order fills on the exchange, but the OMS never learns about it. This is called an orphaned order.

Mitigation: Run a session-level reconciliation at the end of each trading day. Compare the orders you submitted (from your FIX session log) with the executions the exchange reports (from its drop copy feed). Treat every unmatched item as a potential orphaned order and investigate it.
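
One way to frame the reconciliation is as a comparison of filled quantity per ClOrdID (tag 11). The sketch below assumes both sources have already been reduced to `(cl_ord_id, last_qty)` pairs. The function names and that input shape are illustrative assumptions, and a real reconciliation would also compare prices, sides, and symbols.

```python
from collections import defaultdict


def total_filled(fills):
    """Sum LastQty (tag 32) per ClOrdID over (cl_ord_id, last_qty) pairs."""
    totals = defaultdict(int)
    for cl_ord_id, last_qty in fills:
        totals[cl_ord_id] += last_qty
    return dict(totals)


def reconcile(session_log_fills, drop_copy_fills):
    """Return {cl_ord_id: (our_qty, venue_qty)} for every mismatch."""
    ours = total_filled(session_log_fills)
    theirs = total_filled(drop_copy_fills)
    breaks = {}
    for cl_ord_id in sorted(ours.keys() | theirs.keys()):
        our_qty = ours.get(cl_ord_id, 0)
        venue_qty = theirs.get(cl_ord_id, 0)
        if our_qty != venue_qty:
            breaks[cl_ord_id] = (our_qty, venue_qty)
    return breaks


session_log = [("ORD-1", 100), ("ORD-2", 50)]
drop_copy = [("ORD-1", 100), ("ORD-2", 50), ("ORD-3", 200)]
print(reconcile(session_log, drop_copy))  # ORD-3 filled without our knowledge
```

An entry where our quantity is zero and the venue quantity is not is the orphaned order case described above.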


What You Can Actually Use Today

Tool        | Purpose                                  | Source
QuickFIX/J  | FIX engine (Java)                        | Open source
Aeron       | Low-latency transport for FIX gateways   | Open source
Fiximulator | FIX session testing and simulation       | Open source

FAQ

Which FIX version should I use? FIX 4.4 is still the most widely supported version across institutional venues. FIX 5.0 SP2 adds support for algorithmic trading but has lower venue adoption. Use FIX 4.4 unless a specific feature in 5.0 SP2 is required.

How do I handle FIX tag customisation per venue? Use a FIX dictionary per venue. Most venues provide machine-readable FIX dictionaries. Parse them at build time to generate session-specific validation rules.
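
As a small illustration of the per-venue dictionary idea, the sketch below reads the allowed enum values from a QuickFIX-style XML dictionary and checks a message against them. It assumes the venue dictionary follows the QuickFIX `<fields>/<field>/<value enum="...">` layout. The sample dictionary is invented for the example, and the check runs at load time rather than as a build-time code generator.

```python
import xml.etree.ElementTree as ET

# Invented sample in the QuickFIX dictionary layout; a real venue file is larger.
SAMPLE_DICTIONARY = """
<fix major="4" minor="4">
  <fields>
    <field number="150" name="ExecType" type="CHAR">
      <value enum="0" description="NEW"/>
      <value enum="F" description="TRADE"/>
    </field>
    <field number="55" name="Symbol" type="STRING"/>
  </fields>
</fix>
"""


def load_allowed_values(dictionary_xml):
    """Map tag number -> set of allowed enum strings, for enumerated fields only."""
    root = ET.fromstring(dictionary_xml)
    allowed = {}
    for field in root.findall("./fields/field"):
        enums = {value.get("enum") for value in field.findall("value")}
        if enums:
            allowed[int(field.get("number"))] = enums
    return allowed


def invalid_fields(message_fields, allowed):
    """Return the (tag, value) pairs whose value the venue dictionary rejects."""
    return [(tag, value) for tag, value in message_fields
            if tag in allowed and value not in allowed[tag]]


rules = load_allowed_values(SAMPLE_DICTIONARY)
print(invalid_fields([(150, "2"), (55, "ESZ4")], rules))
```

Fields without enumerated values, such as Symbol here, pass through unchecked.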

What is the difference between a FIX engine and a FIX gateway? A FIX engine is the protocol library that handles session management and message encoding/decoding. A FIX gateway is a standalone service that translates between FIX and your internal order management protocol. We typically build gateways with Aeron for the internal transport layer and FIX sessions for the external exchange-facing side.


Further Reading

For a deeper discussion of trading systems architecture, see our Trading & Market Systems Engineering service page.