HSBC UK Software Engineer Interview: Distributed Trading Matching Engine Design

汇丰银行英国SWE面试：分布式交易撮合引擎设计

20 December 2024

Anonymous Candidate

2025 HSBC UK Software Engineer Interviewee

摘要 Summary

An advanced system design interview from HSBC Global Banking & Markets, featuring a horizontally scalable distributed trading matching engine with sharding and consistent hashing.

汇丰银行全球银行与市场部软件工程师系统设计面试实录，详解基于分片和一致性哈希的分布式撮合引擎架构。

Case Background| 案例背景

The system design question in the final-round SWE interview at HSBC Global Banking & Markets (GBM) was to design a "Distributed Trading Matching Engine." Unlike the single-machine version I had been asked about at another bank, the interviewer explicitly required the system to be horizontally scalable, meaning that throughput should grow linearly as machines are added.

汇丰全球银行与市场部（GBM）的SWE终面，系统设计题是设计一个「分布式交易撮合引擎」。与我在另一家银行面试时遇到的「单机版」撮合引擎不同，面试官明确要求系统必须是「水平可扩展的」，能够通过增加机器来线性提升系统的吞吐量。

This requirement means that a single instance of the traditional single-threaded in-memory matching model is no longer enough. The solution I proposed was a distributed architecture based on sharding and consistent hashing.

这个要求使得传统的单线程内存撮合模型不再适用。我提出的方案是基于「分片」和「一致性哈希」的分布式架构。

Core Design Approach| 核心设计思路

1. Sharding by Trading Pair| 1. 交易对分片

I would assign different trading pairs (e.g., BTC/USD, ETH/USD, HSBC/HKD) to different matching engine instances. For example, all BTC/USD orders go to Node A and all ETH/USD orders go to Node B. Each instance then maintains order books for only a subset of the trading pairs, which greatly reduces the memory and compute load on any single node.

我会将不同的交易对（比如BTC/USD, ETH/USD, HSBC/HKD）分散到不同的撮合引擎实例上。例如，所有BTC/USD的订单都发送到节点A；所有ETH/USD的订单都发送到节点B。这样，每个撮合引擎实例都只需要维护一部分交易对的订单簿，极大地降低了单个节点的内存和计算压力。
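To make the sharding idea concrete, here is a minimal sketch of one matching engine instance that owns a fixed set of trading pairs and keeps one in-memory order book per pair. It is an illustration under assumptions, not the design discussed in the interview: the names `OrderBook` and `MatchingEngine` are hypothetical, only limit orders are handled, prices are integer ticks, and matching uses simple price-time priority.

```python
import heapq
from dataclasses import dataclass, field
from itertools import count


@dataclass
class OrderBook:
    """Price-time priority book for a single trading pair (limit orders only)."""
    bids: list = field(default_factory=list)   # heap keyed by -price (best bid first)
    asks: list = field(default_factory=list)   # heap keyed by +price (best ask first)
    _seq: count = field(default_factory=count)  # arrival counter, breaks price ties

    def submit(self, side, price, qty):
        """Match an incoming order; return fills as (resting_price, quantity)."""
        own, other = (self.bids, self.asks) if side == "buy" else (self.asks, self.bids)
        # Heap keys are signed so that "key <= limit" means the prices cross.
        limit = price if side == "buy" else -price
        fills = []
        while qty > 0 and other and other[0][0] <= limit:
            resting = other[0]                  # [key, seq, remaining_qty]
            traded = min(qty, resting[2])
            fills.append((abs(resting[0]), traded))
            qty -= traded
            resting[2] -= traded
            if resting[2] == 0:
                heapq.heappop(other)
        if qty > 0:                             # rest the unfilled remainder
            key = -price if side == "buy" else price
            heapq.heappush(own, [key, next(self._seq), qty])
        return fills


class MatchingEngine:
    """One shard: holds order books only for the pairs assigned to it."""

    def __init__(self, name, symbols):
        self.name = name
        self.books = {symbol: OrderBook() for symbol in symbols}

    def submit(self, symbol, side, price, qty):
        if symbol not in self.books:
            raise KeyError(f"{self.name} does not own {symbol}")
        return self.books[symbol].submit(side, price, qty)
```

In this sketch a node created as `MatchingEngine("node-a", ["BTC/USD"])` rejects an ETH/USD order with a `KeyError`, which is the behaviour the routing layer is meant to make unreachable in practice.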

2. Routing Layer| 2. 路由层

A routing layer sits in front of the matching engine cluster. When a customer's order (e.g., a BTC/USD buy order) enters the system, the routing layer forwards it, based on its trading pair, to the matching engine instance responsible for that pair. I would implement this routing logic with consistent hashing. The benefit is that when a node is added to or removed from a cluster of N nodes, only about 1/N of the trading pairs have to move, rather than nearly all of them as with a plain hash-modulo-N scheme.

在撮合引擎集群的前面，需要有一个智能的路由层。当一个客户的订单进入系统时，路由层需要根据订单的交易对，准确地将它转发到负责处理该交易对的撮合引擎实例上。我会使用「一致性哈希」算法来实现这个路由逻辑。这样做的好处是，当集群中增加或减少节点时，只需要迁移一小部分数据，而不会导致大规模的数据洗牌。
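The routing logic described above can be sketched as a small consistent hash ring. This is a minimal illustration, not the implementation from the interview: the class name `HashRing`, the use of SHA-256 for ring positions, and the default of 100 virtual nodes per physical node are all assumptions chosen for the example.

```python
import bisect
import hashlib


class HashRing:
    """Consistent hash ring mapping trading pairs to matching engine nodes."""

    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes   # virtual nodes per physical node, smooths the load
        self._points = []      # sorted ring positions
        self._owners = []      # _owners[i] is the node that owns _points[i]
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _hash(key):
        # First 8 bytes of SHA-256 as an integer ring position.
        return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")

    def add_node(self, node):
        for i in range(self.vnodes):
            point = self._hash(f"{node}#{i}")
            idx = bisect.bisect(self._points, point)
            self._points.insert(idx, point)
            self._owners.insert(idx, node)

    def remove_node(self, node):
        kept = [(p, o) for p, o in zip(self._points, self._owners) if o != node]
        self._points = [p for p, _ in kept]
        self._owners = [o for _, o in kept]

    def route(self, symbol):
        """Return the node that owns the first ring position clockwise of the symbol."""
        if not self._points:
            raise LookupError("ring has no nodes")
        idx = bisect.bisect_right(self._points, self._hash(symbol)) % len(self._points)
        return self._owners[idx]
```

With a ring built as `HashRing(["node-a", "node-b", "node-c"])`, calling `route("BTC/USD")` always returns the same node until membership changes. When a fourth node joins, the only pairs that change owner are the ones that move to the new node, so the existing nodes never exchange order books with each other.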

3. State Persistence & High Availability| 3. 状态持久化与高可用

Each matching engine instance matches in memory, but every order it receives and every trade it generates must be appended sequentially to a local persistent log (e.g., using RocksDB or Chronicle Queue). I would also pair each primary matching node with a hot standby secondary node. The primary replicates its state to the secondary synchronously, so the secondary already holds every acknowledged order. If the primary fails, the system can switch traffic to the secondary right away and fail over within seconds.

每个撮合引擎实例虽然是在内存中进行撮合，但必须将每一笔接收到的订单和产生的交易都顺序地写入一个本地的持久化日志中。同时，我会为每一个主撮合节点都配备一个热备份的从节点。主节点会通过同步复制的方式，将它的状态实时地同步给从节点。一旦主节点宕机，系统可以立刻将流量切换到从节点上，实现秒级的故障转移。
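The write path described above (journal locally, replicate synchronously, then acknowledge) can be sketched as follows. This is a simplified illustration under assumptions: `Journal`, `Replica`, and `Primary` are hypothetical names, the log is a plain JSON-lines file rather than RocksDB or Chronicle Queue, and the standby is called in-process where a real system would use a network round trip with timeouts.

```python
import json
import os


class Journal:
    """Append-only log file with one JSON event per line."""

    def __init__(self, path):
        self.path = path
        self._file = open(path, "a", encoding="utf-8")

    def append(self, event):
        self._file.write(json.dumps(event) + "\n")
        self._file.flush()
        os.fsync(self._file.fileno())   # force the write to disk before returning

    def replay(self):
        """Read every event back in the order it was written (used for recovery)."""
        with open(self.path, encoding="utf-8") as f:
            return [json.loads(line) for line in f]

    def close(self):
        self._file.close()


class Replica:
    """A node that journals each event and then applies it to in-memory state."""

    def __init__(self, journal):
        self.journal = journal
        self.applied = []               # stand-in for the in-memory order books

    def apply(self, event):
        self.journal.append(event)
        self.applied.append(event)
        return True                     # acknowledgement back to the caller


class Primary(Replica):
    """Primary node: acknowledges an order only after the standby has it too."""

    def __init__(self, journal, standby):
        super().__init__(journal)
        self.standby = standby
        self.seq = 0

    def handle(self, order):
        self.seq += 1
        event = {"seq": self.seq, **order}
        self.apply(event)                       # 1. durable local write
        if not self.standby.apply(event):       # 2. synchronous replication
            raise RuntimeError("standby did not acknowledge")
        return event["seq"]                     # 3. only now acknowledge the client
```

Because the standby applies the same events in the same sequence, it can take over with identical state, and a restarted node can rebuild its state by calling `replay()` on its own journal. The cost of this design is that every order pays for a disk sync and a round trip to the standby before it is acknowledged.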

4. Cross-Shard Transactions| 4. 跨分片交易

The interviewer then asked a tricky follow-up question: "If a customer wants to place a combo order that spans multiple trading pairs, for example 'buy BTC/USD and sell ETH/USD simultaneously', how would your system handle it?"

面试官追问了一个很刁钻的问题：「如果一个客户想下一个跨多个交易对的组合订单，比如'同时买入BTC/USD和卖出ETH/USD'，你的系统该如何处理？」

This is a distributed transaction problem. I admitted that it is a very complex one and proposed a solution based on two-phase commit (2PC). A coordinator handles the combo order: it first splits the order into two sub-orders and sends them to the BTC/USD and ETH/USD matching nodes respectively. It then runs the two-phase commit protocol. In the first phase, each node votes on whether it can execute its sub-order. In the second phase, the coordinator tells both nodes to commit only if both voted yes, and to abort otherwise. The two sub-orders therefore either succeed together or fail together, and the system avoids a half-executed intermediate state.

这是一个分布式事务的问题。我承认这是一个非常复杂的问题，并提出了一个基于「两阶段提交（2PC）」的解决方案。这个组合订单会有一个「协调者」的角色。协调者会先把这个订单拆分成两个子订单，并分别发送给BTC/USD和ETH/USD的撮合节点。然后，它会执行一个两阶段提交的协议，来保证这两个子订单要么「同时成功」，要么「同时失败」，以避免出现「只成交了一半」的中间状态。
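The coordinator flow described above can be sketched as below. This is a bare-bones illustration of the two phases only, with hypothetical names (`Participant`, `Coordinator`, `can_fill`). It deliberately leaves out the parts that make 2PC hard in production, such as timeouts, a durable coordinator log, and recovery when the coordinator crashes between the two phases.

```python
class Participant:
    """Stand-in for a matching node that can reserve and then execute a sub-order."""

    def __init__(self, symbol, can_fill=True):
        self.symbol = symbol
        self.can_fill = can_fill    # assumption: whether this node can execute its leg
        self.state = {}             # txn_id -> "prepared" | "committed" | "aborted"

    def prepare(self, txn_id, leg):
        """Phase 1: vote yes (and reserve the sub-order) or vote no."""
        if not self.can_fill:
            self.state[txn_id] = "aborted"
            return False
        self.state[txn_id] = "prepared"
        return True

    def commit(self, txn_id):
        """Phase 2: execute a previously prepared sub-order."""
        if self.state.get(txn_id) != "prepared":
            raise RuntimeError(f"{self.symbol}: cannot commit unprepared txn {txn_id}")
        self.state[txn_id] = "committed"

    def abort(self, txn_id):
        """Phase 2: release a previously prepared sub-order."""
        self.state[txn_id] = "aborted"


class Coordinator:
    """Splits a combo order into legs and drives two-phase commit across nodes."""

    def __init__(self, participants):
        self.participants = participants    # symbol -> Participant

    def execute(self, txn_id, legs):
        prepared = []
        # Phase 1: collect votes; stop at the first "no".
        for leg in legs:
            node = self.participants[leg["symbol"]]
            if node.prepare(txn_id, leg):
                prepared.append(node)
            else:
                for p in prepared:          # roll back everyone who voted yes
                    p.abort(txn_id)
                return False
        # Phase 2: every participant voted yes, so commit all legs.
        for p in prepared:
            p.commit(txn_id)
        return True
```

In this sketch, a combo order whose ETH/USD leg cannot be filled leaves both participants in the `"aborted"` state, while a combo order whose legs both prepare successfully leaves both in `"committed"`. No execution path commits one leg and aborts the other.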

Key Takeaways| 面试心得

The interview left me with the impression that HSBC GBM sets a very high bar for SWE candidates in both the theory and the practice of distributed systems. You need a deep understanding of the CAP theorem, consensus protocols (such as Raft and Paxos), and common distributed architecture patterns. You have to think like a distributed systems expert to design a financial trading system that stays fast and highly available as it scales.

整个面试下来，感觉汇丰GBM的SWE对分布式系统的理论和实践要求非常高。你需要深刻地理解CAP理论、一致性协议（如Raft, Paxos）、和各种分布式架构模式。你需要像一个「分布式系统专家」一样，去设计一个在任何规模下都能保持高性能和高可用性的金融交易系统。