HSBC UK Data Analyst Interview: Customer Churn Prediction Case Study
汇丰银行英国DA面试:客户流失预警案例分析
2025 HSBC UK Data Analyst Interviewee
摘要 Summary
A practical Data Analyst interview experience from HSBC Wealth Management, featuring customer churn analysis and predictive modeling with business recommendations.
汇丰银行财富管理部门数据分析师面试实录,详解客户流失分析和预测建模及业务建议。
Case Background| 案例背景
The final-round case for the HSBC Wealth Management DA role was customer churn prediction. The interviewer gave me an anonymized customer dataset containing demographic information, portfolio holdings (AUM), transaction records for the past year, and a 'Churned' label. The task: 'Analyze the key drivers of customer churn and build a predictive model to identify customers at high risk of churning.'
汇丰财富管理部门DA的终面Case,是关于客户流失预警的。面试官给了我一个匿名的客户数据集,包含了客户的人口统计信息、资产组合(AUM)、过去一年的交易记录、以及一个「是否流失」的标签。任务是:「分析导致客户流失的关键驱动因素,并构建一个预测模型来识别那些有高流失风险的客户。」
My analysis process was divided into two main steps:
我的分析过程主要分为两步:
Step 1: Exploratory Data Analysis (EDA) & Insight Generation| 第一步:探索性数据分析与洞察提炼
Before modeling, I first used Python's matplotlib and seaborn libraries to conduct a series of exploratory analyses, trying to find differences between churned and non-churned customers.
在建模之前,我先用Python的matplotlib和seaborn库,对数据进行了一系列的探索性分析,试图找到流失客户和非流失客户之间的差异。
I found some interesting patterns:
我发现了一些有趣的模式:
Customer Age: The average age of churned customers (45) was well below that of non-churned customers (58). This may mean that our appeal to younger high-net-worth customers is insufficient.
客户年龄:流失客户的平均年龄(45岁),显著低于非流失客户(58岁)。这可能意味着,我们对年轻一代高净值客户的吸引力不足。
RM Interaction: In the 3 months before churning, churned customers averaged only 1.2 interactions (calls, emails, meetings) with their Relationship Manager (RM), while non-churned customers averaged 4.5. This suggests that RM interaction frequency may be a key retention factor.
客户经理互动:在流失前的3个月内,流失客户与他们的客户经理的平均互动次数只有1.2次,而非流失客户是4.5次。这表明,RM的互动频率可能是一个关键的维系因素。
Portfolio Performance: Churned customers' portfolios returned an average of -2.5% over the past year, versus +1.8% for non-churned customers. Although the overall market was weak, churned customers' portfolios did noticeably worse, ending the year at a loss rather than a small gain.
投资组合表现:流失客户的投资组合在过去一年的平均回报率为-2.5%,而非流失客户是+1.8%。虽然市场整体表现不佳,但流失客户的亏损似乎更严重。
Single-Product Holdings: Over 70% of churned customers held only one of the bank's products (usually an investment account), while most non-churned customers held several (investments, savings, credit cards, etc.).
产品单一性:超过70%的流失客户只持有一种我行的产品(通常是投资账户),而大部分非流失客户都同时拥有投资、储蓄、和信用卡等多种产品。
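A group-by-group comparison like the one above can be sketched in a few lines. This is a minimal illustration using only the standard library, not the code used in the interview: the four toy records, the field names (`rm_interactions_3m`, `return_12m`, `n_products`) and both helper functions are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy records; the interview dataset itself is not reproduced here.
customers = [
    {"age": 42, "rm_interactions_3m": 1, "return_12m": -3.0, "n_products": 1, "churned": True},
    {"age": 48, "rm_interactions_3m": 2, "return_12m": -2.0, "n_products": 1, "churned": True},
    {"age": 55, "rm_interactions_3m": 4, "return_12m": 1.5, "n_products": 3, "churned": False},
    {"age": 61, "rm_interactions_3m": 5, "return_12m": 2.5, "n_products": 2, "churned": False},
]

def group_means(rows, label="churned"):
    """Average every non-label column separately for each value of `label`."""
    grouped = defaultdict(lambda: defaultdict(list))
    for row in rows:
        for column, value in row.items():
            if column != label:
                grouped[row[label]][column].append(value)
    return {
        group: {column: mean(values) for column, values in columns.items()}
        for group, columns in grouped.items()
    }

def single_product_share(rows, churned=True):
    """Fraction of customers in one group who hold exactly one product."""
    group = [r for r in rows if r["churned"] is churned]
    return sum(r["n_products"] == 1 for r in group) / len(group)

summary = group_means(customers)
for group, stats in summary.items():
    print("churned" if group else "retained", stats)
print("single-product share among churned:", single_product_share(customers))
```

With pandas, the same comparison is usually written as `df.groupby("churned").mean(numeric_only=True)`; the plain-Python version is shown only to keep the sketch dependency-free.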
Step 2: Building Predictive Model & Business Recommendations| 第二步:构建预测模型与提供业务建议
Based on my findings, I built a Logistic Regression model to predict customer churn probability. I chose logistic regression because its results are very easy to explain to business stakeholders.
基于以上的发现,我构建了一个逻辑回归模型来预测客户的流失概率。我选择逻辑回归是因为它的结果非常易于向业务方解释。
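For readers less familiar with the model, logistic regression expresses the log-odds of churn as a linear function of the features. The notation below is introduced here for illustration and is not taken from the interview: $x_j$ are the features, $\beta_j$ their coefficients, and $\beta_0$ an intercept.

```latex
\[
P(\text{churn} \mid x) = \frac{1}{1 + e^{-\left(\beta_0 + \sum_{j} \beta_j x_j\right)}},
\qquad
\log \frac{P(\text{churn} \mid x)}{1 - P(\text{churn} \mid x)} = \beta_0 + \sum_{j} \beta_j x_j .
\]
```

A negative $\beta_j$ means that raising $x_j$ lowers the log-odds of churn, with the other features held fixed. This is what makes each coefficient easy to read out to a business audience.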
Key Model Features (and their coefficients)| 模型的关键特征(及其系数)
RM_interaction_count_last_3m (coefficient: -0.85): The more RM interactions, the lower the churn probability.
RM互动次数(系数: -0.85):RM互动次数越多,流失概率越低。
portfolio_return_last_12m (coefficient: -0.62): The lower the portfolio return, the higher the churn probability.
投资组合回报率(系数: -0.62):投资组合回报率越低,流失概率越高。
number_of_products (coefficient: -0.55): The more product types held, the lower the churn probability.
持有产品种类(系数: -0.55):持有的产品种类越多,流失概率越低。
customer_age (coefficient: -0.31): The younger the customer, the higher the churn probability.
客户年龄(系数: -0.31):年龄越小,流失概率越高。
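One common way to present coefficients like these to stakeholders is as odds ratios, obtained by exponentiating each coefficient. The sketch below does only that conversion. Whether a "one-unit increase" means one raw unit or one standard deviation depends on how the features were scaled, which the write-up does not state, so treat the interpretation as an assumption.

```python
import math

# Coefficients as reported in the write-up above.
coefficients = {
    "RM_interaction_count_last_3m": -0.85,
    "portfolio_return_last_12m": -0.62,
    "number_of_products": -0.55,
    "customer_age": -0.31,
}

def odds_ratios(coefs):
    """Convert log-odds coefficients into multiplicative odds ratios."""
    return {name: math.exp(beta) for name, beta in coefs.items()}

ratios = odds_ratios(coefficients)
for name, ratio in sorted(ratios.items(), key=lambda item: item[1]):
    # (ratio - 1) is the proportional change in churn odds per one-unit increase.
    print(f"{name}: odds ratio {ratio:.2f}, odds change {100 * (ratio - 1):+.0f}%")
```

An odds ratio below 1 means the feature is associated with lower churn odds. For example, exp(-0.85) is about 0.43, so each additional unit of RM interaction is associated with roughly 57% lower odds of churn, holding the other features fixed.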
Business Recommendations| 业务建议
In my final presentation, I didn't dwell too much on the model's AUC (though it reached 0.78). Instead, I focused on how to translate the model's insights into concrete business actions:
在最后的Presentation中,我没有过多地纠结于模型的AUC(虽然也达到了0.78),而是把重点放在了如何把模型的洞察转化成具体的业务行动上:
Establish Early Warning System: Based on the model, we can generate a daily list of 'high churn risk customers' (e.g., predicted churn probability > 70%). This list should be automatically pushed to the corresponding RMs.
建立预警系统:基于模型,我们可以每天跑出一个「高流失风险客户」的名单(比如预测流失概率 > 70%)。这个名单应该被自动地推送给相应的RM。
Empower Relationship Managers: When RMs receive alerts, the system should simultaneously provide a '360-degree view' of the customer, including recent portfolio performance and service interaction records. RMs should be required to proactively contact these customers within 24 hours.
赋能客户经理:当RM收到预警时,系统应该同时提供该客户的「360度视图」,包括他最近的投资组合表现和服务互动记录。RM应该被要求在24小时内主动联系这些客户。
Optimize Cross-Selling: For 'high-risk' customers holding only one product, we should design incentive programs that encourage them to use our other services. For example, send investment customers a preferential credit card application offer.
优化产品交叉销售:对于那些只持有一种产品的「高风险」客户,我们应该设计一些激励方案来鼓励他们使用我们的其他服务。比如,为我们的投资客户提供一个优惠的信用卡申请链接。
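The early-warning list and the cross-sell flag described above could be produced along the following lines. This is a sketch under stated assumptions rather than a production design: the coefficients come from the write-up, but the intercept, the customer records, the `rm_id` and `products_held` fields, and the assumption that features are standardized (z-scores) are all hypothetical.

```python
import math
from collections import defaultdict

# Coefficients from the write-up; everything else below is hypothetical.
COEFFICIENTS = {
    "RM_interaction_count_last_3m": -0.85,
    "portfolio_return_last_12m": -0.62,
    "number_of_products": -0.55,
    "customer_age": -0.31,
}
INTERCEPT = -1.0  # hypothetical; the write-up does not report an intercept

def churn_probability(features, coefs=COEFFICIENTS, intercept=INTERCEPT):
    """Logistic score, assuming features are standardized as in training."""
    z = intercept + sum(coefs[name] * features[name] for name in coefs)
    return 1.0 / (1.0 + math.exp(-z))

def daily_alerts(customers, threshold=0.70):
    """Group customers above the risk threshold by RM, highest risk first."""
    alerts = defaultdict(list)
    for c in customers:
        p = churn_probability(c["features"])
        if p > threshold:
            alerts[c["rm_id"]].append({
                "customer_id": c["customer_id"],
                "churn_probability": round(p, 3),
                # Single-product customers are flagged for a cross-sell offer.
                "cross_sell_candidate": c["products_held"] == 1,
            })
    for items in alerts.values():
        items.sort(key=lambda a: a["churn_probability"], reverse=True)
    return dict(alerts)

def z(rm, ret, prod, age):
    """Build a standardized feature dict in coefficient order."""
    return dict(zip(COEFFICIENTS, (rm, ret, prod, age)))

customers = [
    {"customer_id": "C001", "rm_id": "RM-01", "products_held": 1, "features": z(-1.2, -1.0, -0.8, -0.9)},
    {"customer_id": "C002", "rm_id": "RM-02", "products_held": 3, "features": z(1.0, 1.0, 1.0, 1.0)},
    {"customer_id": "C003", "rm_id": "RM-01", "products_held": 2, "features": z(0.0, 0.0, 0.0, 0.0)},
    {"customer_id": "C004", "rm_id": "RM-02", "products_held": 2, "features": z(-2.0, -0.5, 0.3, -1.0)},
]

alerts = daily_alerts(customers)
for rm_id, items in alerts.items():
    print(rm_id, items)
```

In practice the 70% threshold would be tuned against RM capacity, since a lower cutoff catches more churners but produces a longer call list each day.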
Key Takeaways| 面试心得
Throughout the interview, it was clear that the HSBC DA role strongly emphasizes 'Business Impact.' Your analysis can't stop at 'discovering problems'; more importantly, you must be able to 'solve problems.' You need to use data to drive business decisions and ultimately deliver measurable value to the company.
整个面试下来,感觉汇丰的DA非常强调「商业影响力」。你的分析不能只停留在「发现问题」的层面,更重要的是要能够「解决问题」。你需要能够用数据去驱动业务决策,并最终为公司带来可衡量的价值。