Skip to content

3716. Find Churn Risk Customers

Description

Table: subscription_events

+------------------+---------+
| Column Name      | Type    |
+------------------+---------+
| event_id         | int     |
| user_id          | int     |
| event_date       | date    |
| event_type       | varchar |
| plan_name        | varchar |
| monthly_amount   | decimal |
+------------------+---------+
event_id is the unique identifier for this table.
event_type can be start, upgrade, downgrade, or cancel.
plan_name can be basic, standard, premium, or NULL (when event_type is cancel).
monthly_amount represents the monthly subscription cost after this event.
For cancel events, monthly_amount is 0.

Write a solution to Find Churn Risk Customers - users who show warning signs before churning. A user is considered churn risk customer if they meet ALL the following criteria:

  • Currently have an active subscription (their last event is not cancel).
  • Have performed at least one downgrade in their subscription history.
  • Their current plan revenue is less than 50% of their historical maximum plan revenue.
  • Have been a subscriber for at least 60 days.

Return the result table ordered by days_as_subscriber in descending order, then by user_id in ascending order.

The result format is in the following example.

 

Example:

Input:

subscription_events table:

+----------+---------+------------+------------+-----------+----------------+
| event_id | user_id | event_date | event_type | plan_name | monthly_amount |
+----------+---------+------------+------------+-----------+----------------+
| 1        | 501     | 2024-01-01 | start      | premium   | 29.99          |
| 2        | 501     | 2024-02-15 | downgrade  | standard  | 19.99          |
| 3        | 501     | 2024-03-20 | downgrade  | basic     | 9.99           |
| 4        | 502     | 2024-01-05 | start      | standard  | 19.99          |
| 5        | 502     | 2024-02-10 | upgrade    | premium   | 29.99          |
| 6        | 502     | 2024-03-15 | downgrade  | basic     | 9.99           |
| 7        | 503     | 2024-01-10 | start      | basic     | 9.99           |
| 8        | 503     | 2024-02-20 | upgrade    | standard  | 19.99          |
| 9        | 503     | 2024-03-25 | upgrade    | premium   | 29.99          |
| 10       | 504     | 2024-01-15 | start      | premium   | 29.99          |
| 11       | 504     | 2024-03-01 | downgrade  | standard  | 19.99          |
| 12       | 504     | 2024-03-30 | cancel     | NULL      | 0.00           |
| 13       | 505     | 2024-02-01 | start      | basic     | 9.99           |
| 14       | 505     | 2024-02-28 | upgrade    | standard  | 19.99          |
| 15       | 506     | 2024-01-20 | start      | premium   | 29.99          |
| 16       | 506     | 2024-03-10 | downgrade  | basic     | 9.99           |
+----------+---------+------------+------------+-----------+----------------+

Output:

+----------+--------------+------------------------+-----------------------+--------------------+
| user_id  | current_plan | current_monthly_amount | max_historical_amount | days_as_subscriber |
+----------+--------------+------------------------+-----------------------+--------------------+
| 501      | basic        | 9.99                   | 29.99                 | 79                 |
| 502      | basic        | 9.99                   | 29.99                 | 69                 |
+----------+--------------+------------------------+-----------------------+--------------------+

Explanation:

  • User 501:
    • Currently active: Last event is downgrade to basic (not cancelled) 
    • Has downgrades: Yes, 2 downgrades in history 
    • Current revenue (9.99) vs max (29.99): 9.99/29.99 = 33.3% (less than 50%) 
    • Days as subscriber: Jan 1 to Mar 20 = 79 days (at least 60) 
    • Result: Churn Risk Customer
  • User 502:
    • Currently active: Last event is downgrade to basic (not cancelled) 
    • Has downgrades: Yes, 1 downgrade in history 
    • Current revenue (9.99) vs max (29.99): 9.99/29.99 = 33.3% (less than 50%) 
    • Days as subscriber: Jan 5 to Mar 15 = 70 days (at least 60) 
    • Result: Churn Risk Customer
  • User 503:
    • Currently active: Last event is upgrade to premium (not cancelled) 
    • Has downgrades: No downgrades in history 
    • Result: Not at-risk (no downgrade history)
  • User 504:
    • Currently active: Last event is cancel
    • Result: Not at-risk (subscription cancelled)
  • User 505:
    • Currently active: Last event is 'upgrade' to standard (not cancelled) 
    • Has downgrades: No downgrades in history 
    • Result: Not at-risk (no downgrade history)
  • User 506:
    • Currently active: Last event is downgrade to basic (not cancelled) 
    • Has downgrades: Yes, 1 downgrade in history 
    • Current revenue (9.99) vs max (29.99): 9.99/29.99 = 33.3% (less than 50%) 
    • Days as subscriber: Jan 20 to Mar 10 = 50 days (less than 60) 
    • Result: Not at-risk (insufficient subscription duration)

Result table is ordered by days_as_subscriber DESC, then user_id ASC.

Note: days_as_subscriber is calculated from the first event date to the last event date for each user.

Solutions

Solution 1: Grouping Statistics + Join + Conditional Filtering

We first use a window function to get the last record for each user sorted by event date and event ID in descending order, obtaining the latest event information for each user. Then, we group and aggregate the subscription history information for each user, including the subscription start date, last event date, historical maximum subscription fee, and the number of downgrade events. Finally, we join the latest event information with the historical statistics and filter according to the conditions specified in the problem to get the list of customers at risk of churn.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
WITH
    user_with_last_event AS (
        SELECT
            s.*,
            ROW_NUMBER() OVER (
                PARTITION BY user_id
                ORDER BY event_date DESC, event_id DESC
            ) AS rn
        FROM subscription_events s
    ),
    user_history AS (
        SELECT
            user_id,
            MIN(event_date) AS start_date,
            MAX(event_date) AS last_event_date,
            MAX(monthly_amount) AS max_historical_amount,
            SUM(
                CASE
                    WHEN event_type = 'downgrade' THEN 1
                    ELSE 0
                END
            ) AS downgrade_count
        FROM subscription_events
        GROUP BY user_id
    ),
    latest_event AS (
        SELECT
            user_id,
            event_type AS last_event_type,
            plan_name AS current_plan,
            monthly_amount AS current_monthly_amount
        FROM user_with_last_event
        WHERE rn = 1
    )
SELECT
    l.user_id,
    l.current_plan,
    l.current_monthly_amount,
    h.max_historical_amount,
    DATEDIFF(h.last_event_date, h.start_date) AS days_as_subscriber
FROM
    latest_event l
    JOIN user_history h ON l.user_id = h.user_id
WHERE
    l.last_event_type <> 'cancel'
    AND h.downgrade_count >= 1
    AND l.current_monthly_amount < 0.5 * h.max_historical_amount
    AND DATEDIFF(h.last_event_date, h.start_date) >= 60
ORDER BY days_as_subscriber DESC, l.user_id ASC;
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
import pandas as pd


def find_churn_risk_customers(subscription_events: pd.DataFrame) -> pd.DataFrame:
    subscription_events["event_date"] = pd.to_datetime(
        subscription_events["event_date"]
    )
    subscription_events = subscription_events.sort_values(
        ["user_id", "event_date", "event_id"]
    )
    last_events = (
        subscription_events.groupby("user_id")
        .tail(1)[["user_id", "event_type", "plan_name", "monthly_amount"]]
        .rename(
            columns={
                "event_type": "last_event_type",
                "plan_name": "current_plan",
                "monthly_amount": "current_monthly_amount",
            }
        )
    )

    agg_df = (
        subscription_events.groupby("user_id")
        .agg(
            start_date=("event_date", "min"),
            last_event_date=("event_date", "max"),
            max_historical_amount=("monthly_amount", "max"),
            downgrade_count=("event_type", lambda x: (x == "downgrade").sum()),
        )
        .reset_index()
    )

    merged = pd.merge(agg_df, last_events, on="user_id", how="inner")
    merged["days_as_subscriber"] = (
        merged["last_event_date"] - merged["start_date"]
    ).dt.days

    result = merged[
        (merged["last_event_type"] != "cancel")
        & (merged["downgrade_count"] >= 1)
        & (merged["current_monthly_amount"] < 0.5 * merged["max_historical_amount"])
        & (merged["days_as_subscriber"] >= 60)
    ][
        [
            "user_id",
            "current_plan",
            "current_monthly_amount",
            "max_historical_amount",
            "days_as_subscriber",
        ]
    ]

    result = result.sort_values(
        ["days_as_subscriber", "user_id"], ascending=[False, True]
    ).reset_index(drop=True)
    return result

Comments