Skip to content

3808. Find Emotionally Consistent Users

Description

Table: reactions

+--------------+---------+
| Column Name  | Type    |
+--------------+---------+
| user_id      | int     |
| content_id   | int     |
| reaction     | varchar |
+--------------+---------+
(user_id, content_id) is the primary key (unique value) for this table.
Each row represents a reaction given by a user to a piece of content.

Write a solution to identify emotionally consistent users based on the following requirements:

  • For each user, count the total number of reactions they have given.
  • Only include users who have reacted to at least 5 different content items.
  • A user is considered emotionally consistent if at least 60% of their reactions are of the same type.

Return the result table ordered by reaction_ratio in descending order and then by user_id in ascending order.

Note:

  • reaction_ratio should be rounded to 2 decimal places

The result format is in the following example.

 

Example:

Input:

reactions table:

+---------+------------+----------+
| user_id | content_id | reaction |
+---------+------------+----------+
| 1       | 101        | like     |
| 1       | 102        | like     |
| 1       | 103        | like     |
| 1       | 104        | wow      |
| 1       | 105        | like     |
| 2       | 201        | like     |
| 2       | 202        | wow      |
| 2       | 203        | sad      |
| 2       | 204        | like     |
| 2       | 205        | wow      |
| 3       | 301        | love     |
| 3       | 302        | love     |
| 3       | 303        | love     |
| 3       | 304        | love     |
| 3       | 305        | love     |
+---------+------------+----------+

Output:

+---------+-------------------+----------------+
| user_id | dominant_reaction | reaction_ratio |
+---------+-------------------+----------------+
| 3       | love              | 1.00           |
| 1       | like              | 0.80           |
+---------+-------------------+----------------+

Explanation:

  • User 1:
    • Total reactions = 5
    • like appears 4 times
    • reaction_ratio = 4 / 5 = 0.80
    • Meets the 60% consistency requirement
  • User 2:
    • Total reactions = 5
    • Most frequent reaction appears only 2 times
    • reaction_ratio = 2 / 5 = 0.40
    • Does not meet the consistency requirement
  • User 3:
    • Total reactions = 5
    • 'love' appears 5 times
    • reaction_ratio = 5 / 5 = 1.00
    • Meets the consistency requirement

The Results table is ordered by reaction_ratio in descending order, then by user_id in ascending order.

Solutions

Solution 1: Grouping Statistics + Join Query

We first count the number of each reaction for every user and record it in a temporary table \(t\). Then, based on the temporary table \(t\), we calculate the maximum reaction count and total reaction count for each user, compute the reaction ratio, and filter out the users who meet the conditions, recording them in a temporary table \(s\). Finally, we join the temporary tables \(s\) and \(t\) to find the dominant reaction for each user and sort the results as required.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
# Write your MySQL query statement below
WITH
    t AS (
        SELECT
            user_id,
            reaction,
            COUNT(1) cnt
        FROM reactions
        GROUP BY 1, 2
    ),
    s AS (
        SELECT
            user_id,
            MAX(cnt) mx_cnt,
            ROUND(MAX(cnt) / SUM(cnt), 2) reaction_ratio
        FROM t
        GROUP BY 1
        HAVING reaction_ratio >= 0.60 AND SUM(cnt) >= 5
    )
SELECT user_id, reaction dominant_reaction, reaction_ratio
FROM
    s
    JOIN t USING (user_id)
WHERE cnt = mx_cnt
ORDER BY 3 DESC, 1;
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
import pandas as pd
from decimal import Decimal, ROUND_HALF_UP


def find_emotionally_consistent_users(reactions: pd.DataFrame) -> pd.DataFrame:
    t = reactions.groupby(["user_id", "reaction"]).size().reset_index(name="cnt")

    s = (
        t.groupby("user_id")
        .agg(mx_cnt=("cnt", "max"), total_cnt=("cnt", "sum"))
        .reset_index()
    )

    s["reaction_ratio"] = (
        s["mx_cnt"]
        .div(s["total_cnt"])
        .apply(
            lambda x: float(
                Decimal(str(x)).quantize(Decimal("0.00"), rounding=ROUND_HALF_UP)
            )
        )
    )

    s = s[(s["reaction_ratio"] >= 0.60) & (s["total_cnt"] >= 5)]

    merged = pd.merge(
        s[["user_id", "mx_cnt", "reaction_ratio"]],
        t,
        left_on=["user_id", "mx_cnt"],
        right_on=["user_id", "cnt"],
    )

    result = (
        merged[["user_id", "reaction", "reaction_ratio"]]
        .rename(columns={"reaction": "dominant_reaction"})
        .sort_values(by=["reaction_ratio", "user_id"], ascending=[False, True])
        .reset_index(drop=True)
    )

    return result

Comments