3601. Find Drivers with Improved Fuel Efficiency

Description

Table: drivers

+-------------+---------+
| Column Name | Type    |
+-------------+---------+
| driver_id   | int     |
| driver_name | varchar |
+-------------+---------+
driver_id is the unique identifier for this table.
Each row contains information about a driver.

Table: trips

+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| trip_id       | int     |
| driver_id     | int     |
| trip_date     | date    |
| distance_km   | decimal |
| fuel_consumed | decimal |
+---------------+---------+
trip_id is the unique identifier for this table.
Each row represents a trip made by a driver, including the distance traveled and fuel consumed for that trip.

Write a solution to find drivers whose fuel efficiency has improved by comparing their average fuel efficiency in the first half of the year with the second half of the year.

Calculate fuel efficiency as distance_km / fuel_consumed for each trip
First half: January to June, Second half: July to December
Only include drivers who have trips in both halves of the year
Calculate the efficiency improvement as (second_half_avg - first_half_avg)
Round all results to 2 decimal places

Return the result table ordered by efficiency improvement in descending order, then by driver name in ascending order.

The result format is in the following example.

Example:

Input:

drivers table:

+-----------+---------------+
| driver_id | driver_name   |
+-----------+---------------+
| 1         | Alice Johnson |
| 2         | Bob Smith     |
| 3         | Carol Davis   |
| 4         | David Wilson  |
| 5         | Emma Brown    |
+-----------+---------------+

trips table:

+---------+-----------+------------+-------------+---------------+
| trip_id | driver_id | trip_date  | distance_km | fuel_consumed |
+---------+-----------+------------+-------------+---------------+
| 1       | 1         | 2023-02-15 | 120.5       | 10.2          |
| 2       | 1         | 2023-03-20 | 200.0       | 16.5          |
| 3       | 1         | 2023-08-10 | 150.0       | 11.0          |
| 4       | 1         | 2023-09-25 | 180.0       | 12.5          |
| 5       | 2         | 2023-01-10 | 100.0       | 9.0           |
| 6       | 2         | 2023-04-15 | 250.0       | 22.0          |
| 7       | 2         | 2023-10-05 | 200.0       | 15.0          |
| 8       | 3         | 2023-03-12 | 80.0        | 8.5           |
| 9       | 3         | 2023-05-18 | 90.0        | 9.2           |
| 10      | 4         | 2023-07-22 | 160.0       | 12.8          |
| 11      | 4         | 2023-11-30 | 140.0       | 11.0          |
| 12      | 5         | 2023-02-28 | 110.0       | 11.5          |
+---------+-----------+------------+-------------+---------------+

Output:

+-----------+---------------+------------------+-------------------+------------------------+
| driver_id | driver_name   | first_half_avg   | second_half_avg   | efficiency_improvement |
+-----------+---------------+------------------+-------------------+------------------------+
| 2         | Bob Smith     | 11.24            | 13.33             | 2.10                   |
| 1         | Alice Johnson | 11.97            | 14.02             | 2.05                   |
+-----------+---------------+------------------+-------------------+------------------------+

Explanation:

Alice Johnson (driver_id = 1):
- First half trips (Jan-Jun): Feb 15 (120.5/10.2 = 11.81), Mar 20 (200.0/16.5 = 12.12)
- First half average efficiency: (11.81 + 12.12) / 2 = 11.97
- Second half trips (Jul-Dec): Aug 10 (150.0/11.0 = 13.64), Sep 25 (180.0/12.5 = 14.40)
- Second half average efficiency: (13.64 + 14.40) / 2 = 14.02
- Efficiency improvement: 14.02 - 11.97 = 2.05
Bob Smith (driver_id = 2):
- First half trips: Jan 10 (100.0/9.0 = 11.11), Apr 15 (250.0/22.0 = 11.36)
- First half average efficiency: (11.11 + 11.36) / 2 = 11.24
- Second half trips: Oct 5 (200.0/15.0 = 13.33)
- Second half average efficiency: 13.33
- Efficiency improvement: 13.33 - 11.24 = 2.10 (rounded to 2 decimal places)
Drivers not included:
- Carol Davis (driver_id = 3): Only has trips in first half (Mar, May)
- David Wilson (driver_id = 4): Only has trips in second half (Jul, Nov)
- Emma Brown (driver_id = 5): Only has trips in first half (Feb)

The output table is ordered by efficiency improvement in descending order then by name in ascending order.

Solutions

Solution 1: Group Aggregation + Join Query

First, we perform group aggregation on the trips table to calculate the average fuel efficiency for each driver in the first half and the second half of the year.

Then, we join the results with the drivers table, filter out the drivers whose fuel efficiency has improved, and calculate the amount of improvement.

MySQLPandas

# Write your MySQL query statement below
WITH
    T AS (
        SELECT
            driver_id,
            AVG(distance_km / fuel_consumed) half_avg,
            CASE
                WHEN MONTH(trip_date) <= 6 THEN 1
                ELSE 2
            END half
        FROM trips
        GROUP BY driver_id, half
    )
SELECT
    t1.driver_id,
    d.driver_name,
    ROUND(t1.half_avg, 2) first_half_avg,
    ROUND(t2.half_avg, 2) second_half_avg,
    ROUND(t2.half_avg - t1.half_avg, 2) efficiency_improvement
FROM
    T t1
    JOIN T t2 ON t1.driver_id = t2.driver_id AND t1.half < t2.half AND t1.half_avg < t2.half_avg
    JOIN drivers d ON t1.driver_id = d.driver_id
ORDER BY efficiency_improvement DESC, d.driver_name;

import pandas as pd


def find_improved_efficiency_drivers(
    drivers: pd.DataFrame, trips: pd.DataFrame
) -> pd.DataFrame:
    trips = trips.copy()
    trips["trip_date"] = pd.to_datetime(trips["trip_date"])
    trips["half"] = trips["trip_date"].dt.month.apply(lambda m: 1 if m <= 6 else 2)
    trips["efficiency"] = trips["distance_km"] / trips["fuel_consumed"]
    half_avg = (
        trips.groupby(["driver_id", "half"])["efficiency"]
        .mean()
        .reset_index(name="half_avg")
    )
    pivot = half_avg.pivot(index="driver_id", columns="half", values="half_avg").rename(
        columns={1: "first_half_avg", 2: "second_half_avg"}
    )
    pivot = pivot.dropna()
    pivot = pivot[pivot["second_half_avg"] > pivot["first_half_avg"]]
    pivot["efficiency_improvement"] = (
        pivot["second_half_avg"] - pivot["first_half_avg"]
    ).round(2)
    pivot["first_half_avg"] = pivot["first_half_avg"].round(2)
    pivot["second_half_avg"] = pivot["second_half_avg"].round(2)
    result = pivot.reset_index().merge(drivers, on="driver_id")
    result = result.sort_values(
        by=["efficiency_improvement", "driver_name"], ascending=[False, True]
    )
    return result[
        [
            "driver_id",
            "driver_name",
            "first_half_avg",
            "second_half_avg",
            "efficiency_improvement",
        ]
    ]

3601. Find Drivers with Improved Fuel Efficiency

Description

Solutions

Solution 1: Group Aggregation + Join Query

Comments