游牧经理人
Nomadic Manager
题目详情
在纽约市,有一群特别喜欢旅行的对冲基金经理,他们经常为了商务会议和其他管理事务在各州之间飞来飞去。
给定一张这些经理人乘坐航班的表,找出每一对城市之间飞行时长第二长的那趟航班。结果按 flight id 升序排列。
注意:对于任意两个城市 A 和 B,`(takeoff_location=A, arrival_location=B)` 和 `(takeoff_location=B, arrival_location=A)` 视为同一对城市。
示例输出
| id | takeoff_location | arrival_location | flight_start | flight_end |
|---|---|---|---|---|
| 1 | New York City | Boston | 2023-05-01T12:00:00 | 2023-05-01T16:00:00 |
| 2 | Seattle | San Francisco | 2023-05-03T1:00:00 | 2023-05-05T6:00:00 |
| 3 | Orlando | Boston | 2023-05-05T18:00:00 | 2023-05-05T22:00:00 |
输入表
`flights`
| Column Name | Data Type |
|---|---|
| id | bigint |
| takeoff_location | text |
| arrival_location | text |
| plane_id | bigint |
| flight_start | datetime |
| fligth_end | datetime |
In New York City, there is a select group of hedge fund managers that cherish any opportunity they get to travel. As a result, they often fly from state to state to conduct business meetings and other managerial activities.
Given a table of flights that the managers took, find the second flight with the largest travel duration between each pair of cities. Order the flights by the flight id in ascending order.
Note: For any two cities A and B, `(takeoff_location=A, arrival_location=B)` and `(takeoff_location=B, arrival_location=A)` are counted as the same pair of cities.
Example Output
| id | takeoff_location | arrival_location | flight_start | flight_end |
|---|---|---|---|---|
| 1 | New York City | Boston | 2023-05-01T12:00:00 | 2023-05-01T16:00:00 |
| 2 | Seattle | San Francisco | 2023-05-03T1:00:00 | 2023-05-05T6:00:00 |
| 3 | Orlando | Boston | 2023-05-05T18:00:00 | 2023-05-05T22:00:00 |
Input Tables
`flights`
| Column Name | Data Type |
|---|---|
| id | bigint |
| takeoff_location | text |
| arrival_location | text |
| plane_id | bigint |
| flight_start | datetime |
| fligth_end | datetime |
解析
解这道题的第一步,是先计算每趟航班的飞行时长,并为每条航班记录创建一个 key。创建这个 key 的目的是把发生在同一对城市之间的航班归为同一组,而不区分哪座城市是起飞地、哪座城市是到达地。比如,我们希望 `takeoff = New York`、`arrival = Boston` 和 `arrival = New York`、`takeoff = Boston` 被视为同一对城市。实现这个目的的 key,可以把起飞地和到达地中按字母序靠前的城市放在前面,另一个放在后面,再拼接起来。
现在我们已经有了每趟航班的飞行时长,以及标识某对城市之间航班的 key,就可以基于飞行时长对每组城市对中的航班进行排名。由于我们需要的是每对城市之间第二长的航班,所以只需筛选排名等于 2 的记录。
WITH flight_details AS (
SELECT
f.*,
EXTRACT(EPOCH FROM (f.flight_end - f.flight_start)) AS flight_duration,
CASE
WHEN f.takeoff_location > f.arrival_location
THEN f.takeoff_location || '-' || f.arrival_location
ELSE f.arrival_location || '-' || f.takeoff_location
END AS flight_key
FROM
flights f
), flight_duration_rankings AS (
SELECT
fd.*,
RANK() OVER (PARTITION BY fd.flight_key ORDER BY fd.flight_duration DESC) AS rnk
FROM
flight_details fd
)
SELECT
fdr.id,
fdr.arrival_location,
fdr.takeoff_location,
fdr.flight_start,
fdr.flight_end
FROM
flight_duration_rankings fdr
WHERE
rnk = 2
ORDER BY
fdr.id ASCOriginal Explanation
The first step in solving this question involves calculating the duration of each flight and creating a key that will be associated with each flight. The rationale for creating the key is to group flights that take place between pairs of cities as one pair, regardless of which city is the arrival city and which city is the takeoff city. For example, we want to count `takeoff = New York`, `arrival = Boston` to be the same as `arrival = New York`, `takeoff = Boston`. The key that accomplishes this just combines the takeoff and arrival cities by placing the city that comes first alphabetically at the start of the key and other city at the end of the key.
Now that we have each flight duration and a key that identifies flights between a certain pair of cities, we can assign a ranking to each flight pair based on the magnitude of its flight duration. Since we want the second longest flight between each pair of cities we filter for the records where the ranking equals 2.
WITH flight_details AS (
SELECT
f.*,
EXTRACT(EPOCH FROM (f.flight_end - f.flight_start)) AS flight_duration,
CASE
WHEN f.takeoff_location > f.arrival_location
THEN f.takeoff_location || '-' || f.arrival_location
ELSE f.arrival_location || '-' || f.takeoff_location
END AS flight_key
FROM
flights f
), flight_duration_rankings AS (
SELECT
fd.*,
RANK() OVER (PARTITION BY fd.flight_key ORDER BY fd.flight_duration DESC) AS rnk
FROM
flight_details fd
)
SELECT
fdr.id,
fdr.arrival_location,
fdr.takeoff_location,
fdr.flight_start,
fdr.flight_end
FROM
flight_duration_rankings fdr
WHERE
rnk = 2
ORDER BY
fdr.id ASC