返回题库

增长型基金

Growing Fund

专题
Data Wrangling / 数据处理
难度
L4

题目详情

一家数十亿美元规模的对冲基金有许多子基金,每只基金都有自己的投资逻辑。一些投资者会根据自己对基金投资逻辑有效性的判断,把资金配置到特定基金。

根据下面的表,对每只基金找出其在 2023 年每月新增投资者人数的环比增幅最大的一次。输出基金名称、增幅最大的月份、上个月的投资者人数、当月投资者人数,以及百分比变化。结果按投资者增幅百分比降序排列;若相同,则按基金名称升序排列。

指南

  1. 对于一月,新增投资者人数的百分比变化应为 null。
  2. 如果某只基金有两个月的百分比变化完全相同,选择较早的月份。
  3. 百分比变化四舍五入到小数点后两位。

示例输出

fund_name month curr_mon_investor_cnt prev_mon_investor_cnt perc_change
Alpha Fund Feb 13 4 225%
Beta Fund Mar 73 62 17.74%
Delta Fund Dec 44 42 4.76%

输入表

`funds`

Column Name Data Type
fund_id bigint
fund_name text
fund_manager_name text
fund_established_dt datetime

`fund_investors`

Column Name Data Type
investor_id bigint
fund_id bigint
initial_investment_amt float
initial_investment_dt datetime

A multi-billion dollar hedge fund has many subsidiary funds that each have their own investment thesis. Certain investors like to allocate their capital to specific funds based on their opinion on the efficacy of the fund's investment thesis.

Using the tables below, for each fund find the largest percent change in the month over month number of investors that the fund acquired in 2023. Output the fund's name, month with the largest percent increase, previous month's number of investors, current month's number of investors, and the percent change. Order the resulting data by the percent change in investors in descending order and then the fund's name in ascending order.

Guidelines

  1. For January, the percent change in how many investors the fund acquired should be null.
  2. If a fund has two months with an identical percent change in the number of investors, choose the percent increase that corresponds to the earlier month.
  3. Round the percent change to two decimals.

Example Output

fund_name month curr_mon_investor_cnt prev_mon_investor_cnt perc_change
Alpha Fund Feb 13 4 225%
Beta Fund Mar 73 62 17.74%
Delta Fund Dec 44 42 4.76%

Input Tables

`funds`

Column Name Data Type
fund_id bigint
fund_name text
fund_manager_name text
fund_established_dt datetime

`fund_investors`

Column Name Data Type
investor_id bigint
fund_id bigint
initial_investment_amt float
initial_investment_dt datetime
解析

这道题的解法可以拆成五部分:

  1. 第 1 步:找出 2023 年每只基金每个月新增投资者的人数(见 cte `investors_per_fund_per_month`)。
  2. 第 2 步:对上一步的每条记录,找出该基金前一个月新增的投资者人数(见 cte `investors_per_fund_per_month_and_prev_month`)。
  3. 第 3 步:计算本月相对于上月新增投资者人数的百分比变化(见 cte `perc_change_investors_per_fund_per_month`)。
  4. 第 4 步:对每只基金的每个月按新增投资者人数的环比变化进行排名。也就是说,为新增投资者增幅最大的月份赋予排名 1(见 cte `top_perc_change_in_investors_per_fund`)。
  5. 第 5 步:连接 `funds` 表,用基金名称替代基金 id。
WITH investors_per_fund_per_month AS (
    SELECT 
        fi.fund_id, 
        DATE_FORMAT(fi.initial_investment_dt, '%b') AS investment_month,
        COUNT(DISTINCT fi.investor_id) AS investors
    FROM    
        fund_investors fi 
    WHERE 
        EXTRACT(YEAR FROM fi.initial_investment_dt) = 2023
    GROUP BY 
        1, 2
), investors_per_fund_per_month_and_prev_month AS (
    SELECT 
        ipf.fund_id, 
        ipf.investment_month,
        ipf.investors AS curr_month_investors,
        LAG(ipf.investors, 1) OVER(
            PARTITION BY ipf.fund_id 
            ORDER BY ipf.investment_month ASC
        ) AS prev_month_investors
    FROM 
        investors_per_fund_per_month ipf     
), perc_change_investors_per_fund_per_month AS (
    SELECT 
        ipf.fund_id,
        ipf.investment_month,
        ipf.curr_month_investors,
        ipf.prev_month_investors,
        ROUND(
           100.0 * (ipf.curr_month_investors - ipf.prev_month_investors) / ipf.prev_month_investors
        , 2) AS perc_change_investors
    FROM 
        investors_per_fund_per_month_and_prev_month ipf 
), top_perc_change_in_investors_per_fund AS (
    SELECT 
        pci.fund_id,
        pci.investment_month,
        pci.curr_month_investors,
        pci.prev_month_investors,
        pci.perc_change_investors,
        DENSE_RANK() OVER(
            PARTITION BY pci.fund_id 
            ORDER BY pci.perc_change_investors DESC, pci.investment_month ASC
        ) as rnk
    FROM 
        perc_change_investors_per_fund pci
)

SELECT 
    f.fund_name,
    tpc.investment_month AS "month",
    tpc.curr_month_investors AS "curr_mon_investor_cnt",
    tpc.prev_month_investors AS "prev_mon_investor_cnt",
    tpc.perc_change_investors || "%" AS "perc_change"
FROM 
    top_perc_change_in_investors_per_fund tpc 
INNER JOIN 
    funds f 
    ON tpc.fund_id = f.fund_id
WHERE 
    tpc.rnk = 1
ORDER BY 
    tpc.perc_change_investors DESC, tpc.fund_name ASC

Original Explanation

The solution to this problem can be dissected into five parts:

  1. Part 1: Find the number of new investors for each fund per month in the year 2023 (See cte `investors_per_fund_per_month`).
  2. Part 2: For each of the previous records, find the number of new investors that had joined the fund in the previous month (See cte `investors_per_fund_per_month_and_prev_month`).
  3. Part 3: Calculate the percent difference in the number of investors the firm acquired this month versus the previous month (See cte `perc_change_investors_per_fund_per_month`)
  4. Part 4: For each fund, assign a ranking for each month based on the month over month % change in the number of investors the firm acquired. In essence, this assigns a ranking of 1 to the month that corresponds to the highest % increase in new investors to that fund (See cte `top_perc_change_in_investors_per_fund`)
  5. Part 5: Join the `funds` table to get the name of the fund rather than the fund's id.
WITH investors_per_fund_per_month AS (
    SELECT 
        fi.fund_id, 
        DATE_FORMAT(fi.initial_investment_dt, '%b') AS investment_month,
        COUNT(DISTINCT fi.investor_id) AS investors
    FROM    
        fund_investors fi 
    WHERE 
        EXTRACT(YEAR FROM fi.initial_investment_dt) = 2023
    GROUP BY 
        1, 2
), investors_per_fund_per_month_and_prev_month AS (
    SELECT 
        ipf.fund_id, 
        ipf.investment_month,
        ipf.investors AS curr_month_investors,
        LAG(ipf.investors, 1) OVER(
            PARTITION BY ipf.fund_id 
            ORDER BY ipf.investment_month ASC
        ) AS prev_month_investors
    FROM 
        investors_per_fund_per_month ipf     
), perc_change_investors_per_fund_per_month AS (
    SELECT 
        ipf.fund_id,
        ipf.investment_month,
        ipf.curr_month_investors,
        ipf.prev_month_investors,
        ROUND(
           100.0 * (ipf.curr_month_investors - ipf.prev_month_investors) / ipf.prev_month_investors
        , 2) AS perc_change_investors
    FROM 
        investors_per_fund_per_month_and_prev_month ipf 
), top_perc_change_in_investors_per_fund AS (
    SELECT 
        pci.fund_id,
        pci.investment_month,
        pci.curr_month_investors,
        pci.prev_month_investors,
        pci.perc_change_investors,
        DENSE_RANK() OVER(
            PARTITION BY pci.fund_id 
            ORDER BY pci.perc_change_investors DESC, pci.investment_month ASC
        ) as rnk
    FROM 
        perc_change_investors_per_fund_per_month pci
)

SELECT 
    f.fund_name,
    tpc.investment_month AS "month",
    tpc.curr_month_investors AS "curr_mon_investor_cnt",
    tpc.prev_month_investors AS "prev_mon_investor_cnt",
    tpc.perc_change_investors || "%" AS "perc_change"
FROM 
    top_perc_change_in_investors_per_fund tpc 
INNER JOIN 
    funds f 
    ON tpc.fund_id = f.fund_id
WHERE 
    tpc.rnk = 1
ORDER BY 
    tpc.perc_change_investors DESC, tpc.fund_name ASC