问题
现在有一张商品入库表,包括商品id、商品成本和入库日期3个字段,由于某些原因,导致部分商品的成本缺失(为0或者没有值都是缺失),这样不利于我们计算成本。现在要把缺失的商品进价补充完整,补充的依据是使用相同商品的最近一次有效成本作为当前商品的成本。比如2023-11-04号101商品的cost就需要用300.39填充。
product_id cost date
101 300.39 2023-11-01
102 500 2023-11-02
101 0 2023-11-03
101 2023-11-04
102 600 2023-11-04
102 2023-11-05
103 983 2023-11-06
解答
使用last_value()函数,传入第二参数true来跳过null值,当然首先要对数据处理下,将0的变成null。
with temp as
(
select 101 as product_id,300.39 as cost,'2023-11-01' as date
union all
select 102 as product_id,500 as cost,'2023-11-02' as date
union all
select 101 as product_id,0 as cost,'2023-11-03' as date
union all
select 101 as product_id,null as cost,'2023-11-04' as date
union all
select 102 as product_id,600 as cost,'2023-11-04' as date
union all
select 102 as product_id,null as cost,'2023-11-05' as date
union all
select 103 as product_id,983 as cost,'2023-11-06' as date
)
select
product_id
-- last_value第二个参数设置为true,表示取最后一个值时跳过null值。默认不写为false
,last_value(cost,true) over(partition by product_id order by date) as valid_last_value
,date
from(
select
product_id
,if(cost=0 or cost is null,null,cost) as cost --当cost为0或null时就给null值
,date
from temp
) t1;
Comments NOTHING