关于hive:Hive基础知识之顾客访问店铺数据分析UVTop3

52次阅读

共计 942 个字符,预计需要花费 3 分钟才能阅读完成。

已知顾客拜访店铺的拜访日志

user_id shop
u1 a
u2 b
u1 b
u1 a
u3 c
u4 b
u1 a
u2 c
u5 b
u4 b
u6 c
u2 c
u1 b
u2 a
u2 a
u3 a
u5 a
u5 a
u5 a

建表、导表

create table visit(user_id string,shop string) row format delimited fields terminated by '\t';
load data local inpath '/data/visit.dat' into table visit;

求每个店铺的 UV(访客数)

select shop , count(distinct user_id) as uv from visit group by shop;
shop UV
shop uv
a 4
b 4
c 3

求每个店铺拜访次数 top3 的访客信息。输入店铺名称、访客 id、拜访次数

1. 先求出每个店铺,每个访客的拜访次数

select shop, user_id, count(*) as ct from visit group by shop, user_id;
shop user_id ct
a u1 3
b u1 2
a u2 2
b u2 1
c u2 2
a u3 1
c u3 1
b u4 2
a u5 3
b u5 1
c u6 1

2. 计算每个店铺被用户拜访次数排名

select shop,user_id,ct,rank() over(partition by shop order by ct) rk
from (select shop, user_id, count(*) as ct from visit group by shop, user_id) as t1;
shop user_id ct rk
a u3 1 1
a u2 2 2
a u5 3 3
a u1 3 3
b u5 1 1
b u2 1 1
b u4 2 3
b u1 2 3
c u6 1 1
c u3 1 1
c u2 2 3

3. 取每个店铺的前三名

select shop, user_id, ct from
(select shop,user_id,ct,rank() over(partition by shop order by ct) rk
from (select shop, user_id, count(*) as ct from visit group by shop, user_id) as t1) as t2 where rk <= 3;
shop user_id ct
a u3 1
a u2 2
a u1 3
a u5 3
b u2 1
b u5 1
b u1 2
b u4 2
c u3 1
c u6 1
c u2 2

The End

正文完
 0