一、获取以后工夫
- current_date获取以后日期
2018-04-09 - current_timestamp/now()获取以后工夫
2018-04-09 15:20:49.247
二、从日期工夫中提取字段
- year,month,day/dayofmonth,hour,minute,second
Examples:
> SELECT day('2009-07-30'); 30` * 1* 2
- dayofweek (1 = Sunday, 2 = Monday, …, 7 = Saturday),dayofyear
Examples:
> SELECT dayofweek('2009-07-30'); 5` * 1* 2
- weekofyear
weekofyear(date) - Returns the week of the year of the given date. A week is considered to start on a Monday and week 1 is the first week with >3 days.
Examples:
> SELECT weekofyear('2008-02-20'); 8` * 1* 2
- trunc截取某局部的日期,其余局部默认为01
第二个参数 [“year”, “yyyy”, “yy”, “mon”, “month”, “mm”]
Examples:
> SELECT trunc('2009-02-12', 'MM'); 2009-02-01 > SELECT trunc('2015-10-27', 'YEAR'); 2015-01-01` * 1* 2* 3* 4
- date_trunc [“YEAR”, “YYYY”, “YY”, “MON”, “MONTH”, “MM”, “DAY”, “DD”, “HOUR”, “MINUTE”, “SECOND”, “WEEK”, “QUARTER”]
Examples:
> SELECT date_trunc('2015-03-05T09:32:05.359', 'HOUR'); 2015-03-05T09:00:00` * 1* 2
- date_format将工夫转化为某种格局的字符串
Examples:
> SELECT date_format('2016-04-08', 'y'); 2016` * 1* 2
三、日期工夫转换
- unix_timestamp返回以后工夫的unix工夫戳
Examples:
> SELECT unix_timestamp(); 1476884637 > SELECT unix_timestamp('2016-04-08', 'yyyy-MM-dd'); 1460041200` * 1* 2
- from_unixtime将工夫戳换算成以后工夫,to_unix_timestamp将工夫转化为工夫戳
Examples:
> SELECT from_unixtime(0, 'yyyy-MM-dd HH:mm:ss'); 1970-01-01 00:00:00 > SELECT to_unix_timestamp('2016-04-08', 'yyyy-MM-dd'); 1460041200` * 1* 2
- to_date/date将字符串转化为日期格局,to_timestamp(Since: 2.2.0)
> SELECT to_date('2009-07-30 04:17:52'); 2009-07-30 > SELECT to_date('2016-12-31', 'yyyy-MM-dd'); 2016-12-31 > SELECT to_timestamp('2016-12-31 00:12:00'); 2016-12-31 00:12:00` * 1* 2* 3
- quarter 将1年4等分(range 1 to 4)
Examples:
> SELECT quarter('2016-08-31'); 3` * 1* 2
四、日期、工夫计算
- months_between两个日期之间的月数
months_between(timestamp1, timestamp2) - Returns number of months between timestamp1 and timestamp2.
Examples:
> SELECT months_between('1997-02-28 10:30:00', '1996-10-30'); 3.94959677` * 1* 2
- add_months返回日期后n个月后的日期
Examples:
> SELECT add_months('2016-08-31', 1); 2016-09-30` * 1* 2* 3
- last_day(date),next_day(start_date, day_of_week)
Examples:
> SELECT last_day('2009-01-12'); 2009-01-31 > SELECT next_day('2015-01-14', 'TU'); 2015-01-20` * 1* 2
- date_add,date_sub(减)
date_add(start_date, num_days) - Returns the date that is num_days after start_date.
Examples:
> SELECT date_add('2016-07-30', 1); 2016-07-31` * 1
- datediff(两个日期间的天数)
datediff(endDate, startDate) - Returns the number of days from startDate to endDate.
Examples:
> SELECT datediff('2009-07-31', '2009-07-30'); 1` * 1* 2
- 对于UTC工夫
- to_utc_timestamp
to_utc_timestamp(timestamp, timezone) - Given a timestamp like ‘2017-07-14 02:40:00.0’, interprets it as a time in the given time zone, and renders that time as a timestamp in UTC. For example, ‘GMT+1’ would yield ‘2017-07-14 01:40:00.0’.
Examples:
> SELECT to_utc_timestamp('2016-08-31', 'Asia/Seoul'); 2016-08-30 15:00:00` * 1* 2
- from_utc_timestamp
from_utc_timestamp(timestamp, timezone) - Given a timestamp like ‘2017-07-14 02:40:00.0’, interprets it as a time in UTC, and renders that time as a timestamp in the given time zone. For example, ‘GMT+1’ would yield ‘2017-07-14 03:40:00.0’.
Examples:
> SELECT from_utc_timestamp('2016-08-31', 'Asia/Seoul'); 2016-08-31 09:00:00
五、Hive罕用函数
1.数学函数
round(DOUBLE d) : 返回DOUBLE型的d的BIGINT类型的近似值round(DOUBLE d,INT) : 返回DOUBLE型的d的保留n位小数的DOUBLE类型的近似值floor(DOUBLE d): d是DOUBLE类型的,返回<=d的最大的BIGINT值ceil(DOUBLE d): d是DOUBLE类型的,返回>=d的最小的BIGINT值rand() rand(INT seed): 每行返回一个DOUBLE型的随机数,整数seed是随机因子exp(DOUBLE d):返回e的d幂次方ln(DOUBLE d): 以自然数为底d的对数log10(DOUBLE d): 以10为底的d的对数log2(DOUBLE d): 以2为底的d的对数log(DOUBLE base,DOUBLE d)以base为底的d的对数pow(DOUBLE d,DOUBLE p) power(DOUBLE d,DOUBLE p): 计算d的p次幂sqrt(DOUBLE d): d的平方根bin(DOUBLE i): 计算二进制值i的STRING类型值hex(BIGINT i): 计算十六进制值i的STRING类型值unhex(STRING i): hex的逆办法conv(STRING num,INT from_base,INT to_base): 将STRING类型的num 从from_base进制转化为to_base 进制abs(DOUBLE d): 计算d的绝对值pmod(INT i1 ,INT i2): i1对i2取模sin(DOUBLE d): 返回d的正弦值cos(DOUBLE d): 返回d的余弦值asin(DOUBLE d): 返回d的反正弦值acos(DOUBLE d): 返回d的反余弦值tan(DOUBLE d): 返回d的正切值atan(DOUBLE d): 返回d的反正切值degrees(DOUBLE d): 将弧度值d转化为角度值radians(DOUBLE d): 将角度值d转化为弧度值positive(DOUBLE d): 返回+dnegative(DOUBLE d): 返回-dsign(DOUBLE d): 如果d是负数,则返回+1.0,如果d是正数,则返回-1.0,否则为0
2.日期函数
to_date(string timestamp):返回工夫字符串中的日期局部,如to_date('1970-01-01 00:00:00')='1970-01-01'current_date:返回以后日期year(date):返回日期date的年,类型为int如year('2019-01-01')=2019month(date):返回日期date的月,类型为int,如month('2019-01-01')=1day(date): 返回日期date的天,类型为int,如day('2019-01-01')=1weekofyear(date1):返回日期date1位于该年第几周。如weekofyear('2019-03-06')=10datediff(date1,date2):返回日期date1与date2相差的天数,如datediff('2019-03-06','2019-03-05')=1date_add(date1,int1):返回日期date1加上int1的日期,如date_add('2019-03-06',1)='2019-03-07'date_sub(date1,int1):返回日期date1减去int1的日期,如date_sub('2019-03-06',1)='2019-03-05'months_between(date1,date2):返回date1与date2相差月份,如months_between('2019-03-06','2019-01-01')=2add_months(date1,int1):返回date1加上int1个月的日期,int1可为正数。如add_months('2019-02-11',-1)='2019-01-11'last_day(date1):返回date1所在月份最初一天。如last_day('2019-02-01')='2019-02-28'next_day(date1,day1):返回日期date1的下个星期day1的日期。day1为星期X的英文前两字母如next_day('2019-03-06','MO') 返回'2019-03-11'trunc(date1,string1):返回日期最开始年份或月份。string1可为年(YYYY/YY/YEAR)或月(MONTH/MON/MM)。如trunc('2019-03-06','MM')='2019-03-01',trunc('2019-03-06','YYYY')='2019-01-01'unix_timestamp():返回以后工夫的unix工夫戳,可指定日期格局。如unix_timestamp('2019-03-06','yyyy-mm-dd')=1546704180from_unixtime():返回unix工夫戳的日期,可指定格局。如select from_unixtime(unix_timestamp('2019-03-06','yyyy-mm-dd'),'yyyymmdd')='20190306'
3.条件函数
if(boolean,t1,t2):若布尔值成立,则返回t1,反正返回t2。如if(1>2,100,200)返回200case when boolean then t1 else t2 end:若布尔值成立,则t1,否则t2,可加多重判断coalesce(v0,v1,v2):返回参数中的第一个非空值,若所有值均为null,则返回null。如coalesce(null,1,2)返回1isnull(a):若a为null则返回true,否则返回false
4.字符串函数
length(string1):返回字符串长度concat(string1,string2):返回拼接string1及string2后的字符串concat_ws(sep,string1,string2):返回按指定分隔符拼接的字符串lower(string1):返回小写字符串,同lcase(string1)。upper()/ucase():返回大写字符串trim(string1):去字符串左右空格,ltrim(string1):去字符串左空格。rtrim(string1):去字符串右空格repeat(string1,int1):返回反复string1字符串int1次后的字符串reverse(string1):返回string1反转后的字符串。如reverse('abc')返回'cba'rpad(string1,len1,pad1):以pad1字符右填充string1字符串,至len1长度。如rpad('abc',5,'1')返回'abc11'。lpad():左填充split(string1,pat1):以pat1正则分隔字符串string1,返回数组。如split('a,b,c',',')返回["a","b","c"]substr(string1,index1,int1):以index地位起截取int1个字符。如substr('abcde',1,2)返回'ab'
5.聚合函数
count():统计行数sum(col1):统计指定列和avg(col1):统计指定列平均值min(col1):返回指定列最小值max(col1):返回指定列最大值
6.表生成函数
explode (array):返回多行array中对应的元素。如explode(array('A','B','C'))返回 explode(map):返回多行map键值对对应元素。如explode(map(1,'A',2,'B',3,'C'))返回
7.窗口函数
row_number() over(partitiion by .. order by .. ):依据partition排序,雷同值取不同序号,不存在序号跳跃rank() over(partition by .. order by .):依据partition排序,雷同值取雷同序号,存在序号跳跃dense_rank() over(partition by .. order by ..):依据partition排序,雷同值取雷同序号,不存在序号跳跃sum() over(partition by .. order by ..)count() over(partition by .. order by ..)lag(col,n) over(partition by .. order by ..) :查看以后行的上第n行lead(col,n) over(partition by .. order by ..):查看以后行的下第n行first_value() over(partition by .. order by ..):满足partition及排序的第一个值last_value() over(partition by .. order by ..):满足partition及排序的最初值ntile(n) over(partition by .. order by ..):满足partition及排序的数据分成n份partition内更细的划分,可应用windows子句。常见子句为:preceding:往前following:往后current row:以后行unbounded:终点,unbounded preceding 示意从后面的终点, unbounded following:示意到前面的起点应用如:、sum(col) over(partition by .. order by .. rows between 1 preceding and current row):以后行与前一行做聚合
8.行列转换
concat_ws(sep, collect_set(col1)) :同组不同行合并成一列,以sep分隔符分隔。collect_set在无反复的状况下也能够collect_list()代替。collect_set()去重,collect_list()不去重lateral view explode(split(col1,',')) :同组同列的数据拆分成多行,以sep分隔符辨别