原创:扣钉日记(微信公众号 ID:codelogs),欢送分享,转载请保留出处。
简介
如果说要给 Linux 文本三剑客 (grep、sed、awk) 增加一员的话,我感觉应该是 jq 命令,因为 jq 命令是用来解决 json 数据的工具,而现如今 json 简直无所不在!
网上的 jq 命令分享文章也不少,但大多介绍得十分浅,jq 的弱小之处齐全没有介绍进去,所以就有了这篇文章,安利一下 jq 这个命令。
根本用法
格式化
# jq 默认的格式化输入
$ echo -n '{"id":1,"name":"zhangsan","score":[75, 85, 90]}'|jq .
{
"id": 1,
"name": "zhangsan",
"score": [
75,
85,
90
]
}
# - c 选项则是压缩到 1 行输入
$ jq -c . <<eof
{
"id": 1,
"name": "zhangsan",
"score": [
75,
85,
90
]
}
eof
{"id":1,"name":"zhangsan","score":[75,85,90]}
属性提取
# 获取 id 字段
$ echo -n '{"id":1,"name":"zhangsan","score":[75, 85, 90]}'|jq '.id'
1
# 获取 name 字段
$ echo -n '{"id":1,"name":"zhangsan","score":[75, 85, 90]}'|jq '.name'
"zhangsan"
# 获取 name 字段,-r 解开字符串引号
$ echo -n '{"id":1,"name":"zhangsan","score":[75, 85, 90]}'|jq -r '.name'
zhangsan
# 多层属性值获取
$ echo -n '{"id":1,"name":"zhangsan","attr":{"height":1.78,"weight":"60kg"}}'|jq '.attr.height'
1.78
# 获取数组中的值
$ echo -n '{"id":1,"name":"zhangsan","score":[75, 85, 90]}'|jq -r '.score[0]'
75
$ echo -n '[75, 85, 90]'|jq -r '.[0]'
75
# 数组截取
$ echo -n '[75, 85, 90]'|jq -r '.[1:3]'
[
85,
90
]
# []开展数组
$ echo -n '[75, 85, 90]'|jq '.[]'
75
85
90
# .. 开展所有构造
$ echo -n '{"id":1,"name":"zhangsan","score":[75, 85, 90]}'|jq -c '..'
{"id":1,"name":"zhangsan","score":[75,85,90]}
1
"zhangsan"
[75,85,90]
75
85
90
# 从非对象类型中提取字段,会报错
$ echo -n '{"id":1,"name":"zhangsan","attr":{"height":1.78,"weight":"60kg"}}'|jq '.name.alias'
jq: error (at <stdin>:0): Cannot index string with string "alias"
# 应用? 号能够防止这种报错
$ echo -n '{"id":1,"name":"zhangsan","attr":{"height":1.78,"weight":"60kg"}}'|jq '.name.alias?'
# // 符号用于,当后面的表达式取不到值时,执行前面的表达式
$ echo -n '{"id":1,"name":"zhangsan","attr":{"height":1.78,"weight":"60kg"}}'|jq '.alias//.name'
"zhangsan"
管道、逗号与括号
# 管道能够将值从前一个命令传送到后一个命令
$ echo -n '{"id":1,"name":"zhangsan","attr":{"height":1.78,"weight":"60kg"}}'|jq '.attr|.height'
1.78
# jq 中做一些根底运算也是能够的
$ echo -n '{"id":1,"name":"zhangsan","attr":{"height":1.78,"weight":"60kg"}}'|jq '.attr|.height*100|tostring +"cm"'"178cm"
# 逗号使得能够执行多个 jq 表达式,使得一个输出可计算出多个输入后果
$ echo 1 | jq '., ., .'
1
1
1
# 括号用于晋升表达式的优先级,如下:逗号优先级低于算术运算
$ $ echo '1'|jq '.+1, .*2'
2
2
$ echo '1'|jq '(.+1, .)*2'
4
2
# 管道优先级低于逗号
$ echo '1'|jq '., .|tostring'
"1"
"1"
$ echo '1'|jq '., (.|tostring)'
1
"1"
了解 jq 执行过程
外表上 jq 是用来解决 json 数据的,但实际上 jq 能解决的是任何 json 根底元素所造成的流,如 integer、string、bool、null、object、array 等,jq 执行过程大抵如下:
- jq 从流中获取一个 json 元素
- jq 执行表达式,表达式生成新的 json 元素
- jq 将新的 json 元素打印输出
能够看看这些示例,如下:
# 这里 jq 实际上将 1 2 3 4 当作 4 个 integer 元素,每找到一个元素就执行 + 1 操作
# jq 实际上是流式解决的,1 2 3 4 能够看成流中的 4 个元素
$ echo '1 2 3 4'|jq '. + 1'
2
3
4
5
# 流中的元素不须要是同种类型,只有是残缺的 json 元素即可
$ jq '"<" + tostring + ">"' <<eof
1
"zhangsan"
true
{"id":1}
[75, 80, 85]
eof
"<1>"
"<zhangsan>"
"<true>"
"<{\"id\":1}>"
"<[75,80,85]>"
# - R 选项可用于将读取到的 json 元素,都当作字符串看待
$ seq 4|jq -R '.'
"1"
"2"
"3"
"4"
# - s 选项将从流中读取到的所有 json 元素,变成一个 json 数组元素
# 这里了解为 jq 从流中只取到了 1 个 json 元素,这个 json 元素的类型是数组
$ seq 4|jq -s .
[
1,
2,
3,
4
]
根底运算
jq 反对 + - * / %
运算,对于 + 号,如果是字符串类型,则是做字符串拼接,如下:
# 做加减乘除运算
$ echo 1|jq '.+1, .-1, .*2, ./2, .%2'
2
0
2
0.5
1
# 赋值运算
$ echo -n '{"id":1,"name":"zhangsan","age":"17","score":"75"}'|jq '.id=2' -c
{"id":2,"name":"zhangsan","age":"17","score":"75"}
数据结构
jq 能够很不便的将其它数据,转化为 json 对象或数组,如下:
# 应用 [] 结构数组元素,- n 通知 jq 没有输出数据,间接执行表达式并生成输入数据
$ jq -n '[1,2,3,4]' -c
[1,2,3,4]
$ cat data.txt
id name age score
1 zhangsan 17 75
2 lisi 16 80
3 wangwu 18 85
4 zhaoliu 18 90
# 每行宰割成数组,[]结构新的数组输入
$ tail -n+2 data.txt|jq -R '[splits("\\s+")]' -c
["1","zhangsan","17","75"]
["2","lisi","16","80"]
["3","wangwu","18","85"]
["4","zhaoliu","18","90"]
$ jq -n '{id:1, name:"zhangsan"}' -c
{"id":1,"name":"zhangsan"}
# 每行转换为对象,{}结构新的对象格局输入
$ tail -n+2 data.txt|jq -R '[splits("\\s+")] | {id:.[0]|tonumber, name:.[1], age:.[2], score:.[3]}' -c
{"id":1,"name":"zhangsan","age":"17","score":"75"}
{"id":2,"name":"lisi","age":"16","score":"80"}
{"id":3,"name":"wangwu","age":"18","score":"85"}
{"id":4,"name":"zhaoliu","age":"18","score":"90"}
# \()字符串占位变量替换
$ cat data.json
{"id":1,"name":"zhangsan","age":"17","score":"75"}
{"id":2,"name":"lisi","age":"16","score":"80"}
{"id":3,"name":"wangwu","age":"18","score":"85"}
{"id":4,"name":"zhaoliu","age":"18","score":"90"}
$ cat data.json |jq '"id:\(.id),name:\(.name),age:\(.age),score:\(.score)"' -r
id:1,name:zhangsan,age:17,score:75
id:2,name:lisi,age:16,score:80
id:3,name:wangwu,age:18,score:85
id:4,name:zhaoliu,age:18,score:90
根底函数
# has 函数,检测对象是否蕴含 key
$ echo -n '{"id":1,"name":"zhangsan","age":"17","score":"75"}'|jq 'has("id")'
true
# del 函数,删除某个属性
$ echo -n '{"id":1,"name":"zhangsan","age":"17","score":"75"}'|jq 'del(.id)' -c
{"name":"zhangsan","age":"17","score":"75"}
# map 函数,对数组中每个元素执行表达式计算,计算结果组织成新数组
$ seq 4|jq -s 'map(. * 2)' -c
[2,4,6,8]
# 下面 map 函数写法,其实等价于这个写法
$ seq 4|jq -s '[.[]|.*2]' -c
[2,4,6,8]
# keys 函数,列出对象属性
$ echo -n '{"id":1,"name":"zhangsan","age":"17","score":"75"}'|jq 'keys' -c
["age","id","name","score"]
# to_entries 函数,列出对象键值对
$ echo -n '{"id":1,"name":"zhangsan","age":"17","score":"75"}'|jq 'to_entries' -c
[{"key":"id","value":1},{"key":"name","value":"zhangsan"},{"key":"age","value":"17"},{"key":"score","value":"75"}]
# length 函数,计算数组或字符串长度
$ jq -n '[1,2,3,4]|length'
4
# add 函数,计算数组中数值之和
$ seq 4|jq -s 'add'
10
# tostring 与 tonumber,类型转换
$ seq 4|jq 'tostring|tonumber'
1
2
3
4
# type 函数,获取元素类型
$ jq 'type' <<eof
1
"zhangsan"
true
null
{"id":1}
[75, 80, 85]
eof
"number"
"string"
"boolean"
"null"
"object"
"array"
过滤、排序、分组函数
$ cat data.json
{"id":1,"name":"zhangsan","sex": 0, "age":"17","score":"75"}
{"id":2,"name":"lisi","sex": 1, "age":"16","score":"80"}
{"id":3,"name":"wangwu","sex": 0, "age":"18","score":"85"}
{"id":4,"name":"zhaoliu","sex": 0, "age":"18","score":"90"}
# select 函数用于过滤,相似 SQL 中的 where
$ cat data.json |jq 'select((.id>1) and (.age|IN("16","17","18")) and (.name !="lisi") or (has("attr")|not) and (.score|tonumber >= 90) )' -c
{"id":3,"name":"wangwu","sex":0,"age":"18","score":"85"}
{"id":4,"name":"zhaoliu","sex":0,"age":"18","score":"90"}
# 有一些简化的过滤函数,如 arrays, objects, iterables, booleans, numbers, normals, finites, strings, nulls, values, scalars
# 它们依据类型过滤,如 objects 过滤出对象,values 过滤出非 null 值等
$ jq -c 'objects' <<eof
1
"zhangsan"
true
null
{"id":1}
[75, 80, 85]
eof
{"id":1}
$ jq -c 'values' <<eof
1
"zhangsan"
true
null
{"id":1}
[75, 80, 85]
eof
1
"zhangsan"
true
{"id":1}
[75,80,85]
# 抉择出 id 与 name 字段,相似 SQL 中的 select id,name
$ cat data.json|jq -s 'map({id,name})[]' -c
{"id":1,"name":"zhangsan"}
{"id":2,"name":"lisi"}
{"id":3,"name":"wangwu"}
{"id":4,"name":"zhaoliu"}
# 提取前 2 行,相似 SQL 中的 limit 2
$ cat data.json|jq -s 'limit(2; map({id,name})[])' -c
{"id":1,"name":"zhangsan"}
{"id":2,"name":"lisi"}
# 依照 age、id 排序,相似 SQL 中的 order by age,id
$ cat data.json|jq -s 'sort_by((.age|tonumber), .id)[]' -c
{"id":2,"name":"lisi","sex":1,"age":"16","score":"80"}
{"id":1,"name":"zhangsan","sex":0,"age":"17","score":"75"}
{"id":3,"name":"wangwu","sex":0,"age":"18","score":"85"}
{"id":4,"name":"zhaoliu","sex":0,"age":"18","score":"90"}
# 依据 sex 与 age 分组,并每组聚合计算 count(*)、avg(score)、max(id)
$ cat data.json |jq -s 'group_by(.sex, .age)[]' -c
[{"id":1,"name":"zhangsan","sex":0,"age":"17","score":"75"}]
[{"id":3,"name":"wangwu","sex":0,"age":"18","score":"85"},{"id":4,"name":"zhaoliu","sex":0,"age":"18","score":"90"}]
[{"id":2,"name":"lisi","sex":1,"age":"16","score":"80"}]
$ cat data.json |jq -s 'group_by(.sex, .age)[]|{sex:.[0].sex, age:.[0].age, count:length, avg_score:map(.score|tonumber)|(add/length), scores:map(.score)|join(","), max_id:map(.id)|max }' -c
{"sex":0,"age":"17","count":1,"avg_score":75,"scores":"75","max_id":1}
{"sex":0,"age":"18","count":2,"avg_score":87.5,"scores":"85,90","max_id":4}
{"sex":1,"age":"16","count":1,"avg_score":80,"scores":"80","max_id":2}
字符串操作函数
# contains 函数,判断是否蕴含,理论也可用于判断数组是否蕴含某个元素
$ echo hello | jq -R 'contains("he")'
true
# 判断是否以 he 结尾
$ echo hello | jq -R 'startswith("he")'
true
# 判断是否以 llo 结尾
$ echo hello | jq -R 'endswith("llo")'
true
# 去掉起始空格
$ echo 'hello'|jq -R 'ltrimstr(" ")|rtrimstr(" ")'
"hello"
# 大小写转换
$ echo hello|jq -R 'ascii_upcase'
"HELLO"
$ echo HELLO|jq -R 'ascii_downcase'
"hello"
# 字符串数组,通过逗号拼接成一个字符串
$ seq 4|jq -s 'map(tostring)|join(",")'
"1,2,3,4"
# json 字符串转换为 json 对象
$ echo -n '{"id":1,"name":"zhangsan","age":"17","attr":"{\"weight\":56,\"height\":178}"}'|jq '.attr = (.attr|fromjson)' -c
{"id":1,"name":"zhangsan","age":"17","attr":{"weight":56,"height":178}}
# json 对象转换为 json 字符串
$ echo -n '{"id":1,"name":"zhangsan","age":"17","attr":{"weight":56,"height":178}}'|jq '.attr = (.attr|tojson)'
{
"id": 1,
"name": "zhangsan",
"age": "17",
"attr": "{\"weight\":56,\"height\":178}"
}
$ cat data.txt
id:1,name:zhangsan,age:17,score:75
id:2,name:lisi,age:16,score:80
id:3,name:wangwu,age:18,score:85
id:4,name:zhaoliu,age:18,score:90
# 正则表达式过滤,jq 应用的是 PCRE
$ cat data.txt|jq -R 'select(test("id:\\d+,name:\\w+,age:\\d+,score:8\\d+"))' -r
id:2,name:lisi,age:16,score:80
id:3,name:wangwu,age:18,score:85
# 正则拆分字符串
$ cat data.txt|jq -R '[splits(",")]' -cr
["id:1","name:zhangsan","age:17","score:75"]
["id:2","name:lisi","age:16","score:80"]
["id:3","name:wangwu","age:18","score:85"]
["id:4","name:zhaoliu","age:18","score:90"]
# 正则替换字符串
$ cat data.txt |jq -R 'gsub("name";"nick")' -r
id:1,nick:zhangsan,age:17,score:75
id:2,nick:lisi,age:16,score:80
id:3,nick:wangwu,age:18,score:85
id:4,nick:zhaoliu,age:18,score:90
# 正则表达式捕捉数据
$ cat data.txt|jq -R 'match("id:(?<id>\\d+),name:(?<name>\\w+),age:\\d+,score:8\\d+")' -cr
{"offset":0,"length":30,"string":"id:2,name:lisi,age:16,score:80","captures":[{"offset":3,"length":1,"string":"2","name":"id"},{"offset":10,"length":4,"string":"lisi","name":"name"}]}
{"offset":0,"length":32,"string":"id:3,name:wangwu,age:18,score:85","captures":[{"offset":3,"length":1,"string":"3","name":"id"},{"offset":10,"length":6,"string":"wangwu","name":"name"}]}
# capture 命名捕捉,生成 key 是捕捉组名称,value 是捕捉值的对象
$ cat data.txt|jq -R 'capture("id:(?<id>\\d+),name:(?<name>\\w+),age:\\d+,score:8\\d+")' -rc
{"id":"2","name":"lisi"}
{"id":"3","name":"wangwu"}
# 正则扫描输出字符串
$ cat data.txt|jq -R '[scan("\\w+:\\w+")]' -rc
["id:1","name:zhangsan","age:17","score:75"]
["id:2","name:lisi","age:16","score:80"]
["id:3","name:wangwu","age:18","score:85"]
["id:4","name:zhaoliu","age:18","score:90"]
日期函数
# 以后工夫缀
$ jq -n 'now'
1653820640.939947
# 将工夫缀转换为 0 时区的合成工夫(broken down time),模式为 年 月 日 时 分 秒 dayOfWeek dayOfYear
$ jq -n 'now|gmtime' -c
[2022,4,29,10,45,5.466768980026245,0,148]
# 将工夫缀转换为本地时区的合成工夫(broken down time)
$ jq -n 'now|localtime' -c
[2022,4,29,18,46,5.386353015899658,0,148]
# 合成工夫转换为工夫串
$ jq -n 'now|localtime|strftime("%Y-%m-%dT%H:%M:%S")' -c
"2022-05-29T18:50:33"
# 与下面等效
$ jq -n 'now|strflocaltime("%Y-%m-%dT%H:%M:%SZ")'
"2022-05-29T19:00:40Z"
# 工夫串解析为合成工夫
$ date +%FT%T|jq -R 'strptime("%Y-%m-%dT%H:%M:%S")' -c
[2022,4,29,18,51,27,0,148]
# 合成工夫转换为工夫缀
$ date +%FT%T|jq -R 'strptime("%Y-%m-%dT%H:%M:%S")|mktime'
1653850310
高级用法
实际上 jq 是一门脚本语言,它也反对变量、分支构造、循环构造与自定义函数,如下:
$ cat data.json
{"id":1,"name":"zhangsan","sex": 0, "age":"17","score":"75"}
{"id":2,"name":"lisi","sex": 1, "age":"16","score":"80"}
{"id":3,"name":"wangwu","sex": 0, "age":"18","score":"85"}
{"id":4,"name":"zhaoliu","sex": 0, "age":"18","score":"90"}
# 单变量定义
$ cat data.json| jq '.id as $id|$id'
1
2
3
4
# 对象展开式变量定义
$ cat data.json |jq '. as {id:$id,name:$name}|"id:\($id),name:\($name)"'"id:1,name:zhangsan"
"id:2,name:lisi"
"id:3,name:wangwu"
"id:4,name:zhaoliu"
$ cat data.json
["1","zhangsan","17","75"]
["2","lisi","16","80"]
["3","wangwu","18","85"]
["4","zhaoliu","18","90"]
# 数组展开式变量定义
$ cat data.json|jq '. as [$id,$name]|"id:\($id),name:\($name)"'"id:1,name:zhangsan"
"id:2,name:lisi"
"id:3,name:wangwu"
"id:4,name:zhaoliu"
# 分支构造
$ cat data.json|jq '. as [$id,$name]|if ($id>"1") then"id:\($id),name:\($name)"else empty end'
"id:2,name:lisi"
"id:3,name:wangwu"
"id:4,name:zhaoliu"
# 循环构造,第一个表达式条件满足时,执行只每二个表达式
# 循环构造除了 while,还有 until、recurse 等
$ echo 1|jq 'while(.<100; .*2)'
1
2
4
8
16
32
64
# 自定义计算 3 次方的函数
$ echo 2|jq 'def cube: .*.*. ; cube'
8
因为这些高级个性并不罕用,这里仅给出了一些简略示例,具体应用能够 man jq
查看。
辅助 shell 编程
相熟 shell 脚本编程的同学都晓得,shell 自身是没有提供 Map、List 这种数据结构的,这导致应用 shell 实现某些性能时,变得很辣手。
但 jq 自身是解决 json 的,而 json 中的对象就可等同于 Map,json 中的数组就可等同于 List,如下:
list='[]';
#List 增加元素
list=$(echo "$list"|jq '. + [ $val]' --arg val java);
list=$(echo "$list"|jq '. + [ $val]' --arg val shell);
#获取 List 大小
echo "$list"|jq '.|length'
#获取 List 第 1 个元素
echo "$list"|jq '.[0]' -r
# List 是否蕴含 java 字符串
echo "$list"|jq 'any(.=="java")'
#删除 List 第 1 个元素
list=$(echo "$list"|jq 'del(.[0])');
# List 合并
list=$(echo "$list"|jq '. + $val' --argjson val '["shell","python"]');
# List 截取
echo "$list"|jq '.[1:3]'
# List 遍历
for o in $(echo "$list" | jq -r '.[]');do
echo "$o";
done
map='{}';
#Map 增加元素
map=$(echo "$map"|jq '.id=$val' --argjson val 1)
map=$(echo "$map"|jq '.courses=$val' --argjson val "$list")
#获取 Map 大小
echo "$map"|jq '.|length'
#获取 Map 指定 key 的值
echo "$map"|jq '.id' -r
#判断 Map 指定 key 是否存在
echo "$map" | jq 'has("id")'
#删除 Map 指定 key
map=$(echo "$map"|jq 'del(.id)')
# Map 合并
map=$(echo "$map"|jq '. + $val' --argjson val '{"code":"ID001","name":"hello"}')
# Map 的 KeySet 遍历
for key in $(echo "$map" | jq -r 'keys[]'); do
value=$(jq '.[$a]' --arg a "$key" -r <<<"$map");
printf "%s:%s\n" "$key" "$value";
done
# Map 的 entrySet 遍历
while read -r line; do
key=$(jq '.key' -r <<<"$line");
value=$(jq '.value' -r <<<"$line");
printf "%s:%s\n" "$key" "$value";
done <<<$(echo "$map" | jq 'to_entries[]' -c)
总结
能够发现,jq 曾经实现了 json 数据处理与剖析的方方面面,我集体最近在工作中,也屡次应用 jq 来剖析调用日志等,用起来的确十分不便。
如果你当初还没齐全学会 jq 的用法,没关系,倡议先珍藏起来,前面肯定会用失去的!
往期内容
密码学入门
q 命令 - 用 SQL 剖析文本文件
神秘的 backlog 参数与 TCP 连贯队列
mysql 的 timestamp 会存在时区问题?
真正了解可反复读事务隔离级别
字符编码解惑