1. Background
This article briefly walks through an example of using the bucket_script pipeline aggregation in Elasticsearch (es).
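2. Requirement
Suppose we have a simple car-sales data set in which each document records how many cars (salesVolume) of a given brand (brand) were sold in a given month (month).
For each month, we want to compute the share of that month's total sales that came from cars with brand = 宝马 (BMW).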
3. Prepare the data
3.1 mapping
PUT /index_bucket_script
{
  "mappings": {
    "properties": {
      "month": {"type": "keyword"},
      "brand": {"type": "keyword"},
      "salesVolume": {"type": "integer"}
    }
  }
}
3.2 Insert data
PUT /index_bucket_script/_bulk
{"index":{"_id":1}}
{"month":"2023-01","brand":"宝马","salesVolume":100}
{"index":{"_id":3}}
{"month":"2023-02","brand":"公众","salesVolume":80}
{"index":{"_id":4}}
{"month":"2023-02","brand":"宝马","salesVolume":20}
Note:
Month 2023-02 contains documents for 2 brands, while month 2023-01 only contains 宝马 (BMW).
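As a sanity check on the sample data, the expected shares can be worked out by hand: for 2023-01 it is 100 / 100 = 100%, and for 2023-02 it is 20 / (80 + 20) = 20%. The plain-Java sketch below reproduces that arithmetic without Elasticsearch. The class `BmwShareSketch`, the nested `Sale` class and the method `sharePercentByMonth` are hypothetical names introduced only for illustration; they are not part of the original project.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class BmwShareSketch {

    // Hypothetical in-memory stand-in for one document of index_bucket_script.
    static final class Sale {
        final String month;
        final String brand;
        final int salesVolume;

        Sale(String month, String brand, int salesVolume) {
            this.month = month;
            this.brand = brand;
            this.salesVolume = salesVolume;
        }
    }

    // Per month: sum(salesVolume of the given brand) / sum(salesVolume of all brands) * 100.
    static Map<String, Double> sharePercentByMonth(List<Sale> sales, String brand) {
        // month -> {brand total, overall total}; TreeMap keeps months in ascending order
        Map<String, long[]> totals = new TreeMap<>();
        for (Sale s : sales) {
            long[] t = totals.computeIfAbsent(s.month, m -> new long[2]);
            if (s.brand.equals(brand)) {
                t[0] += s.salesVolume;
            }
            t[1] += s.salesVolume;
        }
        Map<String, Double> result = new TreeMap<>();
        totals.forEach((month, t) -> result.put(month, t[0] * 100.0 / t[1]));
        return result;
    }

    public static void main(String[] args) {
        // Same three documents as the bulk request above.
        List<Sale> sales = List.of(
                new Sale("2023-01", "宝马", 100),
                new Sale("2023-02", "公众", 80),
                new Sale("2023-02", "宝马", 20));
        System.out.println(sharePercentByMonth(sales, "宝马")); // {2023-01=100.0, 2023-02=20.0}
    }
}
```

These two values are what the aggregation built in the following sections is expected to return for the sample documents.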
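4. Syntax of the bucket_script aggregation
A bucket_script aggregation is declared inside a parent multi-bucket aggregation. Its buckets_path map binds script variable names to metric aggregations in the same bucket, and its script computes one numeric value per bucket from those variables (available as params.<name>). The optional gap_policy and format parameters control how missing values are handled and how the result is formatted.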
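5. Aggregation
5.1 Group by month and sort
The aggregation name 依据月份分组 means "group by month"; the buckets are sorted by key (the month) in ascending order.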
GET index_bucket_script/_search
{
  "query": {"match_all": {}},
  "size": 0,
  "aggs": {
    "依据月份分组": {
      "terms": {
        "field": "month",
        "order": {"_key": "asc"}
      }
    }
  }
}
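5.2 Total number of cars sold per month
The sub-aggregation 统计每个月卖了多少辆车 ("cars sold per month") sums salesVolume over all brands in each month bucket.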
GET index_bucket_script/_search
{
  "query": {"match_all": {}},
  "size": 0,
  "aggs": {
    "依据月份分组": {
      "terms": {
        "field": "month",
        "order": {"_key": "asc"}
      },
      "aggs": {
        "统计每个月卖了多少辆车": {
          "sum": {"field": "salesVolume"}
        }
      }
    }
  }
}
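5.3 Number of BMW cars sold per month
The filter aggregation 统计每个月卖了多少宝马车 ("BMW cars sold per month") keeps only documents with brand = 宝马, and its sub-aggregation 每个月卖出的宝马车辆数 ("BMW units sold per month") sums their salesVolume.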
GET index_bucket_script/_search
{
  "query": {"match_all": {}},
  "size": 0,
  "aggs": {
    "依据月份分组": {
      "terms": {
        "field": "month",
        "order": {"_key": "asc"}
      },
      "aggs": {
        "统计每个月卖了多少辆车": {
          "sum": {"field": "salesVolume"}
        },
        "统计每个月卖了多少宝马车": {
          "filter": {
            "term": {"brand": "宝马"}
          },
          "aggs": {
            "每个月卖出的宝马车辆数": {
              "sum": {"field": "salesVolume"}
            }
          }
        }
      }
    }
  }
}
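5.4 BMW share of monthly sales
The bucket_script named 每个月宝马车销售占比 ("monthly BMW sales share") divides the BMW total (fenzi, the numerator) by the overall total (fenmu, the denominator) and multiplies the result by 100.
5.4.1 dsl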
GET index_bucket_script/_search
{
  "query": {"match_all": {}},
  "size": 0,
  "aggs": {
    "依据月份分组": {
      "terms": {
        "field": "month",
        "order": {"_key": "asc"}
      },
      "aggs": {
        "统计每个月卖了多少辆车": {
          "sum": {"field": "salesVolume"}
        },
        "统计每个月卖了多少宝马车": {
          "filter": {
            "term": {"brand": "宝马"}
          },
          "aggs": {
            "每个月卖出的宝马车辆数": {
              "sum": {"field": "salesVolume"}
            }
          }
        },
        "每个月宝马车销售占比": {
          "bucket_script": {
            "buckets_path": {
              "fenzi": "统计每个月卖了多少宝马车 > 每个月卖出的宝马车辆数",
              "fenmu": "统计每个月卖了多少辆车"
            },
            "script": "params.fenzi / params.fenmu * 100"
          }
        }
      }
    }
  }
}
5.4.2 java
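import co.elastic.clients.elasticsearch._types.SortOrder;
import co.elastic.clients.elasticsearch.core.SearchRequest;
import co.elastic.clients.elasticsearch.core.SearchResponse;
import co.elastic.clients.util.NamedValue;
import org.junit.jupiter.api.DisplayName;
import org.junit.jupiter.api.Test;

import java.io.IOException;
import java.util.HashMap;

// INDEX_PERSON and client are members of the surrounding test class (not shown here);
// INDEX_PERSON must hold the index name used above, index_bucket_script.
@Test
@DisplayName("统计宝马车每个月销售率")
public void test01() throws IOException {
    SearchRequest request = SearchRequest.of(searchRequest ->
            searchRequest.index(INDEX_PERSON)
                    .query(query -> query.matchAll(matchAll -> matchAll))
                    // Only the aggregation result is needed, not the hits
                    .size(0)
                    // Group by month, sorted by key ascending
                    .aggregations("依据月份分组", monthAggr ->
                            monthAggr.terms(terms -> terms.field("month").order(NamedValue.of("_key", SortOrder.Asc)))
                                    // Total cars sold per month (denominator)
                                    .aggregations("统计每个月卖了多少辆车", agg1 ->
                                            agg1.sum(sum -> sum.field("salesVolume"))
                                    )
                                    // BMW cars sold per month (numerator)
                                    .aggregations("统计每个月卖了多少宝马车", agg2 ->
                                            agg2.filter(filter -> filter.term(term -> term.field("brand").value("宝马")))
                                                    .aggregations("每个月卖出的宝马车辆数", agg3 ->
                                                            agg3.sum(sum -> sum.field("salesVolume"))
                                                    )
                                    )
                                    // BMW share of monthly sales
                                    .aggregations("每个月宝马车销售占比", rateAggr ->
                                            rateAggr.bucketScript(bucketScript ->
                                                    bucketScript.bucketsPath(path ->
                                                                    path.dict(new HashMap<String, String>() {
                                                                        {
                                                                            put("fenzi", "统计每个月卖了多少宝马车 > 每个月卖出的宝马车辆数");
                                                                            put("fenmu", "统计每个月卖了多少辆车");
                                                                        }
                                                                    })
                                                            )
                                                            // Unlike the DSL above, the script returns a ratio (no * 100);
                                                            // the "#%" format renders it as a percentage string
                                                            .script(script ->
                                                                    script.inline(inline -> inline.source("params.fenzi/params.fenmu"))
                                                            )
                                                            .format("#%")
                                            )
                                    )
                    )
    );
    System.out.println("request:" + request);
    SearchResponse<String> response = client.search(request, String.class);
    System.out.println("response:" + response);
}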
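The Java request uses format("#%") rather than multiplying by 100 in the script. According to the bucket_script reference, format is a DecimalFormat pattern applied to the output value, and in such a pattern the % character multiplies the number by 100 and appends a percent sign. The stand-alone sketch below shows that behaviour with java.text.DecimalFormat directly. The class name `PercentFormatSketch` and the helper `asPercent` are hypothetical, and the use of Locale.ROOT symbols is an assumption made here so that the output does not depend on the machine's locale.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

public class PercentFormatSketch {

    // Formats a ratio (e.g. 0.2) with the same "#%" pattern passed to format(...) in the request.
    static String asPercent(double ratio) {
        DecimalFormat format = new DecimalFormat("#%", DecimalFormatSymbols.getInstance(Locale.ROOT));
        return format.format(ratio);
    }

    public static void main(String[] args) {
        System.out.println(asPercent(1.0)); // ratio for 2023-01: 100 / 100
        System.out.println(asPercent(0.2)); // ratio for 2023-02: 20 / (80 + 20)
    }
}
```

With this pattern the two sample ratios render as 100% and 20%, which is why the Java script omits the "* 100" found in the DSL version.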
5.4.3 Result
6. Complete code
https://gitee.com/huan1993/spring-cloud-parent/blob/master/es/es8-api/src/main/java/com/huan/es8/aggregations/pipeline/BucketScript 统计宝马车每个月销售率.java
7. References
1. https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-pipeline.html#buckets-path-syntax
2. https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-pipeline-bucket-script-aggregation.html
3. https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/text/DecimalFormat.html