Elasticsearch: developing applications with Elasticsearch (part 2)

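In the first article we installed elasticsearch and kibana with docker. This time we go straight into hands-on practice.

We first prepare the data, then implement a number of common scenarios twice: once with Query DSL and once with Spring Data Elasticsearch.

Query DSL: Elasticsearch provides an executable, JSON-style DSL (domain-specific language), which is called Query DSL.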

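1. Preparation

1.1. Preparing the index data

Below we use Query DSL to set up an index named operation_log, which records the operation logs of the various modules in the system.

1. Create the index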

PUT /operation_log

2. Define the mapping

PUT /operation_log/_mapping
{
  "properties": {
    "ip": { "type": "keyword" },
    "trace_id": { "type": "keyword" },
    "operation_time": { "type": "date", "format": "yyyy-MM-dd HH:mm:ss" },
    "module": { "type": "keyword" },
    "action_code": { "type": "keyword" },
    "location": {
      "type": "text",
      "analyzer": "ik_max_word",
      "fields": { "keyword": { "type": "keyword" } }
    },
    "object_id": { "type": "keyword" },
    "object_name": {
      "type": "text",
      "analyzer": "ik_max_word",
      "fields": { "keyword": { "type": "keyword" } }
    },
    "operator_id": { "type": "keyword" },
    "operator_name": { "type": "keyword" },
    "operator_dept_id": { "type": "keyword" },
    "operator_dept_name": {
      "type": "text",
      "analyzer": "ik_max_word",
      "fields": { "keyword": { "type": "keyword" } }
    },
    "changes": {
      "type": "nested",
      "properties": {
        "field_name": { "type": "keyword" },
        "old_value": { "type": "keyword" },
        "new_value": { "type": "keyword" }
      }
    }
  }
}

3. Create documents

The documents below are added one at a time. They could also be added in a batch through _bulk, but we stick to the basics for now.

POST /operation_log/_doc
{
  "ip": "10.1.11.1",
  "trace_id": "670021ff9a2dc6b7",
  "operation_time": "2022-05-02 09:31:18",
  "module": "企业组织",
  "action_code": "UPDATE",
  "location": "企业组织->员工治理->身份治理",
  "object_id": "xxxxx-1",
  "object_name": "成德善",
  "operator_id": "operator_id-1",
  "operator_name": "张三",
  "operator_dept_id": "operator_dept_id-1",
  "operator_dept_name": "研发核心-后端一部",
  "changes": [
    { "field_name": "手机号码", "old_value": "13055660000", "new_value": "13055770001" },
    { "field_name": "姓名", "old_value": "成德善", "new_value": "成秀妍" }
  ]
}

// Insert the following 6 documents with the same call

// data-2
{
  "ip": "22.1.11.0",
  "trace_id": "990821e89a2dc653",
  "operation_time": "2022-09-05 11:31:10",
  "module": "资源核心",
  "action_code": "UPDATE",
  "location": "资源核心->文件治理->文件权限",
  "object_id": "fffff-1",
  "object_name": "《2022员工绩效打分细则》",
  "operator_id": "operator_id-2",
  "operator_name": "李四",
  "operator_dept_id": "operator_dept_id-2",
  "operator_dept_name": "人力资源部",
  "changes": [
    { "field_name": "查看权限", "old_value": "仅李四可查看", "new_value": "全员可查看" },
    { "field_name": "编辑权限", "old_value": "仅李四可查看", "new_value": "人力资源部可查看" }
  ]
}

// data-3
{
  "ip": "22.1.11.0",
  "trace_id": "780821e89b2dc653",
  "operation_time": "2022-10-02 12:31:10",
  "module": "资源核心",
  "action_code": "DELETE",
  "location": "资源核心->文件治理",
  "object_id": "fffff-1",
  "object_name": "《2022员工绩效打分细则》",
  "operator_id": "operator_id-3",
  "operator_name": "王五",
  "operator_dept_id": "operator_dept_id-2",
  "operator_dept_name": "人力资源部",
  "changes": []
}

// data-4
{
  "ip": "10.1.11.1",
  "trace_id": "670021e89a2dc7b6",
  "operation_time": "2022-05-03 09:35:10",
  "module": "企业组织",
  "action_code": "ADD",
  "location": "企业组织->员工治理->身份治理",
  "object_id": "xxxxx-2",
  "object_name": "成宝拉",
  "operator_id": "operator_id-1",
  "operator_name": "张三",
  "operator_dept_id": "operator_dept_id-1",
  "operator_dept_name": "研发核心-后端一部",
  "changes": [
    { "field_name": "姓名", "new_value": "成宝拉" },
    { "field_name": "性别", "new_value": "女" },
    { "field_name": "手机号码", "new_value": "13055770002" },
    { "field_name": "邮箱", "new_value": "[email protected]" }
  ]
}

// data-5
{
  "ip": "10.1.11.5",
  "trace_id": "670021e89a2dc655",
  "operation_time": "2022-05-05 10:35:12",
  "module": "企业组织",
  "action_code": "DELETE",
  "location": "企业组织->员工治理->身份治理",
  "object_id": "xxxxx-1",
  "object_name": "成德善",
  "operator_id": "operator_id-2",
  "operator_name": "李四",
  "operator_dept_id": "operator_dept_id-2",
  "operator_dept_name": "人力资源部",
  "changes": []
}

// data-6
{
  "ip": "10.0.0.0",
  "trace_id": "670021ff9a28ei6",
  "operation_time": "2022-10-02 09:31:00",
  "module": "资源核心",
  "action_code": "DELETE",
  "location": "资源核心->文件治理",
  "object_id": "fffff-a",
  "object_name": "《有空字符串的文档》",
  "operator_id": "operator_id-a",
  "operator_dept_id": "",
  "operator_dept_name": "",
  "operator_name": "路人A",
  "changes": []
}

// data-7
{
  "ip": "10.0.0.0",
  "trace_id": "670021ff9a28768",
  "operation_time": "2022-10-02 09:32:00",
  "module": "资源核心",
  "action_code": "DELETE",
  "location": "资源核心->文件治理",
  "object_id": "fffff-b",
  "object_name": "《有NULL的文档》",
  "operator_id": "operator_id-b",
  "operator_name": "路人B",
  "changes": []
}

1.2. Preparing the Spring project

1. pom.xml

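        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-data-elasticsearch</artifactId>
        </dependency>

We pull in spring-boot-starter-data-elasticsearch. Our Spring Boot parent version is 2.7.4, so the starter resolves to 2.7.4 as well, which in turn brings in spring-data-elasticsearch 4.4.3.

The Spring Data website publishes a recommended mapping between spring-data-elasticsearch versions and elasticsearch versions. It is best to keep the two in sync as recommended; in this example the elasticsearch version is 7.17.6.

For the Spring code in the rest of this article, the best reference is still the official Spring Data documentation.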

2. application

spring:
  elasticsearch:
    uris: http://localhost:9200
  jackson:
    default-property-inclusion: non_null

3. EO

The class that maps to the index needs the @Document annotation, and its fields need @Field.

OperationLog.java

import java.time.LocalDateTime;
import java.util.List;

import com.fasterxml.jackson.annotation.JsonFormat;
import lombok.Data;
import org.springframework.data.annotation.Id;
import org.springframework.data.elasticsearch.annotations.Document;
import org.springframework.data.elasticsearch.annotations.Field;
import org.springframework.data.elasticsearch.annotations.FieldType;

@Data
@Document(indexName = "operation_log")
public class OperationLog {

    @Id
    private String id;

    @Field(type = FieldType.Keyword)
    private String ip;

    @Field(value = "trace_id", type = FieldType.Keyword)
    private String traceId;

    // format = {} must not be omitted
    @Field(value = "operation_time", type = FieldType.Date, format = {}, pattern = "yyyy-MM-dd HH:mm:ss")
    @JsonFormat(pattern = "yyyy.MM.dd HH:mm:ss", timezone = "GMT+8")
    private LocalDateTime operationTime;

    @Field(type = FieldType.Keyword)
    private String module;

    @Field(value = "action_code", type = FieldType.Keyword)
    private String actionCode;

    @Field(type = FieldType.Text, analyzer = "ik_max_word")
    private String location;

    @Field(value = "object_id", type = FieldType.Keyword)
    private String objectId;

    @Field(value = "object_name", type = FieldType.Text, analyzer = "ik_max_word")
    private String objectName;

    @Field(value = "operator_id", type = FieldType.Keyword)
    private String operatorId;

    @Field(value = "operator_name", type = FieldType.Keyword)
    private String operatorName;

    @Field(value = "operator_dept_id", type = FieldType.Keyword)
    private String operatorDeptId;

    @Field(value = "operator_dept_name", type = FieldType.Text, analyzer = "ik_max_word")
    private String operatorDeptName;

    @Field(type = FieldType.Nested)
    private List<OperationLogChange> changes;
}

OperationLogChange.java

@Data
public class OperationLogChange {

    @Field(value = "field_name", type = FieldType.Keyword)
    private String fieldName;

    @Field(value = "old_value", type = FieldType.Keyword)
    private String oldValue;

    @Field(value = "new_value", type = FieldType.Keyword)
    private String newValue;
}
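The operation_time field carries two different patterns: the mapping and @Field use "yyyy-MM-dd HH:mm:ss", while @JsonFormat uses "yyyy.MM.dd HH:mm:ss" (with dots). The following JDK-only sketch shows what each pattern produces for the same timestamp. The class and method names are made up for illustration, and it only demonstrates the patterns, not how Spring Data or Jackson apply them internally.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of the article's project.
final class OperationTimeFormats {
    // Pattern declared in the index mapping and in @Field(pattern = ...).
    static final DateTimeFormatter STORED = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");
    // Pattern declared in @JsonFormat (note the dots in the date part).
    static final DateTimeFormatter JSON_OUT = DateTimeFormatter.ofPattern("yyyy.MM.dd HH:mm:ss");

    // Parses a value written the way the index stores it.
    static LocalDateTime parseStored(String value) {
        return LocalDateTime.parse(value, STORED);
    }

    // Formats a timestamp with the @JsonFormat pattern.
    static String toJson(LocalDateTime time) {
        return time.format(JSON_OUT);
    }

    public static void main(String[] args) {
        LocalDateTime t = parseStored("2022-05-02 09:31:18");
        System.out.println(STORED.format(t));
        System.out.println(toJson(t));
    }
}
```

If the two patterns are meant to match, one of them would need to be adjusted; the article's listing keeps them different.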

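2. Queries

Personally, I am not fond of implementing DAO-layer methods by extending ElasticsearchRepository, mainly because it is too restrictive and inflexible. The official documentation does not particularly recommend it either and leans towards calling ElasticsearchRestTemplate methods instead.

The query chapter of the official documentation introduces three approaches:

CriteriaQuery: the standard way to query. It is fine for simple queries but falls short on more complex ones.
NativeSearchQuery: the native way to query. Its syntax and logic closely mirror Query DSL, so complex queries are not a concern.
StringQuery: executes a Query DSL string directly.

I recommend NativeSearchQuery. If you ever face a query you really cannot express with it, you can occasionally fall back to StringQuery. All Spring examples below are therefore based on NativeSearchQuery.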
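Since StringQuery takes a Query DSL string, the string has to be assembled and escaped somewhere. Below is a minimal JDK-only sketch of building the body of a match clause. The QueryJson class is hypothetical, and the commented StringQuery usage assumes that the string passed in is the clause that would sit under "query", which is how I understand spring-data-elasticsearch 4.x to work.

```java
// Hypothetical helper for illustration; a real project would more likely use a JSON library.
final class QueryJson {

    // Escapes the characters that are not allowed raw inside a JSON string literal.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    // Builds {"match":{"<field>":"<value>"}}.
    static String match(String field, String value) {
        return "{\"match\":{\"" + escape(field) + "\":\"" + escape(value) + "\"}}";
    }

    public static void main(String[] args) {
        String json = match("module", "资源核心");
        System.out.println(json);
        // Assumed usage with Spring Data Elasticsearch (not compiled here):
        // Query query = new StringQuery(json);
    }
}
```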

2.1. match_all

1. DSL

GET /operation_log/_search
{
  "query": {
    "match_all": {}
  }
}

2. spring

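import java.util.List;
import java.util.stream.Collectors;

import lombok.AllArgsConstructor;
import org.elasticsearch.index.query.QueryBuilders;
import org.springframework.data.elasticsearch.core.ElasticsearchRestTemplate;
import org.springframework.data.elasticsearch.core.SearchHit;
import org.springframework.data.elasticsearch.core.query.NativeSearchQueryBuilder;
import org.springframework.data.elasticsearch.core.query.Query;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@AllArgsConstructor
@RestController
@RequestMapping("/dql")
public class DqlController {

    private final ElasticsearchRestTemplate esRestTemplate;

    @GetMapping("")
    public List<OperationLog> findAll() {
        Query query = new NativeSearchQueryBuilder()
                .withQuery(QueryBuilders.matchAllQuery())
                .build();
        // Unwrap each SearchHit into the mapped entity
        return esRestTemplate.search(query, OperationLog.class).stream()
                .map(SearchHit::getContent)
                .collect(Collectors.toList());
    }
}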

2.2. match(term)

1. DSL

GET /operation_log/_search
{
  "query": {
    "match": {
      "module": "资源核心"
    }
  }
}

2. spring

        Query query = new NativeSearchQueryBuilder()
                .withQuery(QueryBuilders.matchQuery("module", "资源核心"))
                .build();
        return esRestTemplate.search(query, OperationLog.class).stream()
                .map(SearchHit::getContent)
                .collect(Collectors.toList());

2.3. nested

1. DSL

GET operation_log/_search
{
  "query": {
    "nested": {
      "path": "changes",
      "query": {
        "term": {
          "changes.field_name": "姓名"
        }
      }
    }
  }
}

2. spring

        Query query = new NativeSearchQueryBuilder()
                .withQuery(QueryBuilders.nestedQuery("changes",
                        QueryBuilders.termQuery("changes.field_name", "姓名"),
                        ScoreMode.None))
                .build();
        return esRestTemplate.search(query, OperationLog.class).stream()
                .map(SearchHit::getContent)
                .collect(Collectors.toList());

2.4. bool(and) - 1

1. DSL

GET operation_log/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "action_code": "UPDATE"
          }
        },
        {
          "nested": {
            "path": "changes",
            "query": {
              "term": {
                "changes.field_name": "姓名"
              }
            }
          }
        }
      ]
    }
  }
}

2. spring

        Query query = new NativeSearchQueryBuilder()
                .withQuery(QueryBuilders.boolQuery()
                        .must(QueryBuilders.termQuery("action_code", "UPDATE"))
                        .must(QueryBuilders.nestedQuery("changes",
                                QueryBuilders.termQuery("changes.field_name", "姓名"), ScoreMode.None)))
                .build();
        return esRestTemplate.search(query, OperationLog.class).stream()
                .map(SearchHit::getContent)
                .collect(Collectors.toList());

2.5. bool(and) - 2

1. DSL

GET operation_log/_search
{
  "query": {
    "bool": {
      "must_not": [
        {
          "term": {
            "action_code": "UPDATE"
          }
        }
      ],
      "must": [
        {
          "nested": {
            "path": "changes",
            "query": {
              "term": {
                "changes.field_name": "姓名"
              }
            }
          }
        }
      ]
    }
  }
}

2. spring

        Query query = new NativeSearchQueryBuilder()
                .withQuery(QueryBuilders.boolQuery()
                        .mustNot(QueryBuilders.termQuery("action_code", "UPDATE"))
                        .must(QueryBuilders.nestedQuery("changes",
                                QueryBuilders.termQuery("changes.field_name", "姓名"), ScoreMode.None)))
                .build();
        return esRestTemplate.search(query, OperationLog.class).stream()
                .map(SearchHit::getContent)
                .collect(Collectors.toList());

2.6. bool(or), exists

1. DSL

GET /operation_log/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "bool": {
            "must": [
              {
                "term": {
                  "operator_dept_name.keyword": ""
                }
              }
            ]
          }
        },
        {
          "bool": {
            "must_not": [
              {
                "exists": {
                  "field": "operator_dept_name"
                }
              }
            ]
          }
        }
      ]
    }
  }
}

2. spring

        Query query = new NativeSearchQueryBuilder()
                .withQuery(QueryBuilders.boolQuery()
                        .should(QueryBuilders.boolQuery()
                                .must(QueryBuilders.termQuery("operator_dept_name.keyword", "")))
                        .should(QueryBuilders.boolQuery()
                                .mustNot(QueryBuilders.existsQuery("operator_dept_name"))))
                .build();
        return esRestTemplate.search(query, OperationLog.class).stream()
                .map(SearchHit::getContent)
                .collect(Collectors.toList());
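The two should clauses read as "operator_dept_name is the empty string OR operator_dept_name is absent". As a rough plain-Java sketch of that predicate (an in-memory imitation with hypothetical names, not how Elasticsearch evaluates the query), applied to trimmed-down versions of data-2, data-6 and data-7:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory imitation of the bool/should query above.
final class BlankDeptFilter {

    static boolean matches(Map<String, Object> doc) {
        Object dept = doc.get("operator_dept_name");
        boolean isEmptyString = "".equals(dept); // term on operator_dept_name.keyword = ""
        boolean isMissing = dept == null;        // must_not + exists
        return isEmptyString || isMissing;       // should = OR
    }

    // Returns the operator_name of every document accepted by the predicate.
    static List<String> matchingOperators(List<Map<String, Object>> docs) {
        List<String> names = new ArrayList<>();
        for (Map<String, Object> doc : docs) {
            if (matches(doc)) {
                names.add((String) doc.get("operator_name"));
            }
        }
        return names;
    }

    // Builds a reduced document; a null deptName means the field is left out.
    static Map<String, Object> doc(String operatorName, String deptName) {
        Map<String, Object> d = new LinkedHashMap<>();
        d.put("operator_name", operatorName);
        if (deptName != null) {
            d.put("operator_dept_name", deptName);
        }
        return d;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> docs = Arrays.asList(
                doc("李四", "人力资源部"), // data-2: has a department
                doc("路人A", ""),          // data-6: empty string
                doc("路人B", null));       // data-7: field absent
        System.out.println(matchingOperators(docs));
    }
}
```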

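2.7. _source, sort

If you only want some of the fields in the index, use _source, which has two properties:

includes: the fields to include in the result.
excludes: the fields to leave out of the result.

When both are present, excludes takes precedence over includes.
When only includes is needed, you can drop the includes key and write "_source":[field,...] directly.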
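The precedence rule can be pictured with a small in-memory filter. This is a simplified sketch with hypothetical names: it only handles exact top-level field names, whereas real source filtering also understands dotted paths and wildcards.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified imitation of _source includes/excludes.
final class SourceFilterSketch {

    static Map<String, Object> filter(Map<String, Object> source,
                                      Set<String> includes,
                                      Set<String> excludes) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : source.entrySet()) {
            // An empty includes set means "everything is included".
            boolean included = includes.isEmpty() || includes.contains(e.getKey());
            // excludes wins even when the field is also listed in includes.
            boolean excluded = excludes.contains(e.getKey());
            if (included && !excluded) {
                out.put(e.getKey(), e.getValue());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> source = new LinkedHashMap<>();
        source.put("module", "资源核心");
        source.put("location", "资源核心->文件治理");
        source.put("ip", "10.0.0.0");
        Set<String> includes = new HashSet<>(Arrays.asList("module", "location"));
        Set<String> excludes = new HashSet<>(Arrays.asList("module"));
        System.out.println(filter(source, includes, excludes).keySet());
    }
}
```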

sort is used for ordering the results.

1. DSL

GET /operation_log/_search
{
  "query": {
    "match": {
      "location": "文件"
    }
  },
  "_source": {
    "includes": [
      "module",
      "location",
      "operator_name",
      "operation_time",
      "changes.field_name"
    ],
    "excludes": [
      "module"
    ]
  },
  "sort": [
    {
      "operation_time": {
        "order": "asc"
      }
    }
  ]
}

// which is equivalent to
{
  "query": {
    "match": {
      "location": "文件"
    }
  },
  "_source": [
    "location",
    "operator_name",
    "operation_time",
    "changes.field_name"
  ],
  "sort": [
    {
      "operation_time": {
        "order": "asc"
      }
    }
  ]
}

2. spring

        SourceFilter sourceFilter = new FetchSourceFilterBuilder()
                .withIncludes("module", "location", "operator_name", "operation_time", "changes.field_name")
                .withExcludes("module")
                .build();
        Query query = new NativeSearchQueryBuilder()
                .withQuery(QueryBuilders.matchQuery("location", "文件"))
                .withSourceFilter(sourceFilter)
                .withSort(Sort.by(Sort.Direction.ASC, "operation_time"))
                .build();
        return esRestTemplate.search(query, OperationLog.class).stream()
                .map(SearchHit::getContent)
                .collect(Collectors.toList());

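2.8. highlight

The main things to know here are the options inside highlight.

pre_tags, post_tags: these two options define the tags that wrap each matched fragment in the result, much like wrapping text in an opening and closing HTML tag on the front end.

fields: the fields whose matches should be highlighted, for example location in the query below.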
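To picture what pre_tags and post_tags do to a hit, here is a deliberately naive sketch that wraps every literal occurrence of a term. It is only an illustration under the assumption that the analyzer emits the search term as its own token; the real highlighter works on analyzed tokens and fragments, so actual output depends on the analyzer. The class name is made up.

```java
// Hypothetical illustration of tag wrapping; not the Elasticsearch highlighter.
final class HighlightSketch {

    // Wraps each literal occurrence of term in preTag/postTag.
    static String wrap(String text, String term, String preTag, String postTag) {
        if (term.isEmpty()) {
            return text;
        }
        return text.replace(term, preTag + term + postTag);
    }

    public static void main(String[] args) {
        String highlighted = wrap("资源核心->文件治理", "文件",
                "<span style='color:red'>", "</span>");
        System.out.println(highlighted);
    }
}
```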

1. DSL

GET /operation_log/_search
{
  "query": {
    "match": {
      "location": "文件"
    }
  },
  "highlight": {
    "fields": {
      "location": {}
    },
    "pre_tags": "<span style='color:red'>",
    "post_tags": "</span>"
  }
}

2. spring

        String matchField = "location";
        HighlightBuilder highlightBuilder = new HighlightBuilder()
                .field(matchField)
                .preTags("<span style='color:red'>")
                .postTags("</span>");
        Query query = new NativeSearchQueryBuilder()
                .withQuery(QueryBuilders.matchQuery(matchField, "文件"))
                .withHighlightBuilder(highlightBuilder)
                .build();
        return esRestTemplate.search(query, OperationLog.class).stream()
                .map(hit -> {
                    OperationLog operationLog = hit.getContent();
                    // Replace the raw value with the first highlighted fragment
                    operationLog.setLocation(hit.getHighlightField(matchField).get(0));
                    return operationLog;
                })
                .collect(Collectors.toList());

2.9. pageable

1. DSL

GET /operation_log/_search
{
  "query": {
    "match_all": {}
  },
  "from": 0,
  "size": 5,
  "sort": [
    {
      "operation_time": {
        "order": "desc"
      }
    }
  ]
}

2. spring

        Query query = new NativeSearchQueryBuilder()
                .withQuery(QueryBuilders.matchAllQuery())
                .withPageable(PageRequest.of(0, 5, Sort.by(Sort.Direction.DESC, "operation_time")))
                .build();
        return esRestTemplate.search(query, OperationLog.class).stream()
                .map(SearchHit::getContent)
                .collect(Collectors.toList());
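PageRequest.of(page, size) uses a zero-based page number, while the DSL takes a from offset, so the two are related by from = page * size. A small sketch of that arithmetic follows; the class is hypothetical, and the 10,000 limit is the default value of index.max_result_window, which may have been changed on a given index.

```java
// Hypothetical helper showing how a zero-based page maps to from/size.
final class PageOffsets {
    // Default index.max_result_window in Elasticsearch.
    static final int DEFAULT_MAX_RESULT_WINDOW = 10_000;

    // Converts a zero-based page number into the DSL "from" offset.
    static int from(int page, int size) {
        if (page < 0 || size < 1) {
            throw new IllegalArgumentException("page must be >= 0 and size >= 1");
        }
        return Math.multiplyExact(page, size);
    }

    // True if from + size stays inside the default result window.
    static boolean withinDefaultWindow(int page, int size) {
        return (long) from(page, size) + size <= DEFAULT_MAX_RESULT_WINDOW;
    }

    public static void main(String[] args) {
        System.out.println(from(0, 5)); // first page, as in the example above
        System.out.println(from(2, 5)); // third page
        System.out.println(withinDefaultWindow(2000, 5));
    }
}
```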

3. Modifications

3.1. Single-document modifications

3.1.1. insert

The data preparation step already showed an example of creating a document.

DSL

POST /operation_log/_doc
{
  "ip": "0.0.0.0",
  "module": "测试数据"
}

spring

        OperationLog operationLog = new OperationLog();
        operationLog.setIp("0.0.0.0");
        operationLog.setModule("测试数据");
        return esRestTemplate.save(operationLog);

3.1.2. update-(save)

Spring uses the save method to create a document, and the same method can update one as well. You need the document id, though; here id=13OkA4QBMgWicIn2wBwM.

DSL

PUT /operation_log/_doc/13OkA4QBMgWicIn2wBwM
{
  "ip": "0.0.0.0",
  "module": "测试数据1"
}

spring

// operationLog must carry the id of the existing document, e.g. via operationLog.setId(...)
esRestTemplate.save(operationLog);

3.1.3. update-(document)

DSL

POST /operation_log/_update/13OkA4QBMgWicIn2wBwM
{
  "doc": {
    "module": "测试数据1"
  }
}

spring

        Document document = Document.create();
        document.put("module", "测试数据1");
        UpdateQuery updateQuery = UpdateQuery
                .builder(id)
                .withDocument(document)
                .build();
        esRestTemplate.update(updateQuery, IndexCoordinates.of("operation_log"));
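The difference between the two update styles is worth spelling out: PUT /operation_log/_doc/{id} replaces the whole stored document with the request body, whereas POST /operation_log/_update/{id} with "doc" merges the partial document into the stored one. A map-based sketch of the two behaviours (hypothetical names, top-level fields only):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical in-memory imitation of "index" versus "partial update".
final class ReplaceVsPartialUpdate {

    // PUT _doc/{id}: the stored document becomes exactly the request body.
    static Map<String, Object> index(Map<String, Object> stored, Map<String, Object> body) {
        return new LinkedHashMap<>(body);
    }

    // POST _update/{id} with "doc": listed fields are overwritten, the rest is kept.
    static Map<String, Object> partialUpdate(Map<String, Object> stored, Map<String, Object> doc) {
        Map<String, Object> merged = new LinkedHashMap<>(stored);
        merged.putAll(doc);
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Object> stored = new LinkedHashMap<>();
        stored.put("ip", "0.0.0.0");
        stored.put("module", "测试数据");

        Map<String, Object> change = new LinkedHashMap<>();
        change.put("module", "测试数据1");

        System.out.println(index(stored, change));         // ip is gone
        System.out.println(partialUpdate(stored, change)); // ip is kept
    }
}
```

This is why the PUT example in 3.1.2 sends ip again, while the _update example only sends module.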

3.1.4. update-(script)

DSL

POST /operation_log/_update/13OkA4QBMgWicIn2wBwM
{
  "script": {
    "source": "ctx._source.module = params.module",
    "params": {
      "module": "测试数据1"
    }
  }
}

spring

        Map<String, Object> params = new HashMap<>();
        params.put("module", "测试数据1");
        UpdateQuery updateQuery = UpdateQuery
                .builder(id)
                .withScript("ctx._source.module = params.module")
                .withParams(params)
                .build();
        esRestTemplate.update(updateQuery, IndexCoordinates.of("operation_log"));

3.1.5. delete

DSL

DELETE /operation_log/_doc/13OkA4QBMgWicIn2wBwM

spring

        esRestTemplate.delete(id, OperationLog.class);

3.2. Bulk modifications (bulk)

Bulk create DSL

POST /operation_log/_bulk
{"create":{"_index":"operation_log"}}
{"ip":"0.0.0.0","module":"测试数据1"}
{"create":{"_index":"operation_log"}}
{"ip":"0.0.0.0","module":"测试数据2"}
{"create":{"_index":"operation_log"}}
{"ip":"0.0.0.0","module":"测试数据3"}

Bulk update DSL

POST /operation_log/_bulk
{"update":{"_id":"2HP9A4QBMgWicIn26BzR"}}
{"doc":{"module":"测试数据11"}}
{"update":{"_id":"2XP9A4QBMgWicIn26BzR"}}
{"script":{"source":"ctx._source.module = params.module","params":{"module":"测试数据22"}}}

Bulk delete DSL

POST /operation_log/_bulk
{"delete":{"_id":"2HP9A4QBMgWicIn26BzR"}}
{"delete":{"_id":"2XP9A4QBMgWicIn26BzR"}}
{"delete":{"_id":"2nP9A4QBMgWicIn26BzR"}}

You may have noticed that the bulk update request mixes the doc and script styles of update. In fact, _bulk can take all three kinds of operations above (create, update, delete) in a single request.
In real projects, however, they are rarely mixed like this and are usually submitted separately. For bulk creation, for example, the save method already supports batches, although the underlying code still ends up calling bulkOperation.
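A _bulk body is newline-delimited JSON: one action line, followed by a source or doc line for create and update, and the body has to end with a newline. The sketch below assembles a mixed body as a plain string to make that layout concrete. The BulkBody class is hypothetical, and it assumes ids and index names contain no characters that need JSON escaping.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical builder for the NDJSON text of a _bulk request.
final class BulkBody {
    private final List<String> lines = new ArrayList<>();

    // Action line plus the full document source.
    BulkBody create(String index, String sourceJson) {
        lines.add("{\"create\":{\"_index\":\"" + index + "\"}}");
        lines.add(sourceJson);
        return this;
    }

    // Action line plus a partial document wrapped in "doc".
    BulkBody updateDoc(String id, String partialDocJson) {
        lines.add("{\"update\":{\"_id\":\"" + id + "\"}}");
        lines.add("{\"doc\":" + partialDocJson + "}");
        return this;
    }

    // Delete needs only the action line.
    BulkBody delete(String id) {
        lines.add("{\"delete\":{\"_id\":\"" + id + "\"}}");
        return this;
    }

    // Joins the lines and appends the mandatory trailing newline.
    String build() {
        return String.join("\n", lines) + "\n";
    }

    public static void main(String[] args) {
        String body = new BulkBody()
                .create("operation_log", "{\"ip\":\"0.0.0.0\",\"module\":\"测试数据1\"}")
                .updateDoc("2HP9A4QBMgWicIn26BzR", "{\"module\":\"测试数据11\"}")
                .delete("2nP9A4QBMgWicIn26BzR")
                .build();
        System.out.print(body);
    }
}
```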

spring bulkUpdate

    @PatchMapping("bulk-update")
    public void bulkUpdate() {
        Map<String, Object> params = new HashMap<>();
        params.put("module", "测试数据2");
        String scriptStr = "ctx._source.module = params.module";
        Query query = new NativeSearchQueryBuilder()
                .withQuery(QueryBuilders.termQuery("ip", "0.0.0.0"))
                .build();
        // Search first, then build one UpdateQuery per hit using its id
        List<UpdateQuery> updateQueryList = esRestTemplate.search(query, OperationLog.class)
                .stream()
                .map(SearchHit::getContent)
                .map(obj -> UpdateQuery.builder(obj.getId())
                        .withScript(scriptStr)
                        .withParams(params)
                        .build())
                .collect(Collectors.toList());
        esRestTemplate.bulkUpdate(updateQueryList, OperationLog.class);
    }

3.3. updateByQuery

DSL

POST /operation_log/_update_by_query
{
  "script": {
    "source": "ctx._source.module = params.module",
    "params": {
      "module": "测试数据1"
    }
  },
  "query": {
    "term": {
      "ip": "0.0.0.0"
    }
  }
}

spring

    @PatchMapping("update-by-query")
    public void updateByQuery() {
        Map<String, Object> params = new HashMap<>();
        params.put("module", "测试数据2");
        String scriptStr = "ctx._source.module = params.module";
        UpdateQuery updateQuery = UpdateQuery
                .builder(new NativeSearchQueryBuilder()
                        .withQuery(QueryBuilders.termQuery("ip", "0.0.0.0"))
                        .build())
                .withScript(scriptStr)
                .withScriptType(ScriptType.INLINE)
                .withLang("painless")
                .withParams(params)
                .build();
        esRestTemplate.updateByQuery(updateQuery, IndexCoordinates.of("operation_log"));
    }

Comparing this with the bulkUpdate method above reveals a few differences:

updateByQuery only supports Script updates, not Document updates.
When updateByQuery updates with a Script, the auxiliary parameters scriptType and Lang must be passed explicitly. bulkUpdate needs them too, but the underlying method fills them in; no such wrapping exists for updateByQuery. (I hit this pitfall in practice and only found out by reading the wrapper method.)

3.4. deleteByQuery

DSL

POST /operation_log/_delete_by_query
{
  "query": {
    "term": {
      "ip": "0.0.0.0"
    }
  }
}

spring

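        Query query = new NativeSearchQueryBuilder()
                .withQuery(QueryBuilders.termQuery("ip", "0.0.0.0"))
                .build();
        esRestTemplate.delete(query, OperationLog.class);

delete_by_query does not physically delete documents right away. It only changes the version and marks each matched document as deleted. Later searches still scan those documents and then filter out the ones carrying the delete marker. The disk space held by the index is therefore not released immediately after the call; the documents are physically removed, and the space reclaimed, only at the next segment merge. On the contrary, marking the matched documents as deleted itself takes up disk space, so right after calling this API you may see disk usage go up rather than down.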
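The mark-then-merge behaviour can be imitated with a toy "segment" that keeps its documents in a list and tracks deletions in a bitmap. This is only a conceptual sketch with made-up names; it is not how Lucene stores segments, but it shows why the stored count stays the same until a merge.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Hypothetical toy model of soft deletes followed by a segment merge.
final class SegmentSketch {
    private List<String> docs = new ArrayList<>();
    private BitSet deleted = new BitSet();

    void add(String doc) {
        docs.add(doc);
    }

    // Marks matching documents as deleted; nothing is physically removed.
    int deleteByQuery(String term) {
        int marked = 0;
        for (int i = 0; i < docs.size(); i++) {
            if (!deleted.get(i) && docs.get(i).contains(term)) {
                deleted.set(i);
                marked++;
            }
        }
        return marked;
    }

    // Scans every stored document and filters out the ones marked as deleted.
    List<String> search() {
        List<String> live = new ArrayList<>();
        for (int i = 0; i < docs.size(); i++) {
            if (!deleted.get(i)) {
                live.add(docs.get(i));
            }
        }
        return live;
    }

    // Number of documents still occupying space, deleted or not.
    int storedCount() {
        return docs.size();
    }

    // Rewrites the segment with live documents only, which is when space is reclaimed.
    void merge() {
        docs = search();
        deleted = new BitSet();
    }

    public static void main(String[] args) {
        SegmentSketch segment = new SegmentSketch();
        segment.add("ip=0.0.0.0 module=测试数据1");
        segment.add("ip=0.0.0.0 module=测试数据2");
        segment.add("ip=10.1.11.1 module=企业组织");
        segment.deleteByQuery("ip=0.0.0.0");
        System.out.println(segment.storedCount() + " stored, " + segment.search().size() + " visible");
        segment.merge();
        System.out.println(segment.storedCount() + " stored after merge");
    }
}
```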