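The previous chapter introduced some key ES concepts and basic CRUD operations. This chapter focuses on the different ways to query ES, since querying is the most important use case for ES in practice.
I. Query String Search (searching via the query string)
1. Search all products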
GET /shop_index/productInfo/_search
Response:
{
  "took": 8,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 3,
    "max_score": 1,
    "hits": [
      {
        "_index": "shop_index",
        "_type": "productInfo",
        "_id": "2",
        "_score": 1,
        "_source": {
          "test": "test"
        }
      },
      {
        "_index": "shop_index",
        "_type": "productInfo",
        "_id": "zyWpRGkB8mgaHjxk0Hfo",
        "_score": 1,
        "_source": {
          "name": "HuaWei P20",
          "desc": "Expen but easy to use",
          "price": 5300,
          "producer": "HuaWei Producer",
          "tags": [
            "Expen",
            "Fast"
          ]
        }
      },
      {
        "_index": "shop_index",
        "_type": "productInfo",
        "_id": "1",
        "_score": 1,
        "_source": {
          "name": "HuaWei Mate8",
          "desc": "Cheap and easy to use",
          "price": 2500,
          "producer": "HuaWei Producer",
          "tags": [
            "Cheap",
            "Fast"
          ]
        }
      }
    ]
  }
}
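Field reference:
took: how many milliseconds the search took
timed_out: whether the search timed out (here it did not)
_shards: the data is split across 5 shards; the search used all 5, all 5 returned data successfully, 0 failed and 0 were skipped
hits.total: the number of matching results, here 3 documents
max_score: the highest relevance score among the matching documents. A document's _score measures how relevant it is to the search: the more relevant the document, the higher the score
hits.hits: the full data of the documents that matched the search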
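As a small illustration of the fields described above, the following Python sketch reads them out of a response that has already been parsed into a dict. The function name `summarize_response` and the trimmed sample data are made up for this example and are not part of any ES client. The sketch assumes the response shape shown in this article, where `hits.total` is a plain number.

```python
def summarize_response(response):
    """Pull the commonly used fields out of a parsed _search response."""
    hits = response["hits"]
    return {
        "took_ms": response["took"],
        "timed_out": response["timed_out"],
        "failed_shards": response["_shards"]["failed"],
        "total": hits["total"],
        "max_score": hits["max_score"],
        "ids": [hit["_id"] for hit in hits["hits"]],
    }


# Trimmed copy of the response above (only a few _source fields are kept).
sample = {
    "took": 8,
    "timed_out": False,
    "_shards": {"total": 5, "successful": 5, "skipped": 0, "failed": 0},
    "hits": {
        "total": 3,
        "max_score": 1,
        "hits": [
            {"_id": "2", "_score": 1, "_source": {"test": "test"}},
            {"_id": "zyWpRGkB8mgaHjxk0Hfo", "_score": 1, "_source": {"name": "HuaWei P20"}},
            {"_id": "1", "_score": 1, "_source": {"name": "HuaWei Mate8"}},
        ],
    },
}

summary = summarize_response(sample)
print(summary["total"], summary["ids"])
```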
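2. Search for products whose name contains HuaWei, sorted by price in descending order. This approach is where the name "Query String Search" comes from: the search parameters are all carried in the query string of the HTTP request.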
GET /shop_index/productInfo/_search?q=name:HuaWei&sort=price:desc
Response:
{
  "took": 23,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": null,
    "hits": [
      {
        "_index": "shop_index",
        "_type": "productInfo",
        "_id": "zyWpRGkB8mgaHjxk0Hfo",
        "_score": null,
        "_source": {
          "name": "HuaWei P20",
          "desc": "Expen but easy to use",
          "price": 5300,
          "producer": "HuaWei Producer",
          "tags": [
            "Expen",
            "Fast"
          ]
        },
        "sort": [
          5300
        ]
      },
      {
        "_index": "shop_index",
        "_type": "productInfo",
        "_id": "1",
        "_score": null,
        "_source": {
          "name": "HuaWei Mate8",
          "desc": "Cheap and easy to use",
          "price": 2500,
          "producer": "HuaWei Producer",
          "tags": [
            "Cheap",
            "Fast"
          ]
        },
        "sort": [
          2500
        ]
      }
    ]
  }
}
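When this kind of request is issued from code rather than typed by hand, the query string has to be assembled and escaped. The sketch below shows one way to do that with the Python standard library. The helper name `query_string_search_path` is hypothetical, and only the `q` and `sort` parameters used in this article are handled.

```python
from urllib.parse import urlencode


def query_string_search_path(index, doc_type, field, value, sort_field=None, order="asc"):
    """Build the path and query string for a query string search."""
    params = {"q": f"{field}:{value}"}
    if sort_field is not None:
        params["sort"] = f"{sort_field}:{order}"
    # Keep ':' readable; other special characters are escaped by urlencode.
    return f"/{index}/{doc_type}/_search?" + urlencode(params, safe=":")


path = query_string_search_path(
    "shop_index", "productInfo", "name", "HuaWei", sort_field="price", order="desc"
)
print(path)
```

Escaping matters as soon as the search value contains spaces or characters such as `&`, which is one reason longer queries are usually easier to express as a request body.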
II. Query DSL (DSL: Domain Specific Language)
This approach passes the search conditions as a JSON HTTP request body. It can express many kinds of complex queries and is more powerful than the query string approach.
1. Search all products
GET /shop_index/productInfo/_search
{
  "query": {
    "match_all": {}
  }
}
Response omitted …
2. Search for products whose name contains HuaWei, sorted by price in descending order
GET /shop_index/productInfo/_search
{
  "query": {
    "match": {
      "name": "HuaWei"
    }
  },
  "sort": [
    {
      "price": {
        "order": "desc"
      }
    }
  ]
}
Response omitted …
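Outside of a console, the same Query DSL body is sent as the body of an ordinary HTTP request. The sketch below builds such a request with the Python standard library without sending it. The helper name `build_search_request` and the address `http://localhost:9200` are assumptions for illustration and do not come from this article. POST is used because `_search` accepts a request body on POST as well as GET, and many HTTP clients do not send a body with GET.

```python
import json
from urllib.request import Request


def build_search_request(base_url, index, doc_type, body):
    """Wrap a Query DSL body in an HTTP request without sending it."""
    return Request(
        url=f"{base_url}/{index}/{doc_type}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


body = {
    "query": {"match": {"name": "HuaWei"}},
    "sort": [{"price": {"order": "desc"}}],
}
# Assumed local address; replace it with the address of your own cluster.
request = build_search_request("http://localhost:9200", "shop_index", "productInfo", body)
print(request.get_method(), request.full_url)
```

To actually run the search, the request object can be passed to `urllib.request.urlopen` and the response parsed with `json.load`.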
3. Paged query: fetch the second page, with 1 record per page
GET /shop_index/productInfo/_search
{
  "query": {
    "match_all": {}
  },
  "from": 1,
  "size": 1
}
Response:
{
  "took": 6,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 3,
    "max_score": 1,
    "hits": [
      {
        "_index": "shop_index",
        "_type": "productInfo",
        "_id": "zyWpRGkB8mgaHjxk0Hfo",
        "_score": 1,
        "_source": {
          "name": "HuaWei P20",
          "desc": "Expen but easy to use",
          "price": 5300,
          "producer": "HuaWei Producer",
          "tags": [
            "Expen",
            "Fast"
          ]
        }
      }
    ]
  }
}
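Notes: (1) In a real project, when a conditional query also needs pagination, there is no need to query the total count separately. ES returns the total number of matching documents (hits.total), which can be used directly. (2) from is zero-based: the first page starts at from = 0, so page n starts at from = (n - 1) * size. That is why the second page with 1 record per page uses from = 1.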
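The two notes above can be sketched as a pair of small helpers: one converts a 1-based page number into `from`/`size`, the other derives the page count from `hits.total`. The function names are hypothetical and chosen only for this example.

```python
import math


def page_params(page, size):
    """Translate a 1-based page number into ES from/size values."""
    if page < 1 or size < 1:
        raise ValueError("page and size must both be at least 1")
    return {"from": (page - 1) * size, "size": size}


def total_pages(total_hits, size):
    """Number of pages needed to show total_hits results."""
    return math.ceil(total_hits / size)


# Second page, one record per page, as in the request above.
body = {"query": {"match_all": {}}, **page_params(page=2, size=1)}
print(body)
print(total_pages(total_hits=3, size=1))
```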
4. Return only specific fields, for example name, desc and price, and leave the other fields out
GET /shop_index/productInfo/_search
{
  "query": {
    "match": {
      "name": "HuaWei"
    }
  },
  "_source": ["name", "desc", "price"]
}
Response:
{
  "took": 27,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 0.2876821,
    "hits": [
      {
        "_index": "shop_index",
        "_type": "productInfo",
        "_id": "zyWpRGkB8mgaHjxk0Hfo",
        "_score": 0.2876821,
        "_source": {
          "price": 5300,
          "name": "HuaWei P20",
          "desc": "Expen but easy to use"
        }
      },
      {
        "_index": "shop_index",
        "_type": "productInfo",
        "_id": "1",
        "_score": 0.2876821,
        "_source": {
          "price": 2500,
          "name": "HuaWei Mate8",
          "desc": "Cheap and easy to use"
        }
      }
    ]
  }
}
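Because `_source` is just another top-level key of the request body, it can be added to any existing query. A minimal sketch, with a hypothetical helper name `with_source_fields`:

```python
def with_source_fields(body, fields):
    """Return a copy of a search body that asks for only the given fields."""
    limited = dict(body)  # shallow copy so the caller's body is left untouched
    limited["_source"] = list(fields)
    return limited


base = {"query": {"match": {"name": "HuaWei"}}}
limited = with_source_fields(base, ["name", "desc", "price"])
print(limited)
```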
III. Query Filter (filtering the query results)
For example, search for products whose name contains HuaWei and whose price is greater than 4000:
GET /shop_index/productInfo/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "name": "HuaWei"
          }
        }
      ],
      "filter": {
        "range": {
          "price": {
            "gt": 4000
          }
        }
      }
    }
  }
}
Response:
{
  "took": 195,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.2876821,
    "hits": [
      {
        "_index": "shop_index",
        "_type": "productInfo",
        "_id": "zyWpRGkB8mgaHjxk0Hfo",
        "_score": 0.2876821,
        "_source": {
          "name": "HuaWei P20",
          "desc": "Expen but easy to use",
          "price": 5300,
          "producer": "HuaWei Producer",
          "tags": [
            "Expen",
            "Fast"
          ]
        }
      }
    ]
  }
}
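The request above combines a scored `match` clause with a `range` clause in `filter`. A filter clause only narrows the result set and does not contribute to the score, which is consistent with the `_score` of 0.2876821 here being the same value the plain `match` on name produced earlier. The sketch below builds this kind of body. The helper name `filtered_match` is hypothetical, and the set of bounds it accepts is limited to the four common range operators.

```python
ALLOWED_BOUNDS = {"gt", "gte", "lt", "lte"}


def filtered_match(field, text, range_field, **bounds):
    """Build a bool query: a scored match plus a non-scoring range filter."""
    unknown = set(bounds) - ALLOWED_BOUNDS
    if unknown:
        raise ValueError(f"unsupported range bounds: {sorted(unknown)}")
    return {
        "query": {
            "bool": {
                "must": [{"match": {field: text}}],
                "filter": {"range": {range_field: dict(bounds)}},
            }
        }
    }


body = filtered_match("name", "HuaWei", "price", gt=4000)
print(body)
```

A price band can be expressed the same way, for example `filtered_match("name", "HuaWei", "price", gte=2000, lt=4000)`.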
IV. Full-Text Search
Search for products whose producer field contains "HuaWei MateProducer":
GET /shop_index/productInfo/_search
{
  "query": {
    "match": {
      "producer": "HuaWei MateProducer"
    }
  }
}
Response:
{
  "took": 8,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 4,
    "max_score": 0.5753642,
    "hits": [
      {
        "_index": "shop_index",
        "_type": "productInfo",
        "_id": "SiUBRWkB8mgaHjxkJHyS",
        "_score": 0.5753642,
        "_source": {
          "name": "HuaWei Mate10",
          "desc": "Cheap and Beauti",
          "price": 2300,
          "producer": "HuaWei MateProducer",
          "tags": [
            "Cheap",
            "Beauti"
          ]
        }
      },
      {
        "_index": "shop_index",
        "_type": "productInfo",
        "_id": "1",
        "_score": 0.2876821,
        "_source": {
          "name": "HuaWei Mate8",
          "desc": "Cheap and easy to use",
          "price": 2500,
          "producer": "HuaWei Producer",
          "tags": [
            "Cheap",
            "Fast"
          ]
        }
      },
      {
        "_index": "shop_index",
        "_type": "productInfo",
        "_id": "zyWpRGkB8mgaHjxk0Hfo",
        "_score": 0.18232156,
        "_source": {
          "name": "HuaWei P20",
          "desc": "Expen but easy to use",
          "price": 5300,
          "producer": "HuaWei Producer",
          "tags": [
            "Expen",
            "Fast"
          ]
        }
      },
      {
        "_index": "shop_index",
        "_type": "productInfo",
        "_id": "CSX8RGkB8mgaHjxkV3w1",
        "_score": 0.18232156,
        "_source": {
          "name": "HuaWei nova 4e",
          "desc": "cheap and look nice",
          "price": 1999,
          "producer": "HuaWei Producer",
          "tags": [
            "Cheap",
            "Nice"
          ]
        }
      }
    ]
  }
}
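From the results above, the record with id "SiUBRWkB8mgaHjxkJHyS" has the highest score, which means it is the best match. The reason is that the search text for producer is split into two terms, and each term is looked up separately:
(1) HuaWei: matches the records with ID 'SiUBRWkB8mgaHjxkJHyS', '1', 'CSX8RGkB8mgaHjxkV3w1' and 'zyWpRGkB8mgaHjxk0Hfo'
(2) MateProducer: matches the record with ID 'SiUBRWkB8mgaHjxkJHyS'
Because both terms of "HuaWei MateProducer" match the record with ID 'SiUBRWkB8mgaHjxkJHyS', that record gets the highest score.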
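The term-by-term matching described above can be imitated locally with a few lines of Python. This is a deliberately rough sketch: `tokenize` stands in for the real analyzer by lowercasing and splitting on whitespace, and counting matched terms is not how ES computes `_score`. It only shows why one record matches two terms while the others match one.

```python
def tokenize(text):
    """Very rough stand-in for an analyzer: lowercase, split on whitespace."""
    return text.lower().split()


def matched_terms(query, field_value):
    """Return the query terms that also occur in the field value."""
    field_terms = set(tokenize(field_value))
    return [term for term in tokenize(query) if term in field_terms]


# producer values of the four records in the response above
producers = {
    "SiUBRWkB8mgaHjxkJHyS": "HuaWei MateProducer",
    "1": "HuaWei Producer",
    "zyWpRGkB8mgaHjxk0Hfo": "HuaWei Producer",
    "CSX8RGkB8mgaHjxkV3w1": "HuaWei Producer",
}

for doc_id, producer in producers.items():
    print(doc_id, matched_terms("HuaWei MateProducer", producer))
```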
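V. Phrase Search
The difference between phrase search and full-text search:
(1) Full-text search: the search text is split into terms, and each term is looked up in the inverted index. A record counts as a match as soon as any one of the terms matches.
(2) Phrase search: the search text must appear in the specified field exactly as entered, with the same terms in the same order. Only then does the record count as a match and get returned.
For example, search for products whose producer contains the phrase "HuaWei MateProducer":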
GET /shop_index/productInfo/_search
{
  "query": {
    "match_phrase": {
      "producer": "HuaWei MateProducer"
    }
  }
}
Response:
{
  "took": 158,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.5753642,
    "hits": [
      {
        "_index": "shop_index",
        "_type": "productInfo",
        "_id": "SiUBRWkB8mgaHjxkJHyS",
        "_score": 0.5753642,
        "_source": {
          "name": "HuaWei Mate10",
          "desc": "Cheap and Beauti",
          "price": 2300,
          "producer": "HuaWei MateProducer",
          "tags": [
            "Cheap",
            "Beauti"
          ]
        }
      }
    ]
  }
}
As you can see, only the record that contains "HuaWei MateProducer" is returned.
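The difference from full-text matching can be made concrete with a local toy check: a phrase matches only if all of its terms appear next to each other and in the same order. As before, this is a sketch under simplifying assumptions (lowercasing and whitespace splitting instead of a real analyzer), not the way ES evaluates `match_phrase` internally.

```python
def tokenize(text):
    """Very rough stand-in for an analyzer: lowercase, split on whitespace."""
    return text.lower().split()


def contains_phrase(phrase, field_value):
    """True if the phrase terms occur consecutively and in order."""
    wanted = tokenize(phrase)
    terms = tokenize(field_value)
    if not wanted:
        return False
    return any(
        terms[start:start + len(wanted)] == wanted
        for start in range(len(terms) - len(wanted) + 1)
    )


for producer in ["HuaWei MateProducer", "HuaWei Producer"]:
    print(producer, contains_phrase("HuaWei MateProducer", producer))
```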
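VI. Highlight Search
Highlight search means that the words that need emphasis are marked up in the search results so they can be displayed in a distinct style. For example, search for products whose name contains "Xiao'Mi" and highlight the search keyword: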
GET /shop_index/productInfo/_search
{
  "query": {
    "match": {
      "name": "Xiao'Mi"
    }
  },
  "highlight": {
    "fields": {
      "name": {}
    }
  }
}
Response:
{
  "took": 348,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.2876821,
    "hits": [
      {
        "_index": "shop_index",
        "_type": "productInfo",
        "_id": "HiX9RGkB8mgaHjxk4nxC",
        "_score": 0.2876821,
        "_source": {
          "name": "Xiao'Mi 9",
          "desc": "Expen but nice and Beauti",
          "price": 3500,
          "producer": "XiaoMi Producer",
          "tags": [
            "Expen",
            "Beauti"
          ]
        },
        "highlight": {
          "name": [
            "<em>Xiao'Mi</em> 9"
          ]
        }
      }
    ]
  }
}
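As you can see, "Xiao'Mi" is returned wrapped in <em> tags, so it can be shown in italics directly in HTML. To use a custom highlight style, set pre_tags and post_tags. For example, to display the keyword in red: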
GET /shop_index/productInfo/_search
{
  "query": {
    "match": {
      "name": "Xiao'Mi"
    }
  },
  "highlight": {
    "fields": {
      "name": {}
    },
    "pre_tags": [
      "<em style='color:red;'>"
    ],
    "post_tags": [
      "</em>"
    ]
  }
}
Response:
{
  "took": 10,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.2876821,
    "hits": [
      {
        "_index": "shop_index",
        "_type": "productInfo",
        "_id": "HiX9RGkB8mgaHjxk4nxC",
        "_score": 0.2876821,
        "_source": {
          "name": "Xiao'Mi 9",
          "desc": "Expen but nice and Beauti",
          "price": 3500,
          "producer": "XiaoMi Producer",
          "tags": [
            "Expen",
            "Beauti"
          ]
        },
        "highlight": {
          "name": [
            "<em style='color:red;'>Xiao'Mi</em> 9"
          ]
        }
      }
    ]
  }
}
The search keyword in the response is now wrapped in a tag that carries the red CSS style.
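To make the role of the opening and closing tags concrete, here is a local toy version of what the highlighted fragment contains: every word of the field that matches a search term is wrapped in the given tags. The function `highlight` is written for this example only. It splits on whitespace and is not the ES highlighter, but for the data in this article it produces the same fragments as the two responses above.

```python
def highlight(query, field_value, pre_tag="<em>", post_tag="</em>"):
    """Wrap each word of field_value that matches a query term."""
    query_terms = set(query.lower().split())
    return " ".join(
        f"{pre_tag}{word}{post_tag}" if word.lower() in query_terms else word
        for word in field_value.split()
    )


default_fragment = highlight("Xiao'Mi", "Xiao'Mi 9")
red_fragment = highlight("Xiao'Mi", "Xiao'Mi 9", pre_tag="<em style='color:red;'>")
print(default_fragment)
print(red_fragment)
```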
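That concludes this introduction to the different ways of searching in ES. Each search method offers further APIs that are not all covered here; consult the official documentation if you need them. Comments and shares are welcome!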