Elasticsearch Study Notes

Reference: Ruan Yifeng (阮一峰), "全文搜索引擎 Elasticsearch 入门教程" (an introductory tutorial on the full-text search engine Elasticsearch)

1. Installing and initializing the environment

a. Using the single-node development version

# Persist data to a local directory
docker run -d --name es --restart=always \
    --net mybridge --ip=172.1.111.12 \
    -v /home/tools/elasticsearch/single/data/:/usr/share/elasticsearch/data/ \
    -e "discovery.type=single-node" \
    elasticsearch:7.7.0

b. Requests to the database

Startup checks
### System info
curl -X GET http://172.1.111.12:9200
### List of indices
curl -X GET 'http://172.1.111.12:9200/_cat/indices?v'
### Mappings of all indices
curl -X GET 'http://172.1.111.12:9200/_mapping?pretty=true'
Install the Chinese tokenizer plugin (IK) and initialize the database
docker exec -it es bash
./bin/elasticsearch-plugin install 'https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v7.7.0/elasticsearch-analysis-ik-7.7.0.zip'
exit
docker restart es

### Create the index accounts (creation uses PUT; a GET only reads an existing index)
curl -X PUT 'http://172.1.111.12:9200/accounts'
### Delete the index
curl -X DELETE 'http://172.1.111.12:9200/accounts'

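c. Changing the index analyzer

# Analyzer settings. The analyzer can also be specified per field when the mapping is created, in which case this step is unnecessary.
# This request creates the index, so it only works if accounts does not exist yet.
# https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-custom-analyzer.html
# https://github.com/medcl/elasticsearch-analysis-ik
curl -X PUT 'http://172.1.111.12:9200/accounts' -d '{"settings": {"index": {"analysis.analyzer.default.type":"ik_max_word"}
    }
}' -H 'Content-Type:application/json'
# Test tokenization of a Chinese sentence. The analyzer field may be omitted; the options tried here are [ik_max_word|standard].
# The sample text says: the default ES analyzer is standard, which is awkward for Chinese because it splits text into single characters.
curl -X POST 'http://172.1.111.12:9200/accounts/_analyze' -d '{"analyzer":"standard","text":"ES 的默认分词设置是 standard,这个在中文分词时就比较尴尬了,会单字拆分"}' -H 'Content-Type:application/json'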

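2. CRUD operations

Creating the "database" and "table": defining the objects under an index

Note that POST and PUT requests with a body need the -H 'Content-Type:application/json' option. Without it, Elasticsearch rejects the request with: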

Content-Type header [application/x-www-form-urlencoded] is not supported
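
The same rule applies to any HTTP client, not only curl. As a minimal sketch using just the Python standard library, the request below declares its body as JSON; the helper name json_request is made up for illustration, and the request is only constructed here, not sent:

```python
import json
import urllib.request


def json_request(url, body, method="PUT"):
    """Build (but do not send) a request whose body is declared as JSON."""
    data = json.dumps(body, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )


req = json_request(
    "http://172.1.111.12:9200/accounts/person/1",
    {"user": "张三", "title": "工程师", "desc": "数据库管理"},
)
# urllib stores header names capitalized, hence "Content-type" in the lookup.
print(req.get_method(), req.get_header("Content-type"))
# Sending it would be: urllib.request.urlopen(req)
```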

# Create an index named accounts containing a type named person. person has three fields: user, title, and desc.
# In Elasticsearch 7.x a type name inside mappings is only accepted together with include_type_name=true.
curl -X PUT 'http://172.1.111.12:9200/accounts?include_type_name=true' -d '{"mappings": {"person": {"properties": {"user": {"type":"text","analyzer":"ik_max_word","search_analyzer":"ik_max_word"},"title": {"type":"text","analyzer":"ik_max_word","search_analyzer":"ik_max_word"},"desc": {"type":"text","analyzer":"ik_max_word","search_analyzer":"ik_max_word"}
      }
    }
  }
}' -H 'Content-Type:application/json'

a. Adding data

curl -X PUT 'http://172.1.111.12:9200/accounts/person/1' -d '{"user":"张三","title":"工程师","desc":"数据库管理"}' -H 'Content-Type:application/json'
# Response
{
    "_index": "accounts",
    "_type": "person",
    "_id": "1",
    "_version": 1,
    "result": "created",
    "_shards": {
        "total": 2,
        "successful": 1,
        "failed": 0
    },
    "_seq_no": 0,
    "_primary_term": 2
}
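# Add another record
curl -X PUT 'http://172.1.111.12:9200/accounts/person/2' -d '{"user":"张三 2","title":"工程师","desc":"数据库管理"}' -H 'Content-Type:application/json'

b. Viewing data

# Note: a POST to the type without an ID does not query anything. It indexes a new (here empty) document under an auto-generated _id, as the "created" result below shows. The request fails without -d. To count documents, use GET /accounts/_count instead.
curl -X POST 'http://172.1.111.12:9200/accounts/person' -d '{}' -H 'Content-Type:application/json'
# Response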
{
    "_index": "accounts",
    "_type": "person",
    "_id": "5oypSXIBsIkVNyXrn1o5",
    "_version": 1,
    "result": "created",
    "_shards": {
        "total": 2,
        "successful": 1,
        "failed": 0
    },
    "_seq_no": 3,
    "_primary_term": 3
}

# Fetch a single record
curl -X GET 'http://172.1.111.12:9200/accounts/person/1'
# Response
{"_index":"accounts","_type":"person","_id":"1","_version":1,"_seq_no":0,"_primary_term":2,"found":true,"_source":
{
  "user": "张三",
  "title": "工程师",
  "desc": "数据库管理"
}}

c. Deleting data

curl -X DELETE 'http://172.1.111.12:9200/accounts/person/2'
# Response
{
    "_index": "accounts",
    "_type": "person",
    "_id": "2",
    "_version": 2,
    "result": "deleted",
    "_shards": {
        "total": 2,
        "successful": 1,
        "failed": 0
    }
}

d. Updating data

curl -X PUT 'http://172.1.111.12:9200/accounts/person/2' -d '{"user":"张三 2","title":"工程师","desc":"数据库管理"}' -H 'Content-Type:application/json'
# Response
{"_index":"accounts","_type":"person","_id":"2","_version":1,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":6,"_primary_term":3}
curl -X PUT 'http://172.1.111.12:9200/accounts/person/2' -d '{"user":"张三 233","title":"工程师","desc":"数据库管理"}' -H 'Content-Type:application/json'
# Response
{"_index":"accounts","_type":"person","_id":"2","_version":2,"result":"updated","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":7,"_primary_term":3}

Comparing the two responses:

_version: incremented
_seq_no: incremented
result: created -> updated
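
To make the comparison mechanical, the two responses can be diffed field by field. This is only an illustrative sketch: the JSON strings are trimmed copies of the two responses from the update requests, and changed_fields is a hypothetical helper, not part of any Elasticsearch client:

```python
import json

# Trimmed copies of the two PUT responses for document 2.
first = json.loads(
    '{"_id":"2","_version":1,"result":"created","_seq_no":6,"_primary_term":3}'
)
second = json.loads(
    '{"_id":"2","_version":2,"result":"updated","_seq_no":7,"_primary_term":3}'
)


def changed_fields(before, after):
    """Return {field: (old, new)} for every field whose value differs."""
    return {k: (before[k], after[k]) for k in before if before[k] != after.get(k)}


diff = changed_fields(first, second)
for field, (old, new) in diff.items():
    print(f"{field}: {old} -> {new}")
```

Only _version, result, and _seq_no show up in the diff; _id and _primary_term are unchanged between the two calls.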

3. Complex queries

a. Listing all documents

curl -X GET http://172.1.111.12:9200/accounts/person/_search
# Response
{
    "took": 1,
    "timed_out": false,
    "_shards": {
        "total": 1,
        "successful": 1,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 5,
            "relation": "eq"
        },
        "max_score": 1.0,
        "hits": [{
            "_index": "accounts",
            "_type": "person",
            "_id": "1",
            "_score": 1.0,
            "_source": {
                "user": "张三",
                "title": "工程师",
                "desc": "数据库管理"
            }
        }, {
            "_index": "accounts",
            "_type": "person",
            "_id": "sYynSXIBsIkVNyXrfVpX",
            "_score": 1.0,
            "_source": {
                "analyzer": "ik_max_word",
                "text": "ES 的默认分词设置是 standard,这个在中文分词时就比较尴尬了,会单字拆分"
            }
        }, {
            "_index": "accounts",
            "_type": "person",
            "_id": "5oypSXIBsIkVNyXrn1o5",
            "_score": 1.0,
            "_source": {}}, {
            "_index": "accounts",
            "_type": "person",
            "_id": "9IyqSXIBsIkVNyXrLVrA",
            "_score": 1.0,
            "_source": {}}, {
            "_index": "accounts",
            "_type": "person",
            "_id": "2",
            "_score": 1.0,
            "_source": {
                "user": "张三 233",
                "title": "工程师",
                "desc": "数据库管理"
            }
        }]
    }
}

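Earlier mistakes made in ApiPost left some junk data in the index.

total: the number of records returned, 5 in this example. max_score: the highest relevance score, 1.0 in this example. hits: an array of the returned records.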
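
As a rough sketch of how client code might read these three fields, the function below extracts them from a parsed search response. The sample dictionary is a shortened, hand-written stand-in for a real response (two hits only), and summarize is a hypothetical helper name:

```python
def summarize(response):
    """Pull total, max_score, and the hit IDs out of a search response dict."""
    hits = response["hits"]
    return {
        # In Elasticsearch 7.x, total is an object with "value" and "relation".
        "total": hits["total"]["value"],
        "max_score": hits["max_score"],
        "ids": [hit["_id"] for hit in hits["hits"]],
    }


# Shortened stand-in for a real response.
sample = {
    "hits": {
        "total": {"value": 2, "relation": "eq"},
        "max_score": 1.0,
        "hits": [
            {"_id": "1", "_score": 1.0, "_source": {"user": "张三"}},
            {"_id": "2", "_score": 1.0, "_source": {"user": "张三 233"}},
        ],
    }
}

summary = summarize(sample)
print(summary)
```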

b. Match query

https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-script-query.html

Search by title
curl -X GET 'http://172.1.111.12:9200/accounts/person/_search' -d '{"query": {"bool": {"must": [{"match": {"title":"师"}
            }],
            "filter": []}
    }
}' -H 'Content-Type:application/json'
# Response
{
    "took": 2,
    "timed_out": false,
    "_shards": {
        "total": 1,
        "successful": 1,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 2,
            "relation": "eq"
        },
        "max_score": 0.10536051,
        "hits": [{
            "_index": "accounts",
            "_type": "person",
            "_id": "1",
            "_score": 0.10536051,
            "_source": {
                "user": "张三",
                "title": "工程师",
                "desc": "数据库管理"
            }
        }, {
            "_index": "accounts",
            "_type": "person",
            "_id": "2",
            "_score": 0.10536051,
            "_source": {
                "user": "张三 233",
                "title": "工程师",
                "desc": "数据库管理"
            }
        }]
    }
}
Using a longer search term
curl -X GET 'http://172.1.111.12:9200/accounts/person/_search' -d '{"query": {"bool": {"must": [{"match": {"title":"工程师"}
            }],
            "filter": []}
    }
}' -H 'Content-Type:application/json'
# Response
{
    "took": 5,
    "timed_out": false,
    "_shards": {
        "total": 1,
        "successful": 1,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 2,
            "relation": "eq"
        },
        "max_score": 0.31608152,
        "hits": [{
            "_index": "accounts",
            "_type": "person",
            "_id": "1",
            "_score": 0.31608152,
            "_source": {
                "user": "张三",
                "title": "工程师",
                "desc": "数据库管理"
            }
        }, {
            "_index": "accounts",
            "_type": "person",
            "_id": "2",
            "_score": 0.31608152,
            "_source": {
                "user": "张三 233",
                "title": "工程师",
                "desc": "数据库管理"
            }
        }]
    }
}

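A longer search term that matches more tokens raises hits.hits._score (0.10536051 for "师" versus 0.31608152 for "工程师" above).

# Add size to limit the number of results returned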
{"query": {"bool":{"must":[{"match":{"title":"工程师"}}],"filter":[]}},
    "size": 1
}
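
When such bodies are generated by a program rather than typed by hand, building them as dictionaries avoids quoting mistakes. A minimal sketch, assuming a hypothetical helper match_query that mirrors the structure used above:

```python
import json


def match_query(field, text, size=None):
    """Build a bool/must match query body, optionally capped with size."""
    body = {"query": {"bool": {"must": [{"match": {field: text}}], "filter": []}}}
    if size is not None:
        body["size"] = size
    return body


limited = match_query("title", "工程师", size=1)
# ensure_ascii=False keeps the Chinese text readable in the output.
print(json.dumps(limited, ensure_ascii=False))
```

The printed string can be passed to curl with -d, or sent as the body by any HTTP client.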

c. Multi-condition queries

Test data

curl -X PUT 'http://172.1.111.12:9200/accounts/person/3' -d '{"user":"李明","title":"工程师 学习","desc":"运维 架构师"}' -H 'Content-Type:application/json'

curl -X PUT 'http://172.1.111.12:9200/accounts/person/4' -d '{"user":"夏天","title":"工程师 教育","desc":"软件 系统"}' -H 'Content-Type:application/json'

curl -X PUT 'http://172.1.111.12:9200/accounts/person/5' -d '{"user":"呃呃","title":"呵呵 123 学习","desc":"系统 运维 学习"}' -H 'Content-Type:application/json'

curl -X PUT 'http://172.1.111.12:9200/accounts/person/6' -d '{"user":"明白","title":"问问 123 学习","desc":"系统 软件 看看"}' -H 'Content-Type:application/json'

A plain match query with several terms is an OR search; an AND search must use a bool query.

### OR (union) query
curl 'http://172.1.111.12:9200/accounts/person/_search' -d '{"query": {"match": {"desc":"软件 系统"}}
}' -H 'Content-Type: application/json'
# Response
{
    "took": 1,
    "timed_out": false,
    "_shards": {
        "total": 1,
        "successful": 1,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 3,
            "relation": "eq"
        },
        "max_score": 5.4819746,
        "hits": [{
            "_index": "accounts",
            "_type": "person",
            "_id": "4",
            "_score": 5.4819746,
            "_source": {
                "user": "夏天",
                "title": "工程师 教育",
                "desc": "软件 系统"
            }
        }, {
            "_index": "accounts",
            "_type": "person",
            "_id": "6",
            "_score": 4.7037764,
            "_source": {
                "user": "明白",
                "title": "问问 123 学习",
                "desc": "系统 软件 看看"
            }
        }, {
            "_index": "accounts",
            "_type": "person",
            "_id": "5",
            "_score": 1.7504241,
            "_source": {
                "user": "呃呃",
                "title": "呵呵 123 学习",
                "desc": "系统 运维 学习"
            }
        }]
    }
}

### AND (intersection) with a bool query
curl 'http://172.1.111.12:9200/accounts/person/_search' -d '{"query": {"bool": {"must": [{ "match": { "desc": "软件"} },
        {"match": { "desc": "系统"} }
      ]
    }
  }
}' -H 'Content-Type: application/json'
# Response
{
    "took": 1,
    "timed_out": false,
    "_shards": {
        "total": 1,
        "successful": 1,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 2,
            "relation": "eq"
        },
        "max_score": 5.4819746,
        "hits": [{
            "_index": "accounts",
            "_type": "person",
            "_id": "4",
            "_score": 5.4819746,
            "_source": {
                "user": "夏天",
                "title": "工程师 教育",
                "desc": "软件 系统"
            }
        }, {
            "_index": "accounts",
            "_type": "person",
            "_id": "6",
            "_score": 4.7037764,
            "_source": {
                "user": "明白",
                "title": "问问 123 学习",
                "desc": "系统 软件 看看"
            }
        }]
    }
}
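
The difference between the two result sets can be reproduced with a toy model. This is not how Elasticsearch matches or scores documents; it is only a sketch that splits the desc values of the four test documents on whitespace and applies any() for OR and all() for AND:

```python
# desc values of test documents 3 to 6.
docs = {
    "3": "运维 架构师",
    "4": "软件 系统",
    "5": "系统 运维 学习",
    "6": "系统 软件 看看",
}


def search(terms, require_all):
    """Toy matcher: whitespace tokens, any() for OR, all() for AND."""
    combine = all if require_all else any
    return sorted(
        doc_id
        for doc_id, desc in docs.items()
        if combine(term in desc.split() for term in terms)
    )


union = search(["软件", "系统"], require_all=False)
intersection = search(["软件", "系统"], require_all=True)
print(union)         # documents containing either term
print(intersection)  # documents containing both terms
```

The toy model yields documents 4, 5, and 6 for the OR case and documents 4 and 6 for the AND case, the same sets of IDs as the two responses above.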
Excluding matches with a bool query
query.bool: {must: [], must_not: []}
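
A filled-in version of that skeleton might look like the sketch below. The helper bool_query is hypothetical, and the chosen terms are only an example: against the test data, requiring 系统 while excluding 软件 should leave document 5, although this was not run against a live cluster here:

```python
import json


def bool_query(must=(), must_not=()):
    """Turn lists of (field, text) pairs into match clauses of a bool query."""

    def clauses(pairs):
        return [{"match": {field: text}} for field, text in pairs]

    return {"query": {"bool": {"must": clauses(must), "must_not": clauses(must_not)}}}


body = bool_query(must=[("desc", "系统")], must_not=[("desc", "软件")])
print(json.dumps(body, ensure_ascii=False, indent=2))
```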

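d. Pagination and sorting

Sorted output: sort
# sort by asc/desc
# match_all takes no field, so a match query is used here; "软件 系统" is consistent with the three hits in the response below
{"query": { "match": {"desc": "软件 系统"} },
  "sort": [{ "_id": "asc"}
  ]
}

Response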

{
    "took": 22,
    "timed_out": false,
    "_shards": {
        "total": 1,
        "successful": 1,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 3,
            "relation": "eq"
        },
        "max_score": null,
        "hits": [{
            "_index": "accounts",
            "_type": "person",
            "_id": "4",
            "_score": null,
            "_source": {
                "user": "夏天",
                "title": "工程师 教育",
                "desc": "软件 系统"
            },
            "sort": ["4"]
        }, {
            "_index": "accounts",
            "_type": "person",
            "_id": "5",
            "_score": null,
            "_source": {
                "user": "呃呃",
                "title": "呵呵 123 学习",
                "desc": "系统 运维 学习"
            },
            "sort": ["5"]
        }, {
            "_index": "accounts",
            "_type": "person",
            "_id": "6",
            "_score": null,
            "_source": {
                "user": "明白",
                "title": "问问 123 学习",
                "desc": "系统 软件 看看"
            },
            "sort": ["6"]
        }]
    }
}
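Paginated output: from and size
# match_all takes no field. account_number does not exist in the accounts index above; it appears to come from the sample data in the official documentation.
{
    "query": {"match_all": {}
    },
    "sort": [{"account_number": "asc"}],
    "from": 10,
    "size": 10
}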
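
from is a zero-based offset, so page numbers have to be converted before they are put into the request. A minimal sketch, assuming 1-based page numbers and a hypothetical helper page_params:

```python
def page_params(page, per_page=10):
    """Translate a 1-based page number into from/size values."""
    if page < 1 or per_page < 1:
        raise ValueError("page and per_page must be >= 1")
    return {"from": (page - 1) * per_page, "size": per_page}


second_page = page_params(2)
print(second_page)
```

With the defaults, page 2 gives the same from and size values as the request above. By default, from plus size may not exceed 10000 (the index.max_result_window setting), so this style of paging suits shallow result lists.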

e. Range-filtered query

# filter.range.[range field, e.g. balance]: {"gte": 20000,"lte": 30000}
{
  "query": {
    "bool": {"must": { "match_all": {} },
      "filter": {
        "range": {
          "balance": {
            "gte": 20000,
            "lte": 30000
          }
        }
      }
    }
  }
}
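
The same body can be generated for any numeric field and either bound. A sketch under the assumption that only gte and lte are needed; range_filter_query is a hypothetical helper, and balance is the field name used in the example above:

```python
def range_filter_query(field, gte=None, lte=None):
    """Build a match_all query restricted by a range filter on one field."""
    # Leave out any bound that was not supplied.
    bounds = {k: v for k, v in (("gte", gte), ("lte", lte)) if v is not None}
    return {
        "query": {
            "bool": {
                "must": {"match_all": {}},
                "filter": {"range": {field: bounds}},
            }
        }
    }


balance_query = range_filter_query("balance", gte=20000, lte=30000)
print(balance_query)
```

Clauses under filter restrict the result set without contributing to _score, which is why the range sits there rather than under must.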

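Summary

a. Queries

query.match_all

Simple query (one field per match clause)

query.match: {field: value}

Intersection query

query.bool: {must: [{match: {field: value}}], must_not: []}

Sorting and pagination

sort: [{"_id": "asc"}]
from: 10,
size: 10

Range filter

filter.range.[range field, e.g. balance]: {"gte": 20000, "lte": 30000}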

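b. Official documentation

Query DSL reference: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl.html

c. Closing remarks

Queries are written as JSON objects in an agreed-upon structure, so there is no SQL-style statement to parse. Requests and responses follow a typical RESTful API style, which simplifies back-end work; basic features may not even need a back-end developer.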
