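About elasticsearch: this detailed ElasticSearch tutorial earned praise from my boss when I shared it internally

This article covers the ElasticSearch essentials: getting started, index management, mappings in detail, and index aliases.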

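I. Quick start

1. Check cluster health

http://localhost:9200/_cat

http://localhost:9200/_cat/he...

Note: the v parameter asks for a header row in the output.

Status values:
Green - everything is good (cluster is fully functional). This is the best state.
Yellow - all data is available but some replicas are not yet allocated (cluster is fully functional). Data and cluster are usable, but some replica shards have not been allocated.
Red - some data is not available for whatever reason (cluster is partially functional). Part of the data cannot be reached, although the remaining shards still serve requests.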
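
As a rough illustration of how a monitoring script might handle the three status values, the Python sketch below interprets a health document. The JSON shape (a `status` field plus `unassigned_shards`) follows the `_cluster/health` response as I understand it; the function name and the sample payload are made up for this example.

```python
import json

# Short explanations for each status, following the descriptions above.
STATUS_MEANING = {
    "green": "all primary and replica shards are allocated",
    "yellow": "all primary shards are allocated, some replicas are not",
    "red": "at least one primary shard is not allocated",
}


def summarize_health(health_json):
    """Turn a cluster health JSON document into a one-line summary."""
    health = json.loads(health_json)
    status = health["status"]
    unassigned = health.get("unassigned_shards", 0)
    return f"{status}: {STATUS_MEANING[status]} ({unassigned} unassigned shards)"


# Hypothetical payload; a real one would come from GET /_cluster/health.
sample = '{"status": "yellow", "unassigned_shards": 5}'
print(summarize_health(sample))
```

A yellow status is normal on a single-node test cluster, because replicas cannot be placed on the same node as their primary.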

List the nodes of the cluster

http://localhost:9200/_cat/nodes?v

2. List all indices

http://localhost:9200/_cat/in...

3. Create an index

Create an index named customer. The pretty parameter asks for a pretty-printed JSON response.

PUT /customer?pretty

List all indices again:

http://localhost:9200/_cat/in...

GET /_cat/indices?v

4. Index a document into the customer index

curl -X PUT "localhost:9200/customer/_doc/1?pretty" -H 'Content-Type: application/json' -d'
{
  "name": "John Doe"
}'

5. Get the document with a given id from the customer index

curl -X GET "localhost:9200/customer/_doc/1?pretty"
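
For readers who prefer a script over curl, here is a minimal Python sketch of the same two requests using only the standard library. The helper names are hypothetical, and the requests are only built, not sent, so the block runs without a cluster; the commented lines show how one might send them to a node assumed to listen on localhost:9200.

```python
import json
import urllib.request


def build_index_request(base_url, index, doc_id, document):
    """Build (but do not send) the PUT request that indexes one document."""
    return urllib.request.Request(
        url=f"{base_url}/{index}/_doc/{doc_id}?pretty",
        data=json.dumps(document).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def build_get_request(base_url, index, doc_id):
    """Build (but do not send) the GET request that fetches one document."""
    return urllib.request.Request(
        url=f"{base_url}/{index}/_doc/{doc_id}?pretty",
        method="GET",
    )


put_request = build_index_request(
    "http://localhost:9200", "customer", 1, {"name": "John Doe"}
)
get_request = build_get_request("http://localhost:9200", "customer", 1)

# To send a request against a running node:
# with urllib.request.urlopen(put_request) as response:
#     print(response.read().decode("utf-8"))
```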

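6. Search all documents

With dynamic mapping, name is a text field with a keyword sub-field, and text fields cannot be sorted on by default, so the sort uses name.keyword.

GET /customer/_search?q=*&sort=name.keyword:asc&pretty

The same query with a JSON body:

GET /customer/_search
{
  "query": { "match_all": {} },
  "sort": [
    { "name.keyword": "asc" }
  ]
}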

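II. Index management

1. Create an index

Create an index named twitter with 3 primary shards and 2 replicas. Note: creating an index in ES is comparable to creating a database in a relational database (from ES 6.0 on, it is closer to creating a table).

PUT twitter
{
  "settings": {
    "index": {
      "number_of_shards": 3,
      "number_of_replicas": 2
    }
  }
}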

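Notes:

The default number of primary shards is 5 in the 6.x versions this article targets (1 from 7.0 on); the upper limit is 1024 per index.
The default number of replicas is 1.
Index names must be lowercase and unique.

The create command can also be shortened to:

PUT twitter
{
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 2
  }
}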

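2. Create a mapping

Note: creating a mapping in ES is comparable to defining a table structure in a database, that is, which fields the table has, what type each field is, what its default value is, and so on. It also resembles a schema definition in solr.

PUT twitter
{
  "settings": {
    "index": {
      "number_of_shards": 3,
      "number_of_replicas": 2
    }
  },
  "mappings": {
    "type1": {
      "properties": {
        "field1": { "type": "text" }
      }
    }
  }
}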

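3. Define aliases when creating an index

PUT twitter
{
  "aliases": {
    "alias_1": {},
    "alias_2": {
      "filter": {
        "term": { "user": "kimchy" }
      },
      "routing": "kimchy"
    }
  }
}

4. The response returned when creating an index

5. Get Index: view the definition of an index

GET /twitter

Several indices can be fetched at once (comma-separated); use _all or the wildcard * to fetch all indices.

GET /twitter/_settings

GET /twitter/_mapping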

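6. Delete an index

DELETE /twitter

Note: several indices can be deleted at once (comma-separated); use _all or the wildcard * to delete all indices.

7. Check whether an index exists

HEAD twitter

The HTTP status code gives the result: 404 means the index does not exist, 200 means it exists.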

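8. Update index settings

Index settings fall into two groups, static and dynamic. Static settings, such as the number of shards, cannot be changed on an existing index. Dynamic settings can be updated at any time.

REST endpoints: /_settings updates the settings of all indices; {index}/_settings updates the settings of one or more indices.

For the individual settings, see: https://www.elastic.co/guide/...

9. Change the number of replicas

PUT /twitter/_settings
{
  "index": {
    "number_of_replicas": 2
  }
}

10. Reset a setting to its default with null

PUT /twitter/_settings
{
  "index": {
    "refresh_interval": null
  }
}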

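11. Control reads and writes on an index

index.blocks.read_only: when true, the index and its metadata are read-only.
index.blocks.read_only_allow_delete: when true, the index is read-only but may still be deleted.
index.blocks.read: when true, read operations are blocked.
index.blocks.write: when true, write operations are blocked.
index.blocks.metadata: when true, index metadata can be neither read nor written.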

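12. Index templates

Writing out the full definition for every new index gets tedious. ES offers index templates: a template holds settings, mappings, and a name pattern that decides which newly created indices it applies to.

Note: a template is consulted only when an index is created. Changing a template does not affect indices that already exist.

12.1 Create or update a template named template_1 that applies when an index whose name matches te* or bar* is created:

PUT _template/template_1
{
  "index_patterns": ["te*", "bar*"],
  "settings": {
    "number_of_shards": 1
  },
  "mappings": {
    "type1": {
      "_source": {
        "enabled": false
      },
      "properties": {
        "host_name": {
          "type": "keyword"
        },
        "created_at": {
          "type": "date",
          "format": "EEE MMM dd HH:mm:ss Z YYYY"
        }
      }
    }
  }
}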
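
To get a feel for which index names the patterns above would catch, the following Python sketch approximates the wildcard match with `fnmatch` from the standard library. This is only an approximation for patterns that use `*` alone (fnmatch also understands `?` and `[...]`, which should not be relied on here), and the function name is hypothetical.

```python
from fnmatch import fnmatchcase

# The index_patterns of template_1 from the request above.
TEMPLATE_PATTERNS = ["te*", "bar*"]


def template_applies(index_name, patterns=TEMPLATE_PATTERNS):
    """Return True if the new index name matches any template pattern."""
    return any(fnmatchcase(index_name, pattern) for pattern in patterns)


for name in ["test-2019", "bar", "twitter", "customer"]:
    print(name, template_applies(name))
```

Note that twitter does not match: te* requires the name to start with "te", not merely with "t".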

12.2 View index templates

GET /_template/template_1
GET /_template/temp*
GET /_template/template_1,template_2
GET /_template

12.3 Delete a template

DELETE /_template/template_1

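13. Open/Close Index

POST /my_index/_close
POST /my_index/_open

Notes:

A closed index cannot be read or written and puts almost no load on the cluster.
A closed index can be reopened; opening it goes through the normal recovery process.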

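14. Shrink Index

The number of shards of an index cannot be changed in place. To reduce it, shrink the index into a new index. The shard count of the new index must be a factor of the original shard count: an index with 8 shards can be shrunk to 4, 2, or 1.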
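
The factor rule is easy to check before sending a shrink request. The small Python sketch below lists the allowed target shard counts for a given source count; the function name is just for illustration.

```python
from typing import List


def valid_shrink_targets(source_shards: int) -> List[int]:
    """List shard counts an index may be shrunk to, largest first.

    A target is valid when it is smaller than the source shard count
    and divides it evenly.
    """
    return [
        target
        for target in range(source_shards - 1, 0, -1)
        if source_shards % target == 0
    ]


print(valid_shrink_targets(8))  # [4, 2, 1]
print(valid_shrink_targets(5))  # [1]
```

An index with a prime number of shards can therefore only be shrunk to a single shard.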

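When is shrinking needed? When the index was created with too many shards and it later turns out that not all of them are needed.

The shrink process:

First, move all primary shards onto one host.
On that host, create a new index with fewer shards and otherwise the same settings as the original index.
Copy (or hard-link) all shards of the original index into the directory of the new index.
Open the new index so that the shard data is recovered.
(Optional) Rebalance the shards of the new index across the other nodes.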

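Preparation before shrinking:

Make the original index read-only.

Relocate a copy of every shard of the original index to the same node, and make sure the index health is green.

In the request below, index.routing.allocation.require._name names the node on which the shrink will run, and index.blocks.write blocks writes so that the index is effectively read-only.

PUT /my_source_index/_settings
{
  "settings": {
    "index.routing.allocation.require._name": "shrink_node_name",
    "index.blocks.write": true
  }
}

Run the shrink:

POST my_source_index/_shrink/my_target_index
{
  "settings": {
    "index.number_of_replicas": 1,
    "index.number_of_shards": 1,
    "index.codec": "best_compression"
  }
}

Monitor the shrink:

GET _cat/recovery?v
GET _cluster/health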

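15. Split Index

When the shards of an index grow too large, the index can be split into a new index whose shard count is a multiple of the original. Which multiples are possible is determined by index.number_of_routing_shards, the number of routing shards set when the index was created. This value defines the hash space used to route documents to shards by consistent hashing.

For example, with index.number_of_routing_shards = 30 and 5 shards, the index can be split in the following ways:

5 → 10 → 30 (split by 2, then by 3)
5 → 15 → 30 (split by 3, then by 2)
5 → 30 (split by 6)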
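
Reading the rule above as "the target must be a multiple of the current shard count and a factor of the routing shard count", the possible next steps can be enumerated with a few lines of Python. This is a sketch of that reading, not of the server's own validation code, and the function name is hypothetical.

```python
from typing import List


def valid_split_targets(source_shards: int, routing_shards: int) -> List[int]:
    """List shard counts reachable in one split, smallest first.

    A target qualifies when it is a multiple of the source shard count
    (larger than the source) and divides the routing shard count evenly.
    """
    return [
        target
        for target in range(source_shards * 2, routing_shards + 1, source_shards)
        if routing_shards % target == 0
    ]


print(valid_split_targets(5, 30))   # [10, 15, 30]
print(valid_split_targets(10, 30))  # [30]
```

The two calls reproduce the example: from 5 shards one can go to 10, 15, or 30, and from 10 the only remaining step is 30.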

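Why split an index?

When the shard count chosen at creation turns out to be too small, the index needs to be split. This is the opposite of shrinking.

Note: only an index created with index.number_of_routing_shards can be split. From ES7 on, this restriction no longer applies.

The difference from solr: solr splits a single shard, whereas ES splits the whole index.

Steps:

Prepare an index to split. The number of routing shards has to be given at creation time:

PUT my_source_index
{
  "settings": {
    "index.number_of_shards": 1,
    "index.number_of_routing_shards": 2
  }
}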

First make the index read-only:

PUT /my_source_index/_settings
{
  "settings": {
    "index.blocks.write": true
  }
}

Run the split. The shard count of the new index must follow the split rule:

POST my_source_index/_split/my_target_index
{
  "settings": {
    "index.number_of_shards": 2
  }
}

Monitor the split:
```
GET _cat/recovery?v
GET _cluster/health
```
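16. Rollover Index: roll an alias over to a newly created index

For time-sensitive data such as logs, old index data becomes useless after a while. Just as a database may use one table per time period, ES can use several indices to hold the data of different periods. ES is more convenient than a database here, because an alias can be rolled over to the newest index, so that operations through the alias always hit the newest index.

The rollover index API creates a new index when given conditions (age, document count, index size) are met, and moves the alias to the new index.

Note: the alias must point to exactly one index.

Rollover Index example: create an index named logs-000001 with the alias logs_write:

PUT /logs-000001
{
  "aliases": {
    "logs_write": {}
  }
}

Add 1000 documents to logs-000001, then request a rollover with conditions:

POST /logs_write/_rollover
{
  "conditions": {
    "max_age": "7d",
    "max_docs": 1000,
    "max_size": "5gb"
  }
}

Note: if the index behind the alias logs_write was created 7 or more days ago, or holds 1000 or more documents, or is 5gb or larger, a new index logs-000002 is created and the alias logs_write is moved to it.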
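
The conditions are combined with OR: one satisfied condition is enough. The Python sketch below mirrors that logic with the thresholds from the example. The function and parameter names are made up for illustration, and the >= comparisons follow the description above.

```python
def rollover_conditions_met(age_days, doc_count, size_gb,
                            max_age_days=7, max_docs=1000, max_size_gb=5):
    """Report, per condition, whether the current index satisfies it."""
    return {
        "max_age": age_days >= max_age_days,
        "max_docs": doc_count >= max_docs,
        "max_size": size_gb >= max_size_gb,
    }


def should_roll_over(age_days, doc_count, size_gb):
    """A rollover is due as soon as any single condition is met."""
    return any(rollover_conditions_met(age_days, doc_count, size_gb).values())


# One day old, exactly 1000 documents, small on disk: max_docs triggers.
print(rollover_conditions_met(age_days=1, doc_count=1000, size_gb=0.1))
print(should_roll_over(age_days=1, doc_count=1000, size_gb=0.1))
```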

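Naming rules for the index created by a rollover:

If the index name ends with a hyphen and a number, such as logs-000001, the new index name follows the same pattern with the number increased by 1.
If the index name does not end with a hyphen and a number, the new index name must be given in the rollover request:
```
POST /my_alias/_rollover/my_new_index_name
{
  "conditions": {
    "max_age": "7d",
    "max_docs": 1000,
    "max_size": "5gb"
  }
}
```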
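
The naming rule can be sketched in Python as follows. The zero-padding of the counter to six digits reflects my understanding of the server's behavior rather than anything stated in this article, and the function name is hypothetical.

```python
import re


def next_rollover_name(index_name):
    """Derive the next index name, or None if a name must be supplied.

    Names ending in a hyphen plus digits get their counter incremented;
    the counter is written with six digits (an assumption, see above).
    """
    match = re.fullmatch(r"(.*)-(\d+)", index_name)
    if match is None:
        return None
    prefix, counter = match.groups()
    return f"{prefix}-{int(counter) + 1:06d}"


print(next_rollover_name("logs-000001"))  # logs-000002
print(next_rollover_name("my_index"))     # None
```

A None result corresponds to the second rule: the caller has to pass the new index name in the request path.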
Using date math in the name

If the generated index names should contain a date, such as logstash-2016.02.03-1, create the index with a date math expression as its name. The first request below is PUT /<logs-{now/d}-1> with URI encoding:

PUT /%3Clogs-%7Bnow%2Fd%7D-1%3E
{
  "aliases": {
    "logs_write": {}
  }
}

PUT logs_write/_doc/1
{
  "message": "a dummy log"
}

POST logs_write/_refresh

Wait for a day to pass, then roll over:

POST /logs_write/_rollover
{
  "conditions": {
    "max_docs": "1"
  }
}

Settings for the new index can be given in the rollover request:

PUT /logs-000001
{
  "aliases": {
    "logs_write": {}
  }
}

POST /logs_write/_rollover
{
  "conditions": {
    "max_age": "7d",
    "max_docs": 1000,
    "max_size": "5gb"
  },
  "settings": {
    "index.number_of_shards": 2
  }
}

Dry run: test whether the conditions are met before actually rolling over:

POST /logs_write/_rollover?dry_run
{
  "conditions": {
    "max_age": "7d",
    "max_docs": 1000,
    "max_size": "5gb"
  }
}

Note: a dry run does not create an index; it only checks the conditions.

Note: a rollover happens only when you request it. It does not run automatically in the background, so call it periodically.

17. Index monitoring

17.1 View index statistics

Official link: https://www.elastic.co/guide/...

Statistics for all indices:

GET /_stats

Statistics for specific indices:

GET /index1,index2/_stats

17.2 View index segment information

Official link: https://www.elastic.co/guide/...

GET /test/_segments
GET /index1,index2/_segments
GET /_segments

17.3 View index recovery information

Official link: https://www.elastic.co/guide/...

GET index1,index2/_recovery?human
GET /_recovery?human

17.4 View shard store information

Official link: https://www.elastic.co/guide/...

Return information for index test only:
GET /test/_shard_stores

Return information for indices test1 and test2 only:
GET /test1,test2/_shard_stores

Return information for all indices:
GET /_shard_stores
GET /_shard_stores?status=green

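18. Index state management

18.1 Clear Cache
```
POST /twitter/_cache/clear
```
By default all caches are cleared. You can also clear only the query, fielddata, or request cache.
```
POST /kimchy,elasticsearch/_cache/clear
POST /_cache/clear
```

18.2 Refresh: reopen the index reader so that recent changes become visible to search

POST /kimchy,elasticsearch/_refresh
POST /_refresh

18.3 Flush: write index data buffered in memory to persistent storage
```
POST twitter/_flush
```

18.4 Force merge: force segments to be merged

POST /kimchy/_forcemerge?only_expunge_deletes=false&max_num_segments=100&flush=true

Optional parameters:

max_num_segments: the number of segments to merge down to; set it to 1 for a full merge. When omitted, ES only checks whether a merge is needed and runs it if so.
only_expunge_deletes: whether to merge only segments that contain deleted documents; default false.
flush: whether to flush after merging; default true.

POST /kimchy,elasticsearch/_forcemerge
POST /_forcemerge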

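III. Mappings in detail

1. What is a mapping

A mapping defines the structure of an index: which fields it has, what type each field is, and so on. It corresponds to a table definition in a database or a schema in solr. It is needed because lucene has to know how to index and store each field of a document.
ES supports two approaches: explicitly defined mappings and dynamic mappings.

1.1. Create a mapping for an index

In the request below, mappings holds the mapping definition, type1 is the name of the mapping type, properties holds the field definitions, and field1 is a field whose field datatype is text.

PUT test
{
  "mappings": {
    "type1": {
      "properties": {
        "field1": { "type": "text" }
      }
    }
  }
}

Note: a mapping can be extended later with new fields, but the type of an existing field cannot be changed.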

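2. Deprecation of mapping types

ES was originally designed with the analogy that an index is like a database in a relational system and a mapping type is like a table, so one index could contain several mapping types. The analogy has a serious flaw: when several mapping types contain a field with the same name (especially with different types), this is hard to handle within one index. A search engine only knows the index-document structure, and data from different mapping types are all just documents that happen to contain different fields.

Since 6.0.0 an index may contain only one mapping type ("index.mapping.single_type": true), while indices with multiple mapping types created in 5.x continue to work. Starting with 7.0, mapping types are being removed.

To stay in line with that plan, name this single mapping type "_doc" now, because the request paths for indexing will be standardized as:

PUT {index}/_doc/{id} and POST {index}/_doc

Mapping example:

PUT twitter
{
  "mappings": {
    "_doc": {
      "properties": {
        "type": { "type": "keyword" },
        "name": { "type": "text" },
        "user_name": { "type": "keyword" },
        "email": { "type": "keyword" },
        "content": { "type": "text" },
        "tweeted_at": { "type": "date" }
      }
    }
  }
}

Moving data from multiple mapping types into separate indices:

ES provides the reindex API for this.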

3. Field datatypes

The field type defines how a field value is indexed and stored. ES offers a rich set of field types; see the official link for the characteristics of each type:

https://www.elastic.co/guide/...

3.1 Core datatypes

String: text and keyword
Numeric datatypes: long, integer, short, byte, double, float, half_float, scaled_float
Date datatype: date
Boolean datatype: boolean
Binary datatype: binary
Range datatypes: integer_range, float_range, long_range, double_range, date_range

3.2 Complex datatypes

Array datatype: an array is simply a field with multiple values and needs no dedicated type
Object datatype: object, the value is a single JSON object
Nested datatype: nested, for arrays of JSON objects

3.3 Geo datatypes

Geo-point datatype: geo_point, for lat/lon points
Geo-Shape datatype: geo_shape, for complex shapes like polygons

3.4 Specialised datatypes

IP datatype: ip, for IPv4 and IPv6 addresses
Completion datatype: completion, to provide auto-complete suggestions
Token count datatype: token_count, to count the number of tokens in a string
mapper-murmur3: murmur3, to compute hashes of values at index-time and store them in the index
Percolator type: accepts queries from the query-dsl
join datatype: defines parent/child relation for documents within the same index

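4. Field mapping parameters

The type (datatype) of a field defines how its value is indexed and stored. Further parameters let you override defaults or add special handling where needed. See the official documentation for details: https://www.elastic.co/guide/...

analyzer: the analyzer to use
normalizer: the normalizer to use
boost: the boost (weight) value
coerce: whether to coerce values to the field type
copy_to: copy the value into another field
doc_values: whether to store doc values
dynamic
enabled: whether the field is enabled
fielddata
eager_global_ordinals
format: the format of date values
ignore_above
ignore_malformed
index_options
index
fields
norms
null_value
position_increment_gap
properties
search_analyzer
similarity
store
term_vector

Field mapping parameters: example

The format parameter below lists the accepted date formats:

PUT my_index
{
  "mappings": {
    "_doc": {
      "properties": {
        "date": {
          "type": "date",
          "format": "yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||epoch_millis"
        }
      }
    }
  }
}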

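5. Multi-fields

When one field has to be indexed in several different ways, use the fields parameter to define multi-fields. For example, a string field may need to be indexed as analyzed text and also as a keyword to support sorting and aggregations, or it may need to be analyzed with several different analyzers.

Example: define a multi-field.

Note: raw is the name of the sub-field (chosen freely).

PUT my_index
{
  "mappings": {
    "_doc": {
      "properties": {
        "city": {
          "type": "text",
          "fields": {
            "raw": {
              "type": "keyword"
            }
          }
        }
      }
    }
  }
}

Add documents that use the multi-field:

PUT my_index/_doc/1
{
  "city": "New York"
}

PUT my_index/_doc/2
{
  "city": "York"
}

Query using the multi-field: search on city, sort and aggregate on city.raw:

GET my_index/_search
{
  "query": {
    "match": {
      "city": "york"
    }
  },
  "sort": {
    "city.raw": "asc"
  },
  "aggs": {
    "Cities": {
      "terms": {
        "field": "city.raw"
      }
    }
  }
}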

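6. Meta-fields

Official link: https://www.elastic.co/guide/...

Meta-fields are document fields defined by ES itself; the official link lists their categories.

7. Dynamic mapping

Dynamic mapping is an important ES feature that lets you start using ES quickly without first creating an index and defining a mapping. For example, submit a document for indexing directly:

PUT data/_doc/1
{ "count": 5 }

ES automatically creates the data index, the _doc mapping type, and a field count of type long.

When a document being indexed contains a new field, ES adds a field definition to the mapping automatically, based on the JSON datatype of the value.

7.1 Dynamic field mapping rules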
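
As a rough guide, the default rules can be sketched in Python as a function from a parsed JSON value to the field type ES would pick. This reflects my understanding of the defaults (integers become long, floating point numbers become float, strings become text with a keyword sub-field unless date or numeric detection applies); the function name is hypothetical, and detection of dates and numbers inside strings is deliberately left out.

```python
import json


def infer_dynamic_type(value):
    """Guess the field type chosen by dynamic mapping for a JSON value.

    Returns None when no field would be added (null or empty array).
    """
    if value is None:
        return None
    # bool must be tested before int, because bool is a subclass of int.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "float"
    if isinstance(value, dict):
        return "object"
    if isinstance(value, list):
        # An array takes the type of its first non-null element.
        for item in value:
            item_type = infer_dynamic_type(item)
            if item_type is not None:
                return item_type
        return None
    if isinstance(value, str):
        # Simplification: date and numeric detection are ignored here.
        return "text"
    raise TypeError(f"not a JSON value: {value!r}")


document = json.loads('{"count": 5, "tags": [null, "a"], "ratio": 0.5}')
for field, value in document.items():
    print(field, infer_dynamic_type(value))
```

For the document from the example above, count is reported as long, which matches what ES creates.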

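7.2 Date detection

With date detection, ES checks whether a new string field looks like a date when data is inserted. If the value matches one of the configured formats, the field is added to the mapping as a date field with that format.

date_detection is enabled by default. The default value of dynamic_date_formats is:

[ "strict_date_optional_time","yyyy/MM/dd HH:mm:ss Z||yyyy/MM/dd Z"]

PUT my_index/_doc/1
{
  "create_date": "2015/09/02"
}

GET my_index/_mapping

Custom date formats:

PUT my_index
{
  "mappings": {
    "_doc": {
      "dynamic_date_formats": ["MM/dd/yyyy"]
    }
  }
}

Disable date detection:

PUT my_index
{
  "mappings": {
    "_doc": {
      "date_detection": false
    }
  }
}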

7.3 Numeric detection

Enable numeric detection (disabled by default):

PUT my_index
{
  "mappings": {
    "_doc": {
      "numeric_detection": true
    }
  }
}

PUT my_index/_doc/1
{
  "my_float": "1.0",
  "my_integer": "1"
}

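IV. Index aliases

1. What aliases are for

To query several indices with a single request.
To work with an index through a view, as with views in a database.
The alias mechanism lets you work with the indices of a cluster through views. Such a view can cover several indices, a single index, or part of an index.

2. Define aliases when creating an index

The request below defines two aliases, current_day and 2016:

PUT /logs_20162801
{
  "mappings": {
    "type": {
      "properties": {
        "year": { "type": "integer" }
      }
    }
  },
  "aliases": {
    "current_day": {},
    "2016": {
      "filter": {
        "term": { "year": 2016 }
      }
    }
  }
}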

3. Create an alias with /_aliases

Create the alias alias1 for index test1:

POST /_aliases
{
  "actions": [
    { "add": { "index": "test1", "alias": "alias1" } }
  ]
}

4. Remove an alias

POST /_aliases
{
  "actions": [
    { "remove": { "index": "test1", "alias": "alias1" } }
  ]
}

This can also be written as:

DELETE /{index}/_alias/{name}

5. Several alias actions in one request

Remove the alias alias1 from index test1 and, in the same request, add the alias alias1 to index test2:

POST /_aliases
{
  "actions": [
    { "remove": { "index": "test1", "alias": "alias1" } },
    { "add": { "index": "test2", "alias": "alias1" } }
  ]
}
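
When an alias is moved regularly, for example after each reindex, it can help to generate the request body instead of writing it by hand. The Python sketch below builds the remove-then-add body shown above; the function name is hypothetical, and the block only produces JSON without sending it anywhere.

```python
import json


def build_alias_swap(alias, old_index, new_index):
    """Build the /_aliases body that moves an alias from one index to another."""
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


body = build_alias_swap("alias1", "test1", "test2")
# This JSON would be sent as the body of POST /_aliases.
print(json.dumps(body, indent=2))
```

Putting both actions into one request means clients using the alias never see a moment in which it points to no index.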

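6. Give several indices the same alias

Option 1:

POST /_aliases
{
  "actions": [
    { "add": { "index": "test1", "alias": "alias1" } },
    { "add": { "index": "test2", "alias": "alias1" } }
  ]
}

Option 2:
```
POST /_aliases
{
  "actions": [
    { "add": { "indices": ["test1", "test2"], "alias": "alias1" } }
  ]
}
```
Note: an alias that points to several indices can only be used for searching. It cannot be used to index documents or to get a document by id.
Option 3: use the wildcard * to select the indices to alias
```
POST /_aliases
{
  "actions": [
    { "add": { "index": "test*", "alias": "all_test_indices" } }
  ]
}
```
Note: in this case the alias is a point-in-time alias. It covers all indices that match at the time of the request and is not updated automatically when indices matching the pattern are added or removed later.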

7. Filtered aliases

The field used in the filter must exist in the index:

PUT /test1
{
  "mappings": {
    "type1": {
      "properties": {
        "user": {
          "type": "keyword"
        }
      }
    }
  }
}

The filter is defined with the Query DSL and applies to all Search, Count, Delete By Query and More Like This operations made through the alias.

POST /_aliases
{
  "actions": [
    {
      "add": {
        "index": "test1",
        "alias": "alias2",
        "filter": { "term": { "user": "kimchy" } }
      }
    }
  ]
}

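8. Aliases with routing

An alias definition can carry a routing value, optionally together with a filter. This limits operations to the relevant shards and avoids unnecessary work on the others.

POST /_aliases
{
  "actions": [
    {
      "add": {
        "index": "test",
        "alias": "alias1",
        "routing": "1"
      }
    }
  ]
}

Use different routing values for searching and indexing:

POST /_aliases
{
  "actions": [
    {
      "add": {
        "index": "test",
        "alias": "alias2",
        "search_routing": "1,2",
        "index_routing": "2"
      }
    }
  ]
}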

9. Define an alias with PUT

The general form is PUT /{index}/_alias/{name}, for example:

PUT /logs_201305/_alias/2013

With filter and routing:

PUT /users
{
  "mappings": {
    "user": {
      "properties": {
        "user_id": { "type": "integer" }
      }
    }
  }
}

PUT /users/_alias/user_12
{
  "routing": "12",
  "filter": {
    "term": {
      "user_id": 12
    }
  }
}

10. View alias definitions

The general form is GET /{index}/_alias/{alias}, for example:

GET /logs_20162801/_alias/*
GET /_alias/2016
GET /_alias/20*

Original: https://blog.csdn.net/ZYC8888...
