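1. Problem description
There is an index named index_a with a single field, title.
Using index_a as the base, keep title;
add a len field whose value is the length of title;
add a split_title field whose value is the array obtained by splitting title on spaces.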
2. Problem setup
PUT /index_a/_doc/1
{"title": "Thinking in java 4th"}
3. Create the ingest pipeline
Create a pipeline named pipeline_a:
PUT _ingest/pipeline/pipeline_a
{
  "processors": [
    {
      "script": {
        "source": "ctx.len=ctx.title.length();"
      }
    },
    {
      "set": {
        "field": "split_title",
        "value": ""
      }
    },
    {
      "split": {
        "field": "title",
        "separator": " ",
        "target_field": "split_title"
      }
    }
  ]
}
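This pipeline uses three processors:
3.1 script
The script processor adds a len field whose value is the length of the title field: ctx.len=ctx.title.length();
3.2 set
The set processor adds a split_title field and initializes it to an empty string.
3.3 split
The split processor splits the title field (the input) and writes the resulting array to the split_title field (the output).
Processors run one after another in the order listed, and each one sees the changes made by those before it, so the pipeline produces the result we expect.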
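Before reindexing, the pipeline can be tried against a sample document with the ingest simulate API. The request below is a minimal sketch, assuming pipeline_a has already been created; the sample document simply reuses the title from step 2 and is not written to any index.

```
POST _ingest/pipeline/pipeline_a/_simulate
{
  "docs": [
    {
      "_source": {
        "title": "Thinking in java 4th"
      }
    }
  ]
}
```

If the pipeline behaves as intended, each document in the response should carry len and split_title next to the original title, which makes it easy to catch a wrong separator before running the reindex.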
4. Reindex with the pipeline
POST _reindex
{
  "source": {"index": "index_a"},
  "dest": {
    "index": "index_b",
    "op_type": "create",
    "pipeline": "pipeline_a"
  }
}
5. Verification
GET /index_b/_search
The response shows len and split_title alongside the original title:
{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "index_b",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "len" : 20,
          "split_title" : [
            "Thinking",
            "in",
            "java",
            "4th"
          ],
          "title" : "Thinking in java 4th"
        }
      }
    ]
  }
}