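About Java: Advanced Usage of the Distributed Search Engine Elasticsearch, Part 1

I. Filtered Queries (Pagination, Fuzzy Matching, filter)

1. Searching for documents that match a condition:

Create the sample data: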

PUT account/_doc/1
{ "account": 10001, "balance": 10000, "name": "test1"} 

PUT account/_doc/2
{ "account": 10002, "balance": 20000, "name": "test2"} 

PUT account/_doc/3
{ "account": 10003, "balance": 30000, "name": "张三"} 

PUT account/_doc/4
{ "account": 10004, "balance": 30000, "name": "王五"}

Look up a document by account number:

GET /account/_search
{
  "query": {
    "match": {
      "account": "10002"
    }
  }
}

Result:

{
  ...
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "account",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 1.0,
        "_source" : {
          "account" : 10002,
          "balance" : 20000,
          "name" : "test2"
        }
      }
    ]
  }
  ...
}

The match succeeds and the requested document is returned.
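Outside the console, a search like this is an ordinary HTTP call. The following is a minimal sketch using only the Python standard library. The cluster address `http://localhost:9200` and the helper name `build_search_request` are assumptions made for illustration and do not come from the article.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumed address of a local test cluster


def build_search_request(index, body):
    """Build (but do not send) a POST /<index>/_search request."""
    return urllib.request.Request(
        url=f"{ES_URL}/{index}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",  # _search accepts both GET and POST
    )


# A match query on the "account" field of the sample documents.
match_query = {"query": {"match": {"account": "10001"}}}
request = build_search_request("account", match_query)

# To run it against a live cluster:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response)["hits"]["hits"])
```

The request is only constructed here, so the snippet runs without a cluster; uncomment the last lines to send it.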

2. Paginated queries:

GET /account/_search 
{
  "query": { 
    "match_all": {}
  },
  "from": 0,
  "size": 2
}

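This returns 2 documents. Note that `hits.total.value` counts every matching document in the index, not just the current page; the sample output below shows 3, so it was presumably captured while the index held three documents.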

"hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "account",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "account" : 10001,
          "balance" : 10000,
          "name" : "test1"
        }
      },
      {
        "_index" : "account",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 1.0,
        "_source" : {
          "account" : 10002,
          "balance" : 20000,
          "name" : "test2"
        }
      }
    ]
  }
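The `from` and `size` parameters are an offset and a page length, so a page number has to be converted before it is sent. A small sketch of that arithmetic follows. The helper names `page_params` and `paged_query` are hypothetical, and the limit of 10,000 is the Elasticsearch default for `index.max_result_window`, which an index may override.

```python
MAX_RESULT_WINDOW = 10_000  # Elasticsearch default for index.max_result_window


def page_params(page, page_size):
    """Translate a 1-based page number into the from/size pair used by _search."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    offset = (page - 1) * page_size
    if offset + page_size > MAX_RESULT_WINDOW:
        raise ValueError("from + size exceeds the default result window")
    return {"from": offset, "size": page_size}


def paged_query(page, page_size):
    """Build a match_all search body for the requested page."""
    body = {"query": {"match_all": {}}}
    body.update(page_params(page, page_size))
    return body


first_page = paged_query(1, 2)   # from=0, size=2
second_page = paged_query(2, 2)  # from=2, size=2
```

Rejecting deep pages on the client side is a design choice in this sketch; the server would otherwise reject a request whose `from + size` exceeds the window.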

3. Fuzzy matching:

Numeric fields do not lend themselves to fuzzy matching, so this test uses a text field:

GET /account/_search 
{
  "query": { 
    "match": {
      "name": "三四"
    }
  }
}

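Result (this sample appears to come from a different test index, so its `_id` and `_source` values differ from the documents created above):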

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 0.2876821,
    "hits" : [
      {
        "_index" : "account",
        "_type" : "_doc",
        "_id" : "4",
        "_score" : 0.2876821,
        "_source" : {
          "accountNo" : 10009,
          "balance" : 1000000,
          "name" : "张三"
        }
      }
    ]
  }
}

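Note that by default Chinese text is tokenized one character at a time, so the keyword “三四” is split into “三” and “四”, and a document matches if it contains either of them.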
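The effect of per-character tokenization can be reproduced with a toy model. The sketch below is only a rough approximation of the default analyzer (one token per Chinese character, whole lowercase words otherwise) and of a `match` query with its default OR operator. It ignores scoring entirely, and the function names are made up for illustration.

```python
import re


def tokens(text):
    """Rough model of default analysis: one token per CJK character, else whole words."""
    return re.findall(r"[\u4e00-\u9fff]|[^\s\u4e00-\u9fff]+", text.lower())


def match_any(query_text, field_text):
    """Mimic a match query with the default OR operator: any shared token is a hit."""
    return bool(set(tokens(query_text)) & set(tokens(field_text)))


names = ["test1", "test2", "张三", "王五"]
hits = [name for name in names if match_any("三四", name)]
print(hits)  # ['张三']
```

Only “张三” is returned because it shares the token “三” with the query; no name contains “四”.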

4. filter queries:

GET /account/_search 
{
  "query": { 
    "bool": {
      "filter": [
        {
          "term": {
            "name": "张三"
          }
        }
      ]
    }
  }
}

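`term` is an exact-match query: the value is compared as-is against the indexed terms, and inside a `filter` clause no relevance score is computed.

Result: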

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 0,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  }
}

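Nothing is matched. `term` looks for the whole word “张三”, but by default ES tokenizes the text one character at a time and indexes “张三” as “张” and “三”, so no indexed term equals the query value.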
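The mismatch can be illustrated with the same kind of toy model: a `term` lookup is a plain membership test against the indexed terms. The sketch also builds a filter body against `name.keyword`, which exists only if the index relies on the default dynamic mapping for strings (a `text` field with a `keyword` sub-field); the article does not show the mapping, so treat that as an assumption.

```python
import re


def index_tokens(text):
    """Rough model of the indexed terms: one per CJK character, else whole words."""
    return re.findall(r"[\u4e00-\u9fff]|[^\s\u4e00-\u9fff]+", text.lower())


def term_matches(value, indexed_terms):
    """A term query compares the unanalyzed value against the indexed terms."""
    return value in indexed_terms


text_terms = index_tokens("张三")  # terms of the analyzed text field
keyword_terms = ["张三"]           # a keyword field stores the value whole

print(term_matches("张三", text_terms))     # False
print(term_matches("三", text_terms))       # True
print(term_matches("张三", keyword_terms))  # True

# Filter body that targets the keyword sub-field instead of the text field
# (assumes the default dynamic mapping created name.keyword).
keyword_filter = {
    "query": {"bool": {"filter": [{"term": {"name.keyword": "张三"}}]}}
}
```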

II. bool Queries (should, must)

should query: the document matches if at least one of the clauses is true.

GET /movies/_search
{
  "query":{
   "bool": {
     "should": [
       {"match": {"title": "good hearts sea"}},
       {"match": {"overview": "good hearts sea"}}
     ]
   }
  }
}

must query: every clause has to match.

GET /movies/_search
{

  "query":{
   "bool": {
     "must": [
       {"match": {"title": "good hearts sea"}},
       {"match": {"overview": "good hearts sea"}}
     ]
   }
  }
}

must_not query: none of the clauses may match.

GET /movies/_search
{

  "query":{
   "bool": {
     "must_not": [
       {"match": {"title": "good hearts sea"}},
       {"match": {"overview": "good hearts sea"}}
     ]
   }
  }
}
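The three clause types combine like boolean operators. The sketch below is a toy evaluator, not how Elasticsearch executes queries: it models only whether a document matches and leaves out scoring. The movie document and the `contains` predicate are made up for illustration.

```python
def bool_matches(doc, must=(), should=(), must_not=()):
    """Toy evaluation of a bool query; each clause is a predicate on the document."""
    if not all(clause(doc) for clause in must):
        return False
    if any(clause(doc) for clause in must_not):
        return False
    # Without must (or filter) clauses, at least one should clause has to match.
    if should and not must:
        return any(clause(doc) for clause in should)
    return True


def contains(field, word):
    """Predicate: the field holds the given word (case-insensitive)."""
    return lambda doc: word in doc[field].lower().split()


movie = {"title": "Good Will", "overview": "a story about the sea"}  # made-up document
title_good = contains("title", "good")
overview_good = contains("overview", "good")

print(bool_matches(movie, should=[title_good, overview_good]))    # True
print(bool_matches(movie, must=[title_good, overview_good]))      # False
print(bool_matches(movie, must_not=[title_good, overview_good]))  # False
```

Only the title contains "good", so the `should` form matches, the `must` form fails because the overview clause is false, and the `must_not` form fails because the title clause is true.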

III. Aggregation Queries (aggs)

Group the accounts by their balance and count each group:

GET /account/_search 
{
  "query": { 
    "bool": {
      "filter": [
        {
          "range": {
            "account": {
              "gte": 10001
            }
          }
        }
      ]
    }
  },
  "sort": [
    {
      "balance": {
        "order": "desc"
      }
    }
  ],
  "aggs":{
    "group_by_balance": {
      "terms": {
        "field": "balance"
      }
    }
  }
}

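This finds the documents whose account number is greater than or equal to 10001, sorts the hits by balance in descending order, and uses aggs to group them by balance and count each group: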

"aggregations" : {
    "group_by_balance" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : 30000,
          "doc_count" : 2
        },
        {
          "key" : 10000,
          "doc_count" : 1
        },
        {
          "key" : 20000,
          "doc_count" : 1
        }
      ]
    }
  }

As shown, the end of the response carries the per-group counts.
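To make the three parts of the request concrete, the sketch below reproduces them in plain Python over the four sample documents. It is an in-memory imitation, not the Elasticsearch implementation; the bucket ordering follows the default for a `terms` aggregation (highest `doc_count` first, ties broken by ascending key).

```python
from collections import Counter

accounts = [
    {"account": 10001, "balance": 10000, "name": "test1"},
    {"account": 10002, "balance": 20000, "name": "test2"},
    {"account": 10003, "balance": 30000, "name": "张三"},
    {"account": 10004, "balance": 30000, "name": "王五"},
]

# filter: account >= 10001
filtered = [doc for doc in accounts if doc["account"] >= 10001]

# sort: balance descending (applies to the hits, not to the buckets)
hits = sorted(filtered, key=lambda doc: doc["balance"], reverse=True)

# terms aggregation: count per distinct balance,
# ordered by doc_count descending, then key ascending
counts = Counter(doc["balance"] for doc in filtered)
buckets = [
    {"key": key, "doc_count": count}
    for key, count in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
]
print(buckets)
# [{'key': 30000, 'doc_count': 2}, {'key': 10000, 'doc_count': 1}, {'key': 20000, 'doc_count': 1}]
```

The `sort` clause and the aggregation are independent: sorting changes the order of `hits`, while the buckets keep their own ordering.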

This article was written and shared by mirson. Thank you for your support; I hope you found it useful!
