Installing Elasticsearch and the IK Chinese Analysis Plugin

I. Install Java and configure the JAVA_HOME environment variable

Because Elasticsearch is built with Java, Java 8 or later must be installed before it can run. The same JVM version should be used on all Elasticsearch nodes and clients.

1. Install Java

Depending on your system, download the appropriate Java release from https://www.oracle.com/techne... and install it.

CentOS example: download the Java RPM package (here jdk-12.0.1_linux-x64_bin.rpm) and install it with rpm -ivh jdk-12.0.1_linux-x64_bin.rpm:

Preparing...                          ################################# [100%]
Updating / installing...
   1:jdk-12.0.1-2000:12.0.1-ga        ################################# [100%]

Ubuntu example: download the Java DEB package and install it with dpkg -i jdk-12.0.1_linux-x64_bin.deb. On Ubuntu you can also follow "How To Install Java with Apt-Get on Ubuntu 16.04" to install Java.

2. Configure JAVA_HOME

Locate the JDK installation path:

[root /usr/local/src]$ which java
/usr/bin/java
[root /usr/local/src]$ ls -l /usr/bin/java
lrwxrwxrwx 1 root root 22 Jul  5 17:54 /usr/bin/java -> /etc/alternatives/java
[root /usr/local/src]$ ls -l /etc/alternatives/java
lrwxrwxrwx 1 root root 29 Jul  5 17:54 /etc/alternatives/java -> /usr/java/jdk-12.0.1/bin/java

Following the symlinks, the Java installation directory is /usr/java/jdk-12.0.1 ...
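With the installation directory located, JAVA_HOME can be set system-wide. The following is a minimal sketch for CentOS, assuming the /usr/java/jdk-12.0.1 path found above (adjust it to wherever the symlinks resolve on your machine); it is not taken from the post itself:

# append JAVA_HOME to the system-wide profile (use ~/.bash_profile for a single user)
cat >> /etc/profile <<'EOF'
export JAVA_HOME=/usr/java/jdk-12.0.1
export PATH=$JAVA_HOME/bin:$PATH
EOF

# reload the profile and verify
source /etc/profile
echo $JAVA_HOME    # should print /usr/java/jdk-12.0.1
java -version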

July 8, 2019 · 2 min · jiezi

Configuring the Elasticsearch 6.5.4 IK analysis plugin: installation, testing, and extended dictionaries

Basic Elasticsearch configuration was covered briefly in the previous post; this one walks through installing the IK analyzer plugin, testing it, defining a custom extended dictionary, and basic usage, hopefully saving newcomers a few detours. Note: the IK analyzer version must match the Elasticsearch version exactly. Once the plugin is configured you can either set IK as the default analyzer or select it per index when creating the mapping.

1. The IK analyzer version must match Elasticsearch

My Elasticsearch version is 6.5.4, so the IK analyzer to download is also v6.5.4:

wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.5.4.tar.gz
wget https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v6.5.4/elasticsearch-analysis-ik-6.5.4.zip

Go into the Elasticsearch installation directory, from which the IK files will be unpacked into the plugins directory:

# root @ localhost in /data/elasticsearch-6.5.4 [17:18:23]
$ ls -l
total 4.8M
drwxrwxr-x  9 euser euser  198 12月 11 11:26 .
drwxr-xr-x. 7 root  root    90  1月 16 16:35 ..
drwxrwxr-x  3 euser euser 4.0K 12月 11 11:13 bin
drwxrwxr-x  2 euser euser  178 12月 11 11:32 config
drwxrwxr-x  3 euser euser   19 12月 11 11:25 data
-rwxrwxr-x  1 euser euser 4.3M 12月  6 22:30 elasticsearch-analysis-ik-6.5.4.zip
drwxrwxr-x  3 euser euser 4.0K 11月 30 08:02 lib
-rwxrwxr-x  1 euser euser  14K 11月 30 07:55 LICENSE.txt
drwxrwxrwx  2 euser euser 8.0K  2月 11 01:30 logs
drwxrwxr-x 28 euser euser 4.0K 11月 30 08:02 modules
-rwxrwxr-x  1 euser euser 395K 11月 30 08:01 NOTICE.txt
drwxrwxr-x  3 euser euser   25 12月 11 11:29 plugins
-rwxrwxr-x  1 euser euser 8.4K 11月 30 07:55 README.textile

Go into the plugins directory, create an analysis-ik folder, and move the unpacked IK files into it:

# root @ iZ2zedtbewsc8oa9i1cb4tZ in /data/elasticsearch-6.5.4 [18:01:37]
$ cd plugins/
$ mkdir analysis-ik
$ mv ../../../analysis-ik analysis-ik
# root @ iZ2zedtbewsc8oa9i1cb4tZ in /data/elasticsearch-6.5.4/plugins [18:04:29]
$ ls
analysis-ik
# root @ iZ2zedtbewsc8oa9i1cb4tZ in /data/elasticsearch-6.5.4/plugins [18:04:34]
$ ls -l ./analysis-ik/
total 1432
-rw-r--r-- 1 root root 263965 May  6  2018 commons-codec-1.9.jar
-rw-r--r-- 1 root root  61829 May  6  2018 commons-logging-1.2.jar
drwxr-xr-x 2 root root   4096 Aug 26 17:52 config
-rw-r--r-- 1 root root  54693 Dec 23 11:26 elasticsearch-analysis-ik-6.5.4.jar
-rw-r--r-- 1 root root 736658 May  6  2018 httpclient-4.5.2.jar
-rw-r--r-- 1 root root 326724 May  6  2018 httpcore-4.4.4.jar
-rw-r--r-- 1 root root   1805 Dec 23 11:26 plugin-descriptor.properties
-rw-r--r-- 1 root root    125 Dec 23 11:26 plugin-security.policy

To make IK the default analyzer, add the following parameter as the last line of the Elasticsearch config file config/elasticsearch.yml: index.analysis.analyzer.default.type: ik. This sets the default analyzer of every index to IK; alternatively, skip this step and enable IK per index through the mapping, as sketched below.

# root @ iZ2zedtbewsc8oa9i1cb4tZ in /data/elasticsearch-6.5.4 [18:33:16]
$ cd config/
# root @ iZ2zedtbewsc8oa9i1cb4tZ in /data/elasticsearch-6.5.4/config [18:33:21]
$ echo "index.analysis.analyzer.default.type: ik" >> elasticsearch.yml
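The per-index alternative mentioned above can look like the following. This is a minimal sketch, not taken from the post: the index name news and the field content are hypothetical, and it assumes the ik_max_word / ik_smart analyzers that ship with the plugin and the Elasticsearch 6.x single-type mapping (_doc).

curl -XPUT -H "Content-Type: application/json" 'http://localhost:9200/news' -d '
{
  "mappings": {
    "_doc": {
      "properties": {
        "content": {
          "type": "text",
          "analyzer": "ik_max_word",
          "search_analyzer": "ik_smart"
        }
      }
    }
  }
}'

With this mapping, text written to content is segmented with ik_max_word at index time, while queries against the field are segmented with the coarser ik_smart.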
2. Start Elasticsearch and test IK analysis

For easy testing, start it in the foreground:

./bin/elasticsearch

Create an index:

curl -XPUT http://localhost:9200/class

Run IK analysis and check the result:

curl -XGET -H "Content-Type: application/json" 'http://localhost:9200/class/_analyze?pretty' -d '{"analyzer": "ik_max_word", "text": "我是中国人,我爱我的祖国和人民"}'
{
  "tokens" : [
    { "token" : "我",     "start_offset" : 0,  "end_offset" : 1,  "type" : "CN_CHAR", "position" : 0 },
    { "token" : "是",     "start_offset" : 1,  "end_offset" : 2,  "type" : "CN_CHAR", "position" : 1 },
    { "token" : "中国人", "start_offset" : 2,  "end_offset" : 5,  "type" : "CN_WORD", "position" : 2 },
    { "token" : "中国",   "start_offset" : 2,  "end_offset" : 4,  "type" : "CN_WORD", "position" : 3 },
    { "token" : "国人",   "start_offset" : 3,  "end_offset" : 5,  "type" : "CN_WORD", "position" : 4 },
    { "token" : "我",     "start_offset" : 6,  "end_offset" : 7,  "type" : "CN_CHAR", "position" : 5 },
    { "token" : "爱我",   "start_offset" : 7,  "end_offset" : 9,  "type" : "CN_WORD", "position" : 6 },
    { "token" : "的",     "start_offset" : 9,  "end_offset" : 10, "type" : "CN_CHAR", "position" : 7 },
    { "token" : "祖国",   "start_offset" : 10, "end_offset" : 12, "type" : "CN_WORD", "position" : 8 },
    { "token" : "和",     "start_offset" : 12, "end_offset" : 13, "type" : "CN_CHAR", "position" : 9 },
    { "token" : "人民",   "start_offset" : 13, "end_offset" : 15, "type" : "CN_WORD", "position" : 10 }
  ]
}

3. Once testing is done, start Elasticsearch as a daemon:

./data/elasticsearch/bin/elasticsearch -d

To be continued ...
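For the extended-dictionary step mentioned in the introduction, here is a minimal sketch of how a custom word file is usually wired up with the medcl IK plugin. It assumes the plugin's standard IKAnalyzer.cfg.xml under plugins/analysis-ik/config/ and its ext_dict / ext_stopwords keys; the dictionary file name my_words.dic and the sample word are hypothetical.

cd /data/elasticsearch-6.5.4/plugins/analysis-ik/config

# hypothetical custom dictionary: one word per line, UTF-8 encoded
echo "云原生" > my_words.dic

# point the plugin at the custom dictionary (this replaces the shipped config file)
cat > IKAnalyzer.cfg.xml <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
    <comment>IK Analyzer extended configuration</comment>
    <!-- custom word dictionaries, paths relative to this config directory -->
    <entry key="ext_dict">my_words.dic</entry>
    <!-- custom stop-word dictionaries (left empty here) -->
    <entry key="ext_stopwords"></entry>
</properties>
EOF

# restart Elasticsearch afterwards so the extended dictionary is loaded,
# then re-run the _analyze request above to confirm the new word is kept whole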

February 22, 2019 · 3 min · jiezi