Elasticsearch: importing the MovieLens test data with Logstash

1. MovieLens data

https://grouplens.org/dataset…
For learning and practice, the smallest dataset is enough:
[ml-latest-small](https://files.grouplens.org/d…)

2. Logstash configuration file

In Logstash's config directory, copy logstash-sample.conf to a new file named logstash-movies.conf and give it the following content:

# Logstash pipeline for this tutorial:
# movies.csv -> Logstash -> Elasticsearch

input {
  file {
    path => "/export/_backup/elk_bak/ml-latest-small/movies.csv"
    # Read the file from the start instead of only tailing new lines.
    start_position => "beginning"
    # Do not persist the read offset, so every run re-reads the whole file.
    sincedb_path => "/dev/null"
  }
}

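filter {
  # Parse each CSV line into three named fields.
  csv {
    separator => ","
    columns => ["id", "content", "genre"]
  }

  # Turn "Comedy|Drama" into an array and drop fields that are not needed.
  mutate {
    split => { "genre" => "|"}
    remove_field => ["path", "host", "@timestamp", "message"]
  }

  # Split "Title (1991)" on "(": element 0 is the title, element 1 starts with the year.
  mutate {
    split => { "content" => "(" }
    add_field => { "title" => "%{[content][0]}"}
    add_field => { "year" => "%{[content][1]}"}
  }

  # Convert the year to an integer, trim the title, and drop the helper field.
  mutate {
    convert => {
      "year" => "integer"
    }
    strip => ["title"]
    remove_field => ["path", "host", "@timestamp", "content"]
  }
}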

output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "movies"
    document_id => "%{id}"
    #user => "user"
    #password => "password"
  }
  stdout {}
}
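
To make the filter chain easier to follow, here is a minimal Python sketch that approximates what the csv and mutate filters do to a single line. It is an illustration, not how Logstash is implemented. The function name `transform` is made up for this example. The first sample row is reconstructed from the console output shown in step 3, and the second row is hypothetical. The sketch keeps only the leading digits of the year text, which matches the `1991` seen in that output. It returns `None` when no digits are found and does not try to reproduce what Logstash does in that case.

```python
import csv
import re


def transform(line):
    """Approximate the csv and mutate filters for one line of movies.csv."""
    # csv filter: columns => ["id", "content", "genre"]
    movie_id, content, genre = next(csv.reader([line]))

    # mutate: split content on "(" and use elements 0 and 1
    parts = content.split("(")
    title = parts[0].strip()  # mutate: strip => ["title"]
    year_text = parts[1] if len(parts) > 1 else ""

    # Assumption: keep the leading digits only, so "1991)" becomes 1991.
    digits = re.match(r"\s*(\d+)", year_text)

    return {
        "id": movie_id,
        "title": title,
        "year": int(digits.group(1)) if digits else None,
        "genre": genre.split("|"),  # mutate: split genre on "|"
    }


# Row reconstructed from the console output in step 3.
row = transform("193609,Andrew Dice Clay: Dice Rules (1991),Comedy")
print(row)

# Hypothetical row whose title contains an extra pair of parentheses.
tricky = transform('1,"Some Film (Alternate Title) (1999)",Drama|Comedy')
print(tricky)
```

The second row shows a limitation of splitting on "(": when a title carries its own parentheses, element 1 is the alternate title rather than the year, so the year cannot be recovered this way. For a learning exercise that may be acceptable, but it is worth knowing before relying on the year field.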

3. Run the import

bin/logstash -f config/logstash-movies.conf
Logstash takes a little while to start.
The console then prints output like the following:

......
{
          "id" => "193609",
       "genre" => [
        [0] "Comedy"
    ],
       "title" => "Andrew Dice Clay: Dice Rules",
    "@version" => "1",
        "year" => 1991
}

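Once the console stops printing, press Ctrl+C to stop Logstash.

4. Check in Kibana that the data reached the index

If the imported index appears under Index Management, the import succeeded!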
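
The same check can also be done without Kibana. The sketch below asks Elasticsearch's `_count` endpoint how many documents the index holds. It assumes the host and index name from the output section of the config above and an unsecured cluster; if security is enabled, the request would also need credentials. The function name `count_documents` and its `opener` parameter are made up for this example.

```python
import json
from urllib.request import urlopen


def count_documents(base_url, index, opener=urlopen):
    """Return the document count reported by GET <base_url>/<index>/_count."""
    # The opener is injectable so the function can be tried without a cluster.
    with opener(f"{base_url}/{index}/_count") as response:
        return json.load(response)["count"]
```

With Elasticsearch running locally, `count_documents("http://localhost:9200", "movies")` should return a number greater than zero once the import has finished.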
