1. MovieLens data
https://grouplens.org/dataset…
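For learning and practice, the smallest dataset is enough:
[ml-latest-small](https://files.grouplens.org/d…)
2. Logstash configuration file:
In the logstash/config directory, make a copy of logstash-sample.conf, name it logstash-movies.conf, and give it the following content: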
# Logstash configuration for loading the MovieLens movies.csv file:
# file -> Logstash -> Elasticsearch pipeline.
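input {
  # Read movies.csv from the start on every run (no sincedb state is kept)
  file {
    path => "/export/_backup/elk_bak/ml-latest-small/movies.csv"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}
filter {
  # Parse each line into three named columns
  csv {
    separator => ","
    columns => ["id", "content", "genre"]
  }
  # Turn the pipe-separated genre string into an array
  mutate {
    split => { "genre" => "|" }
    remove_field => ["path", "host", "@timestamp", "message"]
  }
  # Split "Title (year)" on "(" and copy the two parts into new fields
  mutate {
    split => { "content" => "(" }
    add_field => { "title" => "%{[content][0]}" }
    add_field => { "year" => "%{[content][1]}" }
  }
  # Clean up: numeric year, trimmed title, drop the intermediate field
  mutate {
    convert => { "year" => "integer" }
    strip => ["title"]
    remove_field => ["path", "host", "@timestamp", "content"]
  }
}
output {
  # Index each movie under its own id so that re-runs overwrite instead of duplicating
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "movies"
    document_id => "%{id}"
    #user => "user"
    #password => "password"
  }
  # Also print every event to the console
  stdout {}
}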
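To make the effect of the filter block easier to follow, here is a minimal Python sketch that approximates it for a single CSV line. It is not how Logstash works internally. The function names `transform` and `leading_int` are made up for illustration, and the input line is reconstructed from the sample console output shown later in this post. The integer conversion is assumed to keep only the leading digits, which matches the `"year" => 1991` seen in that output.

```python
import csv
import io
import re


def leading_int(text):
    """Return the integer formed by the leading digits of text, or 0 if there are none."""
    match = re.match(r"\s*(\d+)", text)
    return int(match.group(1)) if match else 0


def transform(line):
    """Approximate the csv + mutate filters for a single line of movies.csv."""
    # csv filter: columns => ["id", "content", "genre"]
    movie_id, content, genre = next(csv.reader(io.StringIO(line)))
    # first mutate: split "genre" on "|"
    genres = genre.split("|")
    # second mutate: split "content" on "(", then use elements 0 and 1
    parts = content.split("(")
    # third mutate: strip => ["title"]
    title = parts[0].strip()
    # third mutate: convert "year" to integer; "1991)" becomes 1991
    # (assumption of this sketch: year is None when there is no "(" at all)
    year = leading_int(parts[1]) if len(parts) > 1 else None
    return {"id": movie_id, "genre": genres, "title": title, "year": year}


if __name__ == "__main__":
    print(transform("193609,Andrew Dice Clay: Dice Rules (1991),Comedy"))
```

Note that a title containing an extra pair of parentheses before the year puts something other than the year into element 1, so this sketch returns 0 for the year in that case. The Logstash config reads the same element, so it probably shares this limitation.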
3. Run the import
bin/logstash -f config/logstash-movies.conf
The import takes a while to run.
The console then prints output like the following:
......
{
          "id" => "193609",
       "genre" => [
        [0] "Comedy"
    ],
       "title" => "Andrew Dice Clay: Dice Rules",
    "@version" => "1",
        "year" => 1991
}
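Once the console stops printing output, press Ctrl+C to stop Logstash.
4. Check in Kibana that the data was imported into the index
If the imported index appears under Index Management, the import succeeded.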
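As an alternative to checking in Kibana, the document count can be read directly from Elasticsearch through its `_count` endpoint. The sketch below assumes that Elasticsearch is reachable at `http://localhost:9200` without authentication (the same host as in the output block above). The helper names `count_url` and `fetch_count` are hypothetical.

```python
import json
import urllib.error
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: same host as in the Logstash output block
INDEX = "movies"                  # index name from the Logstash output block


def count_url(base_url, index):
    """Build the URL of the _count endpoint for the given index."""
    return f"{base_url.rstrip('/')}/{index}/_count"


def fetch_count(base_url=ES_URL, index=INDEX, timeout=5):
    """Return the number of documents in the index, or None if the request fails."""
    try:
        with urllib.request.urlopen(count_url(base_url, index), timeout=timeout) as resp:
            return json.load(resp)["count"]
    except (urllib.error.URLError, OSError) as exc:
        # Covers connection errors and HTTP errors such as 401 or 404
        print(f"Could not query Elasticsearch: {exc}")
        return None


if __name__ == "__main__":
    print(fetch_count())
```

A count greater than zero suggests that the import reached the `movies` index. If security is enabled on the cluster, the request needs credentials, which this sketch does not handle.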