Getting Started with Gremlin

jiezi

6 years ago

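1. Introduction to Gremlin
Gremlin is the graph traversal language of the Apache TinkerPop framework. It is a functional, data-flow language that lets users express complex traversals or queries over a property graph concisely. Every Gremlin traversal is composed of a sequence of steps (which may be nested), and each step performs an atomic operation on the data stream.
The Gremlin language has three basic kinds of operations: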

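map-step: transforms the objects in the data stream;
filter-step: filters the objects in the data stream;
sideEffect-step: computes statistics (or another side effect) over the data stream while passing the objects through unchanged;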
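
The three kinds of step can be illustrated outside of Gremlin with a minimal Python sketch. The helper names `map_step`, `filter_step`, and `side_effect_step` are hypothetical and are not part of any TinkerPop API; they only model how each kind of step treats a stream of objects.

```python
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")
U = TypeVar("U")


def map_step(stream: Iterable[T], fn: Callable[[T], U]) -> Iterator[U]:
    """Transform every object in the stream into a new object."""
    for item in stream:
        yield fn(item)


def filter_step(stream: Iterable[T], predicate: Callable[[T], bool]) -> Iterator[T]:
    """Let through only the objects that satisfy the predicate."""
    for item in stream:
        if predicate(item):
            yield item


def side_effect_step(stream: Iterable[T], action: Callable[[T], None]) -> Iterator[T]:
    """Run an action for each object, then pass the object on unchanged."""
    for item in stream:
        action(item)
        yield item


# Illustrative data only; the names and ages mirror the sample graph used later.
people = [
    {"name": "marko", "age": 29},
    {"name": "vadas", "age": 27},
    {"name": "josh", "age": 32},
]

seen: List[str] = []  # filled in by the side-effect step
older = filter_step(people, lambda p: p["age"] > 28)
recorded = side_effect_step(older, lambda p: seen.append(p["name"]))
names = list(map_step(recorded, lambda p: p["name"]))
```

Because the helpers are generators, nothing runs until `list(...)` pulls objects through the chain, which is roughly how a traversal is only evaluated when it is iterated.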

Core concepts of the TinkerPop3 model

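Graph: maintains the collections of vertices and edges and provides access to the underlying database features, such as transactions.
Element: maintains a collection of properties and a string label that indicates what kind of element it is.
Vertex: extends Element and maintains the sets of its incoming and outgoing edges.
Edge: extends Element and maintains its incoming and outgoing vertices.
Property: a key-value pair.
VertexProperty: a property of a vertex. It holds a key-value pair and can also carry an additional collection of properties of its own. It extends Element as well, so it must have its own id and label.
Cardinality: one of "single", "list", or "set"; determines whether the value of a vertex property is a single value, a list, or a set.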
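
As a rough illustration of how these concepts relate, here is a minimal Python sketch. It is not the TinkerPop class hierarchy itself: the field names, the `connect` helper, and the `add_value` helper are assumptions made for this example, and `VertexProperty` is simplified to a list of values per key.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Dict, List, Optional


class Cardinality(Enum):
    SINGLE = "single"
    LIST = "list"
    SET = "set"


def add_value(values: List[Any], value: Any, cardinality: Cardinality) -> List[Any]:
    """Return the property values after adding `value` under the given cardinality."""
    if cardinality is Cardinality.SINGLE:
        return [value]           # a single value replaces whatever was there
    if cardinality is Cardinality.SET and value in values:
        return list(values)      # a set ignores duplicates
    return values + [value]      # a list (or a new set member) is appended


@dataclass(eq=False)
class Element:
    id: str
    label: str
    properties: Dict[str, List[Any]] = field(default_factory=dict)


@dataclass(eq=False)
class Vertex(Element):
    out_edges: List["Edge"] = field(default_factory=list)
    in_edges: List["Edge"] = field(default_factory=list)


@dataclass(eq=False)
class Edge(Element):
    out_vertex: Optional[Vertex] = None  # the vertex the edge leaves
    in_vertex: Optional[Vertex] = None   # the vertex the edge enters


def connect(edge_id: str, label: str, source: Vertex, target: Vertex) -> Edge:
    """Create an edge and register it on both of its endpoint vertices."""
    edge = Edge(id=edge_id, label=label, out_vertex=source, in_vertex=target)
    source.out_edges.append(edge)
    target.in_edges.append(edge)
    return edge


marko = Vertex("1", "person")
vadas = Vertex("2", "person")
knows = connect("e1", "knows", marko, vadas)
```

The `eq=False` setting keeps identity-based comparison, which avoids recursive equality checks between vertices and edges that reference each other.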

2. Gremlin Query Examples
First, a few of the core concepts of a graph:

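Schema: a schema is a description language; here it means the collection of all properties and types, including the properties of edges and vertices and the labels of edges and vertices;
Property type (PropertyKey): the property types that edges and vertices can use;
Vertex type (VertexLabel): the type of a vertex, such as User or Car;
Edge type (EdgeLabel): the type of an edge, such as know or use;
Vertex: a vertex in the graph, representing one node of the graph;
Edge: an edge in the graph, connecting two nodes; edges are either directed or undirected;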

Create property types
graph.schema().propertyKey("name").asText().ifNotExist().create()
graph.schema().propertyKey("age").asInt().ifNotExist().create()
graph.schema().propertyKey("city").asText().ifNotExist().create()
graph.schema().propertyKey("lang").asText().ifNotExist().create()
graph.schema().propertyKey("date").asText().ifNotExist().create()
graph.schema().propertyKey("price").asInt().ifNotExist().create()
Create vertex types
person = graph.schema().vertexLabel("person").properties("name", "age", "city").primaryKeys("name").ifNotExist().create()
software = graph.schema().vertexLabel("software").properties("name", "lang", "price").primaryKeys("name").ifNotExist().create()
Create edge types
knows = graph.schema().edgeLabel("knows").sourceLabel("person").targetLabel("person").properties("date").ifNotExist().create()
created = graph.schema().edgeLabel("created").sourceLabel("person").targetLabel("software").properties("date", "city").ifNotExist().create()
Create vertices and edges
marko = graph.addVertex(T.label, "person", "name", "marko", "age", 29, "city", "Beijing")
vadas = graph.addVertex(T.label, "person", "name", "vadas", "age", 27, "city", "Hongkong")
lop = graph.addVertex(T.label, "software", "name", "lop", "lang", "java", "price", 328)
josh = graph.addVertex(T.label, "person", "name", "josh", "age", 32, "city", "Beijing")
ripple = graph.addVertex(T.label, "software", "name", "ripple", "lang", "java", "price", 199)
peter = graph.addVertex(T.label, "person", "name", "peter", "age", 29, "city", "Shanghai")

marko.addEdge("knows", vadas, "date", "20160110")
marko.addEdge("knows", josh, "date", "20130220")
marko.addEdge("created", lop, "date", "20171210", "city", "Shanghai")
josh.addEdge("created", ripple, "date", "20151010", "city", "Beijing")
josh.addEdge("created", lop, "date", "20171210", "city", "Beijing")
peter.addEdge("created", lop, "date", "20171210", "city", "Beijing")
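
To make the shape of this sample graph easier to picture, the same six edges can be written as a plain Python edge list. This is only an in-memory model for reasoning about the data, not a client for any graph database; the helpers `out` and `in_` are hypothetical names that mimic following outgoing and incoming edges by label.

```python
from typing import List, Optional, Tuple

# (source, edge label, target), copied from the addEdge calls above.
EDGES: List[Tuple[str, str, str]] = [
    ("marko", "knows", "vadas"),
    ("marko", "knows", "josh"),
    ("marko", "created", "lop"),
    ("josh", "created", "ripple"),
    ("josh", "created", "lop"),
    ("peter", "created", "lop"),
]


def out(source: str, label: Optional[str] = None) -> List[str]:
    """Targets reached by following outgoing edges, optionally of one label."""
    return [dst for src, lbl, dst in EDGES
            if src == source and (label is None or lbl == label)]


def in_(target: str, label: Optional[str] = None) -> List[str]:
    """Sources found by following incoming edges, optionally of one label."""
    return [src for src, lbl, dst in EDGES
            if dst == target and (label is None or lbl == label)]


friends_of_marko = out("marko", "knows")
creators_of_lop = in_("lop", "created")
software_by_josh = out("josh", "created")
```

Reading the edges this way shows that the edge direction matters: "created" always points from a person to a piece of software, so finding the authors of `lop` means walking the edges backwards.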
Show the graph
g.V() // Use graph to create data and g to query it; g is simply graph.traversal()
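Query vertices
g.V().limit(5) // Query all vertices but return at most 5; the range(x, y) step can be used instead to return the vertices within a given range.
g.V().hasLabel('person') // Query the vertices whose label is 'person'.
g.V('11') // Query the vertex whose id is '11'.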
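Query edges
g.E() // Query all edges. Not recommended: with a large number of edges this is unreasonable, so add a filter condition or limit the number of results.
g.E('55-81-5') // Query the edge whose id is '55-81-5'.
g.E().hasLabel('knows') // Query the edges whose label is 'knows'.
g.V('46').outE('knows') // Query the outgoing edges labeled 'knows' of the vertex whose id is '46'.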
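Query properties
g.V().limit(3).valueMap() // Query all properties of the vertices (property keys may be passed to query only those properties; each vertex yields one result row containing all of its properties).
g.V().limit(1).label() // Query the label of the vertex.
g.V().limit(10).values('name') // Query the name property of the vertices (with no argument, all properties are queried; each property of a vertex yields its own result row, containing only the value and no key).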
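
The difference in result shape between the two property steps can be modeled with a small Python sketch. The functions `value_map` and `values` below are hypothetical stand-ins written for this example, not a Gremlin client; the sketch assumes that a value map wraps each value in a list, which is how TinkerPop typically returns it because a vertex property may hold several values.

```python
from typing import Any, Dict, List

# One vertex from the sample data, reduced to its properties.
marko: Dict[str, Any] = {"name": "marko", "age": 29, "city": "Beijing"}


def value_map(vertex: Dict[str, Any], *keys: str) -> Dict[str, List[Any]]:
    """One result per vertex: a map from property key to a list of values."""
    selected = keys or tuple(vertex.keys())
    return {key: [vertex[key]] for key in selected if key in vertex}


def values(vertex: Dict[str, Any], *keys: str) -> List[Any]:
    """One result per property: the bare values, with the keys dropped."""
    selected = keys or tuple(vertex.keys())
    return [vertex[key] for key in selected if key in vertex]


all_as_map = value_map(marko)
name_as_map = value_map(marko, "name")
all_as_values = values(marko)
name_as_values = values(marko, "name")
```

In practice a value map suits cases where the caller needs to know which key each value belongs to, while bare values are convenient for feeding a single property into a later step.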
Delete a vertex
g.V('600').drop() // Delete the vertex whose ID is 600.
Delete an edge
g.E('501-502-0').drop() // Delete the edge whose ID is '501-502-0'.
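Query second-degree friends and the number of common friends
// Query first-degree friends
g.V('1500771').out()
// Query second-degree friends (deduplicated, excluding the starting vertex)
g.V('1500771').out().out().dedup().not(hasId('1500771'))
// Query the number of common friends: count the two-hop simple paths from '1500771' to '2165197'
g.V('1500771').out().out().hasId('2165197').path().simplePath().count()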
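
The logic of the two-hop queries above can be checked on a tiny in-memory graph. The following Python sketch uses a made-up adjacency map with placeholder ids ("A" to "E"), not the real ids from the queries, and the function names are hypothetical. It mirrors the traversals step by step: two `out()` hops, deduplication, exclusion of the start vertex, and counting two-hop paths that never revisit a vertex.

```python
from typing import Dict, List

# Hypothetical "follows" relation: vertex id -> ids it points to.
FOLLOWS: Dict[str, List[str]] = {
    "A": ["B", "C"],
    "B": ["D", "A"],
    "C": ["D", "E"],
    "D": [],
    "E": [],
}


def out(ids: List[str]) -> List[str]:
    """One out() hop applied to every id in the stream (duplicates kept)."""
    return [target for vid in ids for target in FOLLOWS.get(vid, [])]


def second_degree(start: str) -> List[str]:
    """Two hops from start, deduplicated, with the start vertex removed."""
    result: List[str] = []
    for vid in out(out([start])):
        if vid != start and vid not in result:
            result.append(vid)
    return result


def common_friend_count(start: str, target: str) -> int:
    """Number of two-hop paths start -> mid -> target with no repeated vertex."""
    count = 0
    for mid in FOLLOWS.get(start, []):
        for end in FOLLOWS.get(mid, []):
            if end == target and len({start, mid, target}) == 3:
                count += 1
    return count


first = out(["A"])
second = second_degree("A")
common = common_friend_count("A", "D")
```

Each counted path corresponds to one intermediate vertex, so the count equals the number of vertices that the start vertex points to and that in turn point to the target. Note that, like the Gremlin query it mirrors, `second_degree` does not remove vertices that are also first-degree friends.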

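In addition, there is syntax for querying, traversal, filtering, paths, iteration, transformation, ordering, logic, statistics, branching, and more; see: http://tang.love/2018/11/15/g….
References: http://tang.love/2018/11/15/g…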
https://hugegraph.github.io/h…