Supported Gremlin Steps#

  1. Introduction

  2. Standard Steps

    1. Source

    2. Expand

    3. Filter

    4. Project

    5. Aggregate

    6. Order

    7. Statistics

    8. Union

    9. Match

    10. Subgraph

    11. Identity

    12. Unfold

  3. Syntactic Sugars

    1. PathExpand

    2. Expression

    3. Aggregate(Group)

  4. Limitations

Introduction#

This document describes how to work with the Gremlin graph traversal language in GraphScope. On the one hand, we retain the original syntax of most steps from standard Gremlin; on the other hand, the usage of some steps is extended to express more complex situations that arise in real-world scenarios.

Standard Steps#

We retain the original syntax of the following steps from the standard Gremlin.

Source#

V()#

The V()-step is meant to iterate over all vertices from the graph. Moreover, vertexIds can be injected into the traversal to select a subset of vertices.

Parameters:
vertexIds - to select a subset of vertices from the graph, each id is of integer type.

g.V()
g.V(1)
g.V(1,2,3)

E()#

The E()-step is meant to iterate over all edges from the graph. Moreover, edgeIds can be injected into the traversal to select a subset of edges.

Parameters:
edgeIds - to select a subset of edges from the graph, each id is of integer type.

g.E()
g.E(1)
g.E(1,2,3)

Expand#

outE()#

Map the vertex to its outgoing incident edges given the edge labels.

Parameters:
edgeLabels - the edge labels to traverse.

g.V().outE("knows")
g.V().outE("knows", "created")

inE()#

Map the vertex to its incoming incident edges given the edge labels.

Parameters:
edgeLabels - the edge labels to traverse.

g.V().inE("knows")
g.V().inE("knows", "created")

bothE()#

Map the vertex to its incident edges given the edge labels.

Parameters:
edgeLabels - the edge labels to traverse.

g.V().bothE("knows")
g.V().bothE("knows", "created")

out()#

Map the vertex to its outgoing adjacent vertices given the edge labels.

Parameters:
edgeLabels - the edge labels to traverse.

g.V().out("knows")
g.V().out("knows", "created")

in()#

Map the vertex to its incoming adjacent vertices given the edge labels.

Parameters:
edgeLabels - the edge labels to traverse.

g.V().in("knows")
g.V().in("knows", "created")

both()#

Map the vertex to its adjacent vertices given the edge labels.

Parameters:
edgeLabels - the edge labels to traverse.

g.V().both("knows")
g.V().both("knows", "created")

outV()#

Map the edge to its outgoing/tail incident vertex.

g.V().inE().outV() # = g.V().in()

inV()#

Map the edge to its incoming/head incident vertex.

g.V().outE().inV() # = g.V().out()

otherV()#

Map the edge to the incident vertex that was not just traversed from in the path history.

g.V().bothE().otherV() # = g.V().both()

bothV()#

Map the edge to its incident vertices.

g.V().outE().bothV() # both endpoints of the outgoing edges

Filter#

hasId()#

The hasId()-step is meant to filter graph elements based on their identifiers.

Parameters:
elementIds - identifiers of the elements.

g.V().hasId(1) # = g.V(1)
g.V().hasId(1,2,3) # = g.V(1,2,3)

hasLabel()#

The hasLabel()-step is meant to filter graph elements based on their labels.

Parameters:
labels - labels of the elements.

g.V().hasLabel("person")
g.V().hasLabel("person", "software")

has()#

The has()-step is meant to filter graph elements by applying predicates on their properties.

Parameters:

  • propertyKey - the key of the property to filter on for existence.

    g.V().has("name") # find vertices containing property `name`
    
  • propertyKey - the key of the property to filter on,
    value - the value to compare the accessor value to for equality.

    g.V().has("age", 10)
    g.V().has("name", "marko")
    g.E().has("weight", 1.0)
    
  • propertyKey - the key of the property to filter on,
    predicate - the filter to apply to the key’s value.

    g.V().has("age", P.eq(10))
    g.V().has("age", P.neq(10))
    g.V().has("age", P.gt(10))
    g.V().has("age", P.lt(10))
    g.V().has("age", P.gte(10))
    g.V().has("age", P.lte(10))
    g.V().has("age", P.within([10, 20]))
    g.V().has("age", P.without([10, 20]))
    g.V().has("age", P.inside(10, 20))
    g.V().has("age", P.outside(10, 20))
    g.V().has("age", P.not(P.eq(10))) # = g.V().has("age", P.neq(10))
    g.V().has("name", TextP.startingWith("mar"))
    g.V().has("name", TextP.endingWith("rko"))
    g.V().has("name", TextP.containing("ark"))
    g.V().has("name", TextP.notStartingWith("mar"))
    g.V().has("name", TextP.notEndingWith("rko"))
    g.V().has("name", TextP.notContaining("ark"))
    
  • label - the label of the Element,
    propertyKey - the key of the property to filter on,
    value - the value to compare the accessor value to for equality.

    g.V().has("person", "id", 1) # = g.V().hasLabel("person").has("id", 1)
    
  • label - the label of the Element,
    propertyKey - the key of the property to filter on,
    predicate - the filter to apply to the key’s value.

    g.V().has("person", "age", P.eq(10)) # = g.V().hasLabel("person").has("age", P.eq(10))
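The boundary behavior of the range predicates is easy to misread, so here is a minimal plain-Python sketch of a few of them. It assumes GraphScope follows the standard TinkerPop semantics (both bounds of `P.inside` and `P.outside` are exclusive); the helper names are hypothetical stand-ins and not part of any GraphScope or gremlinpython API.

```python
# Plain-Python stand-ins for a few predicates. These helper names are
# hypothetical and are not part of GraphScope or gremlinpython.

def inside(low, high):
    # P.inside(low, high): strictly greater than low and strictly less than high
    return lambda x: low < x < high

def outside(low, high):
    # P.outside(low, high): strictly less than low or strictly greater than high
    return lambda x: x < low or x > high

def within(values):
    # P.within([...]): the value is one of the listed values
    return lambda x: x in values

def without(values):
    # P.without([...]): the value is none of the listed values
    return lambda x: x not in values

def starting_with(prefix):
    # TextP.startingWith(prefix)
    return lambda s: s.startswith(prefix)

ages = [10, 15, 20, 25]
inside_ages = [a for a in ages if inside(10, 20)(a)]
outside_ages = [a for a in ages if outside(10, 20)(a)]
within_ages = [a for a in ages if within([10, 20])(a)]
without_ages = [a for a in ages if without([10, 20])(a)]
print(inside_ages, outside_ages, within_ages, without_ages)
```

Note that the boundary values 10 and 20 satisfy neither `inside` nor `outside` in this model.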
    

hasNot()#

The hasNot()-step is meant to filter graph elements based on the non-existence of properties.

Parameters:
propertyKey - the key of the property to filter on for non-existence.

g.V().hasNot("age") # find vertices not-containing property `age`

is()#

The is()-step is meant to filter the object if it is unequal to the provided value or fails the provided predicate.

Parameters:

  • value - the value that the object must equal.

    g.V().out().count().is(1)
    
  • predicate - the filter to apply.

    g.V().out().count().is(P.eq(1))
    

where(traversal)#

The where(traversal)-step is meant to filter the current object by applying it to the nested traversal.

Parameters:
whereTraversal - the traversal to apply.

g.V().where(out().count())
g.V().where(out().count().is(gt(0)))

where(predicate)#

The where(predicate)-step is meant to filter the traverser based on the predicate acting on different tags.

Parameters:

  • predicate - the predicate containing another tag to apply.

    # is the current entry equal to the entry referred by `a`?
    g.V().as("a").out().out().where(P.eq("a"))
    
  • startKey - the tag containing the object to filter,
    predicate - the predicate containing another tag to apply.

    # is the entry referred by `b` equal to the entry referred by `a`?
    g.V().as("a").out().out().as("b").where("b", P.eq("a"))
    

The by() can be applied to a number of different steps to alter their behaviors. Here are some usages of the modulated by()-step after a where-step:

  • empty - this form is essentially an identity() modulation.

     # = g.V().as("a").out().out().as("b").where("b", P.eq("a"))
    g.V().as("a").out().out().as("b").where("b", P.eq("a")).by()
    
  • propertyKey - filter by the property value of the specified tag given the property key.

    # whether entry `b` and entry `a` have the same property value of `name`?
    g.V().as("a").out().out().as("b").where("b", P.eq("a")).by("name")
    
  • traversal - filter by the computed value after applying the specified tag to the nested traversal.

    # whether entry `b` and entry `a` have the same count of one-hop neighbors?
    g.V().as("a").out().out().as("b").where("b", P.eq("a")).by(out().count())
    

not(traversal)#

The not()-step is the opposite of the where()-step: it removes objects from the traversal stream when the traversal provided as an argument returns an object, and keeps them when it returns none.

Parameters:
notTraversal - the traversal to filter by.

g.V().not(out().count())
g.V().not(out().count().is(gt(0)))
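The following plain-Python sketch contrasts where(traversal) and not(traversal) on a toy graph. It assumes the standard TinkerPop rule that a traverser passes where() when the nested traversal yields at least one result; the adjacency data and function names are hypothetical. Under that assumption, a nested `out().count()` always yields one value (possibly 0), so it only becomes selective once `is(gt(0))` is appended.

```python
# Toy adjacency list: vertex id -> outgoing neighbours (hypothetical data).
OUT = {1: [2, 3, 4], 2: [], 3: [], 4: [3, 5], 5: [], 6: [3]}

def where(vertices, nested):
    # keep a vertex when the nested traversal yields at least one result
    return [v for v in vertices if any(True for _ in nested(v))]

def not_(vertices, nested):
    # keep a vertex when the nested traversal yields nothing
    return [v for v in vertices if not any(True for _ in nested(v))]

def out(v):
    # models out(): one result per outgoing neighbour
    return iter(OUT[v])

def out_count(v):
    # models out().count(): always emits exactly one value, even when it is 0
    yield len(OUT[v])

def out_count_gt0(v):
    # models out().count().is(gt(0)): emits the count only when it is positive
    n = len(OUT[v])
    if n > 0:
        yield n

vertices = sorted(OUT)
print(where(vertices, out_count))      # every vertex passes
print(where(vertices, out_count_gt0))  # only vertices with outgoing edges
print(not_(vertices, out_count_gt0))   # only vertices without outgoing edges
```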

dedup()#

Remove all duplicates in the traversal stream up to this point.

Parameters:
dedupLabels - composition of the given labels determines de-duplication. No labels implies current object.

g.V().dedup()
g.V().as("a").out().dedup("a") # dedup by entry `a`
g.V().as("a").out().as("b").dedup("a", "b") # dedup by the composition of entry `a` and `b`

Usages of the modulated by()-step:

  • propertyKey - dedup by the property value of the current object or the specified tag given the property key.

    # dedup by the property value of `name` of the current entry
    g.V().dedup().by("name")
    # dedup by the property value of `name` of the entry `a`
    g.V().as("a").out().dedup("a").by("name")
    
  • token - dedup by the token value of the current object or the specified tag.

    g.V().dedup().by(T.id)
    g.V().dedup().by(T.label)
    g.V().as("a").out().dedup("a").by(T.id)
    g.V().as("a").out().dedup("a").by(T.label)
    
  • traversal - dedup by the computed value after applying the current object or the specified tag to the nested traversal.

    g.V().dedup().by(out().count())
    g.V().as("a").out().dedup("a").by(out().count())
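As a rough mental model, dedup() with tags and a by() modulator can be viewed as de-duplication on a computed key. The sketch below is plain Python with hypothetical data; it keeps the first traverser seen for each key, which is an assumption, since a distributed engine may keep a different representative.

```python
def dedup(traversers, key=lambda t: t):
    # keep the first traverser observed for each distinct key
    seen = set()
    kept = []
    for t in traversers:
        k = key(t)
        if k not in seen:
            seen.add(k)
            kept.append(t)
    return kept

# (a, b) pairs as produced by g.V().as("a").out().as("b") on some graph
pairs = [(1, 2), (1, 4), (4, 3), (1, 2)]
by_a = dedup(pairs, key=lambda p: p[0])   # models dedup("a")
by_a_b = dedup(pairs)                     # models dedup("a", "b")

# vertices with a `name` property, models dedup().by("name")
people = [{"id": 1, "name": "marko"}, {"id": 2, "name": "vadas"}, {"id": 3, "name": "marko"}]
by_name = dedup(people, key=lambda v: v["name"])
print(by_a, by_a_b, [v["id"] for v in by_name])
```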
    

Project#

id()#

The id()-step is meant to map the graph element to its identifier.

g.V().id()

label()#

The label()-step is meant to map the graph element to its label.

g.V().label()

constant()#

The constant()-step is meant to map any object to a fixed object value.

Parameters:
value - a fixed object value.

g.V().constant(1)
g.V().constant("marko")
g.V().constant(1.0)

valueMap()#

The valueMap()-step is meant to map the graph element to a map of the property entries according to their actual properties. If no property keys are provided, then all property values are retrieved.

Parameters:
propertyKeys - the properties to retrieve.

g.V().valueMap()
g.V().valueMap("name")
g.V().valueMap("name", "age")

values()#

The values()-step is meant to map the graph element to the value of the associated property given the provided property key. Unlike standard Gremlin, we allow only one property key as the argument to values(), so that the step is implemented as a map instead of a flat-map.

Parameters:
propertyKey - the property to retrieve its value from.

g.V().values("name")

elementMap()#

The elementMap()-step is meant to map the graph element to a map of T.id, T.label and the property values according to the given keys. If no property keys are provided, then all property values are retrieved.


Parameters:
propertyKeys - the properties to retrieve.

g.V().elementMap()
g.V().elementMap("name")
g.V().elementMap("name", "age")

select()#

The select()-step is meant to map the traverser to the object specified by the selectKey or to a map projection of sideEffect values.

Parameters:
selectKeys - the keys to project.

g.V().as("a").select("a")
g.V().as("a").out().as("b").select("a", "b")

Usages of the modulated by()-step:

  • empty - an identity() modulation.

    # = g.V().as("a").select("a")
    g.V().as("a").select("a").by()
    # = g.V().as("a").out().as("b").select("a", "b")
    g.V().as("a").out().as("b").select("a", "b").by().by()
    
  • token - project the token value of the specified tag.

    g.V().as("a").select("a").by(T.id)
    g.V().as("a").select("a").by(T.label)
    
  • propertyKey - project the property value of the specified tag given the property key.

    g.V().as("a").select("a").by("name")
    
  • traversal - project the computed value after applying the specified tag to the nested traversal.

    g.V().as("a").select("a").by(valueMap("name", "id"))
    g.V().as("a").select("a").by(out().count())
    

Aggregate#

count()#

Count the number of traverser(s) up to this point.

g.V().count()

fold()#

Rolls up objects in the stream into an aggregate list.

# select the first 10 vertices from the stream and fold them into a single list
g.V().limit(10).fold()

sum()#

Sum the traverser values up to this point.

g.V().values("age").sum()

min()#

Determines the minimum value in the stream.

g.V().values("age").min()

max()#

Determines the maximum value in the stream.

g.V().values("age").max()

mean()#

Compute the average value in the stream.

g.V().values("age").mean()

group()#

Organize objects in the stream into a Map. Calls to group() are typically accompanied with by() modulators which help specify how the grouping should occur.

Usages of the key by()-step:

  • empty - group the elements in the stream by the current value.

    g.V().group().by() # = g.V().group()
    
  • propertyKey - group the elements in the stream by the property value of the current object given the property key.

    g.V().group().by("name")
    
  • traversal - group the elements in the stream by the computed value after applying the current object to the nested traversal.

    g.V().group().by(values("name")) # = g.V().group().by("name")
    g.V().group().by(out().count())
    

Usages of the value by()-step:

  • empty - fold elements in each group into a list, which is a default behavior.

    g.V().group().by().by() # = g.V().group()
    
  • propertyKey - for each element in the group, get their property values according to the given keys.

    g.V().group().by().by("name")
    
  • aggregateFunc - aggregate function to apply in each group.

    g.V().group().by().by(count())
    g.V().group().by().by(fold())
    # get the property values of `name` of the vertices in each group list
    g.V().group().by().by(values("name").fold()) # = g.V().group().by().by("name")
    # sum the property values of `age` in each group
    g.V().group().by().by(values("age").sum())
    # find the minimum value of `age` in each group
    g.V().group().by().by(values("age").min())
    # find the maximum value of `age` in each group
    g.V().group().by().by(values("age").max())
    # calculate the average value of `age` in each group
    g.V().group().by().by(values("age").mean())
    # count the number of distinct elements in each group
    g.V().group().by().by(dedup().count())
    # de-duplicate in each group list
    g.V().group().by().by(dedup().fold())
    

groupCount()#

Counts the number of times a particular object has been part of a traversal, returning a map where the object is the key and the value is the count.

Usages of the key by()-step:

  • empty - group the elements in the stream by the current value.

    g.V().groupCount().by() # = g.V().groupCount()
    
  • propertyKey - group the elements in the stream by the property value of the current object given the property key.

    g.V().groupCount().by("name")
    
  • traversal - group the elements in the stream by the computed value after applying the current object to the nested traversal.

    g.V().groupCount().by(values("name")) # = g.V().groupCount().by("name")
    g.V().groupCount().by(out().count())
    

Order#

order()#

Order all the objects in the traversal up to this point and then emit them one-by-one in their ordered sequence.

Usages of the modulated by()-step:

  • empty - order by the current object in ascending order, which is a default behavior.

    g.V().order().by() # = g.V().order()
    
  • order - the comparator to apply typically for some order (asc | desc | shuffle).

    g.V().order().by(Order.asc) # = g.V().order()
    g.V().order().by(Order.desc)
    
  • propertyKey - order by the property value of the current object given the property key.

    g.V().order().by("name") # default order is asc
    g.V().order().by("age")
    
  • traversal - order by the computed value after applying the current object to the nested traversal.

    g.V().order().by(out().count()) # default order is asc
    
  • propertyKey - order by the property value of the current object given the property key,
    order - the comparator to apply typically for some order.

    g.V().order().by("name", Order.desc)
    
  • traversal - order by the computed value after applying the current object to the nested traversal,
    order - the comparator to apply typically for some order.

    g.V().order().by(out().count(), Order.desc)
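The by() modulators of order() behave like a sort key plus a direction. A minimal plain-Python sketch follows; the `name` values and out-degrees are taken from the person vertices used in the running examples elsewhere on this page, and the dictionary layout is an assumption made for illustration.

```python
# Person vertices with their out-degree (layout is hypothetical).
people = [
    {"name": "marko", "age": 29, "out": 3},
    {"name": "vadas", "age": 27, "out": 0},
    {"name": "josh", "age": 32, "out": 2},
    {"name": "peter", "age": 35, "out": 1},
]

# models order().by("name"): ascending is the default
by_name = [v["name"] for v in sorted(people, key=lambda v: v["name"])]

# models order().by("name", Order.desc)
by_name_desc = [v["name"] for v in sorted(people, key=lambda v: v["name"], reverse=True)]

# models order().by(out().count(), Order.desc)
by_out_desc = [v["name"] for v in sorted(people, key=lambda v: v["out"], reverse=True)]

print(by_name, by_name_desc, by_out_desc)
```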
    

Statistics#

limit()#

Filter the objects in the traversal by the number of them to pass through the stream, where only the first n objects are allowed as defined by the limit argument.

Parameters:
limit - the number at which to end the stream.

g.V().limit(10)

coin()#

Filter the object in the stream given a biased coin toss.

Parameters:
probability - the probability that the object will pass through.

g.V().coin(0.2) # range is [0.0, 1.0]
g.V().out().coin(0.2)

sample()#

Generate a certain number of sample results.

Parameters:
number - allow specified number of objects to pass through the stream.

g.V().sample(10)
g.V().out().sample(10)
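The difference between coin() and sample() can be sketched in plain Python: coin() decides independently for each object, so the result size varies, while sample() returns a fixed number of objects. The helpers below are hypothetical models; in particular, returning fewer objects when the stream is shorter than the requested number is an assumption carried over from standard Gremlin.

```python
import random

def coin(items, probability, rng):
    # each object passes independently with the given probability
    return [x for x in items if rng.random() < probability]

def sample(items, number, rng):
    # pick `number` objects at random (fewer if the stream is shorter)
    items = list(items)
    return rng.sample(items, min(number, len(items)))

rng = random.Random(42)          # seeded only to make the sketch repeatable
stream = list(range(100))
print(len(coin(stream, 0.2, rng)))    # varies around 20
print(len(sample(stream, 10, rng)))   # always 10
```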

Union#

union()#

Merges the results of an arbitrary number of traversals.

Parameters:
unionTraversals - the traversals to merge.

g.V().union(out(), out().out())

Match#

match()#

The match()-step provides a declarative form of graph patterns to match with. With match(), the user provides a collection of “sentences,” called patterns, that have variables defined that must hold true throughout the duration of the match(). For most of the complex graph patterns, it is usually much easier to express via match() than with single-path traversals.

Parameters:
matchSentences - define a collection of patterns. Each pattern consists of a start tag, a series of Gremlin steps (binders) and an end tag.

Supported binders within a pattern:

  • Expand: in()/out()/both(), inE()/outE()/bothE(), inV()/outV()/otherV()/bothV()

  • PathExpand

  • Filter: has()/not()/where()

g.V().match(__.as("a").out().as("b"), __.as("b").out().as("c"))
g.V().match(__.as("a").out().out().as("b"), where(__.as("a").out().as("b")))
g.V().match(__.as("a").out().out().as("b"), not(__.as("a").out().as("b")))
g.V().match(__.as("a").out().has("name", "marko").as("b"), __.as("b").out().as("c"))

Subgraph#

subgraph()#

An edge-induced subgraph extracted from the original graph.

Parameters:
graphName - the name of the side-effect key that will hold the subgraph.

g.E().subgraph("all")
g.V().has('name', "marko").outE("knows").subgraph("partial")
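"Edge-induced" means the subgraph is defined by the selected edges, and its vertex set is exactly the set of endpoints of those edges. A minimal plain-Python sketch, using a hypothetical edge list of (source, label, target) triples:

```python
def edge_induced_subgraph(edges):
    # the vertex set is derived from the endpoints of the kept edges
    kept_edges = list(edges)
    vertices = sorted({v for (src, _, dst) in kept_edges for v in (src, dst)})
    return vertices, kept_edges

# Hypothetical edge list: (source id, label, target id)
EDGES = [
    (1, "knows", 2), (1, "knows", 4), (1, "created", 3),
    (4, "created", 3), (4, "created", 5), (6, "created", 3),
]

# models g.E().subgraph("all")
all_vertices, all_edges = edge_induced_subgraph(EDGES)

# models selecting the outgoing `knows` edges of vertex 1 before subgraph("partial")
knows_from_1 = [e for e in EDGES if e[0] == 1 and e[1] == "knows"]
partial_vertices, partial_edges = edge_induced_subgraph(knows_from_1)
print(all_vertices, partial_vertices)
```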

Identity#

identity()#

The identity()-step maps the current object to itself.

g.V().identity().values("id")
g.V().hasLabel("person").as("a").identity().values("id")
g.V().has("name", "marko").union(identity(), out()).values("id")

Unfold#

unfold()#

The unfold()-step unrolls an iterator, iterable or map into a linear form.

g.V().fold().unfold().values("id")
g.V().fold().as("a").unfold().values("id")
g.V().has("name", "marko").fold().as("a").select("a").unfold().values("id")
g.V().out("1..3", "knows").with('RESULT_OPT', 'ALL_V').unfold()

Syntactic Sugars#

The following steps are extended to denote more complex situations.

PathExpand#

In graph querying, expanding a multi-hop path from a starting point is called PathExpand, which is commonly used in graph scenarios. Different scenarios also require different expansion strategies, e.g. outputting only simple paths, or outputting all vertices explored along the expanded path. We introduce the with()-step to configure the corresponding behaviors of the PathExpand-step.

out()#

Expand a multi-hop path along the outgoing edges, whose length is within the given range.

Parameters:
lengthRange - the lower and the upper bounds of the path length,
edgeLabels - the edge labels to traverse.

Usages of the with()-step:
keyValuePair - the options to configure the corresponding behaviors of the PathExpand-step.

# expand hops within the range of [1, 10) along the outgoing edges,
# vertices can be duplicated and only the end vertex should be kept
g.V().out("1..10").with('PATH_OPT', 'ARBITRARY').with('RESULT_OPT', 'END_V')
# expand hops within the range of [1, 10) along the outgoing edges,
# vertices and edges can be duplicated, and all vertices and edges along the path should be kept
g.V().out("1..10").with('PATH_OPT', 'ARBITRARY').with('RESULT_OPT', 'ALL_V_E')
# expand hops within the range of [1, 10) along the outgoing edges,
# vertices cannot be duplicated and all vertices should be kept
g.V().out("1..10").with('PATH_OPT', 'SIMPLE').with('RESULT_OPT', 'ALL_V')
# = g.V().out("1..10").with('PATH_OPT', 'ARBITRARY').with('RESULT_OPT', 'END_V')
g.V().out("1..10")
# expand hops within the range of [1, 10) along the outgoing edges whose label is `knows`,
# vertices can be duplicated and only the end vertex should be kept
g.V().out("1..10", "knows")
# expand hops within the range of [1, 10) along the outgoing edges whose label is `knows` or `created`,
# vertices can be duplicated and only the end vertex should be kept
g.V().out("1..10", "knows", "created")
# expand hops within the range of [1, 10) along the outgoing edges,
# and project the property `name` of every vertex along the path
g.V().out("1..10").with('RESULT_OPT', 'ALL_V').values("name")

Running Example:

gremlin> g.V().out("1..3", "knows").with('RESULT_OPT', 'ALL_V')
==>[v[1], v[2]]
==>[v[1], v[4]]
gremlin> g.V().out("1..3", "knows").with('RESULT_OPT', 'ALL_V_E')
==>[v[1], e[0][1-knows->2], v[2]]
==>[v[1], e[2][1-knows->4], v[4]]
gremlin> g.V().out("1..3", "knows").with('RESULT_OPT', 'END_V').endV()
==>v[2]
==>v[4]
gremlin> g.V().out("1..3", "knows").with('RESULT_OPT', 'ALL_V').values("name")
==>[marko, vadas]
==>[marko, josh]
gremlin> g.V().out("1..3", "knows").with('RESULT_OPT', 'ALL_V').valueMap("id","name")
==>{id=[[1, 2]], name=[[marko, vadas]]}
==>{id=[[1, 4]], name=[[marko, josh]]}
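To make the half-open range and the PATH_OPT / RESULT_OPT options concrete, here is a plain-Python sketch of PathExpand. It is an illustrative model rather than GraphScope's implementation: the function names are hypothetical, only lower bounds of at least 1 are handled, and the `knows` adjacency is read off the running example above.

```python
def parse_range(spec):
    # "1..3" denotes the half-open hop range [1, 3)
    lower, upper = spec.split("..")
    return int(lower), int(upper)

def path_expand(starts, adjacency, spec, simple=False):
    # returns every path whose hop count lies in [lower, upper);
    # simple=True models PATH_OPT=SIMPLE (no repeated vertex in a path)
    lower, upper = parse_range(spec)
    results = []
    frontier = [[s] for s in starts]
    for hops in range(1, upper):
        next_frontier = []
        for path in frontier:
            for nxt in adjacency.get(path[-1], []):
                if simple and nxt in path:
                    continue
                next_frontier.append(path + [nxt])
        if hops >= lower:
            results.extend(next_frontier)
        frontier = next_frontier
    return results

ALL_VERTICES = [1, 2, 3, 4, 5, 6]
KNOWS_OUT = {1: [2, 4]}                    # 1-knows->2, 1-knows->4
KNOWS_BOTH = {1: [2, 4], 2: [1], 4: [1]}   # same edges, both directions

all_v = path_expand(ALL_VERTICES, KNOWS_OUT, "1..3")   # RESULT_OPT=ALL_V
end_v = [p[-1] for p in all_v]                         # RESULT_OPT=END_V
both_arbitrary = path_expand(ALL_VERTICES, KNOWS_BOTH, "1..3")
both_simple = path_expand(ALL_VERTICES, KNOWS_BOTH, "1..3", simple=True)
print(all_v, end_v, len(both_arbitrary), len(both_simple))
```

In this model the arbitrary expansion over both directions yields ten paths, matching the both() running example, while the simple variant drops paths such as [1, 2, 1] that revisit a vertex.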

in()#

Expand a multi-hop path along the incoming edges, whose length is within the given range.

g.V().in("1..10").with('PATH_OPT', 'ARBITRARY').with('RESULT_OPT', 'END_V')

Running Example:

gremlin> g.V().in("1..3", "knows").with('RESULT_OPT', 'ALL_V')
==>[v[2], v[1]]
==>[v[4], v[1]]
gremlin> g.V().in("1..3", "knows").with('RESULT_OPT', 'ALL_V_E')
==>[v[2], e[0][1-knows->2], v[1]]
==>[v[4], e[2][1-knows->4], v[1]]
gremlin> g.V().in("1..3", "knows").with('RESULT_OPT', 'END_V').endV()
==>v[1]
==>v[1]

both()#

Expand a multi-hop path along the incident edges, whose length is within the given range.

g.V().both("1..10").with('PATH_OPT', 'ARBITRARY').with('RESULT_OPT', 'END_V')

Running Example:

gremlin> g.V().both("1..3", "knows").with('RESULT_OPT', 'ALL_V')
==>[v[2], v[1]]
==>[v[1], v[2]]
==>[v[1], v[4]]
==>[v[2], v[1], v[2]]
==>[v[2], v[1], v[4]]
==>[v[4], v[1]]
==>[v[1], v[2], v[1]]
==>[v[1], v[4], v[1]]
==>[v[4], v[1], v[2]]
==>[v[4], v[1], v[4]]
gremlin> g.V().both("1..3", "knows").with('RESULT_OPT', 'ALL_V_E')
==>[v[2], e[0][1-knows->2], v[1]]
==>[v[4], e[2][1-knows->4], v[1]]
==>[v[1], e[0][1-knows->2], v[2]]
==>[v[1], e[2][1-knows->4], v[4]]
==>[v[2], e[0][1-knows->2], v[1], e[0][1-knows->2], v[2]]
==>[v[2], e[0][1-knows->2], v[1], e[2][1-knows->4], v[4]]
==>[v[4], e[2][1-knows->4], v[1], e[0][1-knows->2], v[2]]
==>[v[4], e[2][1-knows->4], v[1], e[2][1-knows->4], v[4]]
==>[v[1], e[0][1-knows->2], v[2], e[0][1-knows->2], v[1]]
==>[v[1], e[2][1-knows->4], v[4], e[2][1-knows->4], v[1]]
gremlin> g.V().both("1..3", "knows").with('RESULT_OPT', 'END_V').endV()
==>v[1]
==>v[1]
==>v[2]
==>v[4]
==>v[2]
==>v[1]
==>v[1]
==>v[4]
==>v[2]
==>v[4]

endV()#

By default, all kept vertices are stored in a path collection, which can be unfolded by an endV()-step.

# a path collection containing the vertices within [1, 10) hops
g.V().out("1..10").with('RESULT_OPT', 'ALL_V')
# unfold vertices in the path collection
g.V().out("1..10").with('RESULT_OPT', 'ALL_V').endV()

Expression#

Expressions, expressed via the expr() syntactic sugar, have been introduced to facilitate writing expressions directly within steps such as select(), project(), where(), and group(). This update is part of an ongoing effort to standardize Gremlin’s expression syntax, making it more aligned with SQL expression syntax. The updated syntax, effective from version 0.27.0, streamlines user operations and enhances readability. Below, we detail the updated syntax definitions and point out key distinctions from the syntax used prior to version 0.26.0.

Literal:

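| Category | Syntax |
|----------|--------|
| string | `"marko"` |
| boolean | `true`, `false` |
| integer | `1`, `2`, `3` |
| long | `1l`, `1L` |
| float | `1.0f`, `1.0F` |
| double | `1.0`, `1.0d`, `1.0D` |
| list | `["marko", "vadas"]`, `[true, false]`, `[1, 2]`, `[1L, 2L]`, `[1.0F, 2.0F]`, `[1.0, 2.0]` |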

Variable:

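| Category | Description | Before 0.26.0 | Since 0.27.0 |
|----------|-------------|---------------|--------------|
| current | the current entry | `@` | `_` |
| current property | the property value of the current entry | `@.name` | `_.name` |
| tag | the specified tag | `@a` | `a` |
| tag property | the property value of the specified tag | `@a.name` | `a.name` |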

Operator:

| Category | Operation (Case-Insensitive) | Description | Before 0.26.0 | Since 0.27.0 |
|----------|------------------------------|-------------|---------------|--------------|
| logical | `=` | equal | `@.name == "marko"` | `_.name = "marko"` |
| logical | `<>` | not equal | `@.name != "marko"` | `_.name <> "marko"` |
| logical | `>` | greater than | `@.age > 10` | `_.age > 10` |
| logical | `<` | less than | `@.age < 10` | `_.age < 10` |
| logical | `>=` | greater than or equal | `@.age >= 10` | `_.age >= 10` |
| logical | `<=` | less than or equal | `@.age <= 10` | `_.age <= 10` |
| logical | `NOT` | negate the logical expression | `! (@.name == "marko")` | `NOT _.name = "marko"` |
| logical | `AND` | connect two logical expressions with AND | `@.name == "marko" && @.age > 10` | `_.name = "marko" AND _.age > 10` |
| logical | `OR` | connect two logical expressions with OR | `@.name == "marko" \|\| @.age > 10` | `_.name = "marko" OR _.age > 10` |
| logical | `IN` | whether the value of the current entry is in the given list | `@.name WITHIN ["marko", "vadas"]` | `_.name IN ["marko", "vadas"]` |
| logical | `IS NULL` | whether the value of the current entry is null | `@.age IS NULL` | `_.age IS NULL` |
| logical | `IS NOT NULL` | whether the value of the current entry is not null | `! (@.age ISNULL)` | `_.age IS NOT NULL` |
| arithmetical | `+` | addition | `@.age + 10` | `_.age + 10` |
| arithmetical | `-` | subtraction | `@.age - 10` | `_.age - 10` |
| arithmetical | `*` | multiplication | `@.age * 10` | `_.age * 10` |
| arithmetical | `/` | division | `@.age / 10` | `_.age / 10` |
| arithmetical | `%` | modulo | `@.age % 10` | `_.age % 10` |
| arithmetical | `POWER` | exponentiation | `@.age ^^ 3` | `POWER(_.age, 3)` |
| bitwise | `&` | bitwise AND | `@.age & 2` | `_.age & 2` |
| bitwise | `\|` | bitwise OR | `@.age \| 2` | `_.age \| 2` |
| bitwise | `^` | bitwise XOR | `@.age ^ 2` | `_.age ^ 2` |
| bit shift | `<<` | left shift | `@.age << 2` | `_.age << 2` |
| bit shift | `>>` | right shift | `@.age >> 2` | `_.age >> 2` |
| string regex match | `STARTS WITH` | whether the string starts with the given prefix | `@.name STARTSWITH "ma"` | `_.name STARTS WITH "ma"` |
| string regex match | `NOT STARTS WITH` | whether the string does not start with the given prefix | `! (@.name STARTSWITH "ma")` | `NOT _.name STARTS WITH "ma"` |
| string regex match | `ENDS WITH` | whether the string ends with the given suffix | `@.name ENDSWITH "ko"` | `_.name ENDS WITH "ko"` |
| string regex match | `NOT ENDS WITH` | whether the string does not end with the given suffix | `! (@.name ENDSWITH "ko")` | `NOT _.name ENDS WITH "ko"` |
| string regex match | `CONTAINS` | whether the string contains the given substring | `"ar" WITHIN @.name` | `_.name CONTAINS "ar"` |
| string regex match | `NOT CONTAINS` | whether the string does not contain the given substring | `"ar" WITHOUT @.name` | `NOT _.name CONTAINS "ar"` |

Function:

| Category | Function (Case-Insensitive) | Description | Before 0.26.0 | Since 0.27.0 |
|----------|-----------------------------|-------------|---------------|--------------|
| aggregate | `COUNT` | count the number of the elements | unsupported | `COUNT(_.age)` |
| aggregate | `SUM` | sum the values of the elements | unsupported | `SUM(_.age)` |
| aggregate | `MIN` | find the minimum value of the elements | unsupported | `MIN(_.age)` |
| aggregate | `MAX` | find the maximum value of the elements | unsupported | `MAX(_.age)` |
| aggregate | `AVG` | calculate the average value of the elements | unsupported | `AVG(_.age)` |
| aggregate | `COLLECT` | fold the elements into a list | unsupported | `COLLECT(_.age)` |
| aggregate | `HEAD(COLLECT())` | find the first value of the elements | unsupported | `HEAD(COLLECT(_.age))` |
| other | `LABELS` | get the labels of the specified tag which is a vertex | `@a.~label` | `LABELS(a)` |
| other | `TYPE` | get the type of the specified tag which is an edge | `@a.~label` | `TYPE(a)` |
| other | `LENGTH` | get the length of the specified tag which is a path | `@a.~len` | `LENGTH(a)` |

Expression in project or filter:

| Category | Description | Before 0.26.0 | Since 0.27.0 |
|----------|-------------|---------------|--------------|
| filter | filter the current traverser by the expression | `where(expr("@.name == \"marko\""))` | `where(expr(_.name = "marko"))` |
| project | project the current traverser to the value of the expression | `select(expr("@.name"))` | `select(expr(_.name))` |

Here we provide the precedence of the operators mentioned above, which is also based on the SQL standard.

| Precedence | Operator | Description | Associativity |
|------------|----------|-------------|---------------|
| 1 | `()`, `.`, `power()`, `count()`… | Parentheses, Member access, Function call | Left-to-right |
| 2 | `-a`, `+a` | Unary minus, Unary plus | Right-to-left |
| 3 | `*`, `/`, `%` | Multiplication, Division, Modulus | Left-to-right |
| 4 | `+`, `-`, `&`, `\|`, `^`, `<<`, `>>` | Addition, Subtraction, Bitwise AND, Bitwise OR, Bitwise XOR, Left shift, Right shift | Left-to-right |
| 5 | `STARTS WITH`, `ENDS WITH`, `CONTAINS`, `IN` | String regex match, Collection membership | Left-to-right |
| 6 | `=`, `<>`, `<`, `<=`, `>`, `>=` | Comparison | Left-to-right |
| 7 | `IS NULL`, `IS NOT NULL` | Nullness check | Left-to-right |
| 8 | `NOT` | Logical NOT | Right-to-left |
| 9 | `AND` | Logical AND | Left-to-right |
| 10 | `OR` | Logical OR | Left-to-right |

Running Examples#

gremlin> :submit g.V().where(expr(_.name = "marko"))
==>v[1]
gremlin> :submit g.V().as("a").where(expr(a.name = "marko" OR a.age > 10))
==>v[6]
==>v[1]
==>v[2]
==>v[4]
gremlin> :submit g.V().as("a").where(expr(a.age IS NULL)).values("name")
==>lop
==>ripple
gremlin> :submit g.V().as("a").where(expr(a.age IS NOT NULL)).values("name")
==>vadas
==>josh
==>marko
==>peter
gremlin> :submit g.V().as("a").where(expr(a.name STARTS WITH "ma"))
==>v[1]
gremlin> :submit g.V().select(expr(_.name))
==>vadas
==>josh
==>lop
==>ripple
==>marko
==>peter
gremlin> :submit g.V().hasLabel("person").select(expr(_.age ^ 1))
==>26
==>28
==>33
==>34
gremlin> :submit g.V().hasLabel("person").select(expr(POWER(_.age, 2)))
==>729
==>1024
==>1225
==>841
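The last two queries are easy to confuse: in the expression syntax `^` is bitwise XOR, while exponentiation is written `POWER()`. Python's `^` and `**` operators have the same meanings, so the results can be checked with a short sketch. The ages are those of the four person vertices shown in the group() running example on this page; results are sorted here because the engine's output order is not fixed.

```python
# Ages of the person vertices used in the running examples.
ages = {"vadas": 27, "josh": 32, "marko": 29, "peter": 35}

# models select(expr(_.age ^ 1)): bitwise XOR flips the lowest bit
xor_one = sorted(age ^ 1 for age in ages.values())

# models select(expr(POWER(_.age, 2))): exponentiation
squares = sorted(age ** 2 for age in ages.values())

print(xor_one)
print(squares)
```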

Aggregate (Group)#

The group()-step in standard Gremlin has limited capabilities: grouping can only be performed on a single key, and only one aggregate calculation can be applied in each group. This cannot meet the requirements of grouping by multiple keys or computing multiple values. Therefore, we further extend the capabilities of the group()-step, allowing multiple variables to be set and different aliases to be configured in the key by()-step and the value by()-step respectively.

Usages of the key by()-step:

# group by the property values of `name` and `age` of the current entry
group().by(values("name").as("k1"), values("age").as("k2"))
# group by the count of one-hop neighbors and the property value of `name` of the current entry
group().by(out().count().as("k1"), values("name").as("k2"))

Usages of the value by()-step:

# calculate the count of vertices and the sum of `age` respectively in each group
group().by("name").by(count().as("v1"), values("age").sum().as("v2"))

Running Example:

gremlin> g.V().hasLabel("person").group().by(values("name").as("k1"), values("age").as("k2"))
==>{[josh, 32]=[v[4]], [vadas, 27]=[v[2]], [peter, 35]=[v[6]], [marko, 29]=[v[1]]}
gremlin> g.V().hasLabel("person").group().by(out().count().as("k1"), values("name").as("k2"))
==>{[2, josh]=[v[4]], [0, vadas]=[v[2]], [3, marko]=[v[1]], [1, peter]=[v[6]]}
gremlin> g.V().hasLabel("person").group().by("name").by(count().as("v1"), values("age").sum().as("v2"))
==>{marko=[1, 29], peter=[1, 35], josh=[1, 32], vadas=[1, 27]}
gremlin> g.V().hasLabel("person").group().by("name").by(count().as("v1"), values("age").sum().as("v2")).select("v1", "v2")
==>{v1=1, v2=35}
==>{v1=1, v2=32}
==>{v1=1, v2=27}
==>{v1=1, v2=29}
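The extended group() can be thought of as bucketing by a tuple of key values and then applying a list of aggregate functions to each bucket. The sketch below is a plain-Python model with a hypothetical `group` helper; the vertex data mirrors the running example above.

```python
from collections import defaultdict

# Person vertices from the running example; `out` is the out-degree.
PEOPLE = [
    {"id": 1, "name": "marko", "age": 29, "out": 3},
    {"id": 2, "name": "vadas", "age": 27, "out": 0},
    {"id": 4, "name": "josh", "age": 32, "out": 2},
    {"id": 6, "name": "peter", "age": 35, "out": 1},
]

def group(rows, key_fns, value_fns):
    # bucket rows by the tuple of all key values, then apply every
    # aggregate function to the members of each bucket
    buckets = defaultdict(list)
    for row in rows:
        buckets[tuple(fn(row) for fn in key_fns)].append(row)
    return {k: [fn(members) for fn in value_fns] for k, members in buckets.items()}

# models group().by("name").by(count().as("v1"), values("age").sum().as("v2"))
count_and_sum = group(
    PEOPLE,
    key_fns=[lambda r: r["name"]],
    value_fns=[len, lambda members: sum(r["age"] for r in members)],
)

# models group().by(out().count().as("k1"), values("name").as("k2"))
by_two_keys = group(
    PEOPLE,
    key_fns=[lambda r: r["out"], lambda r: r["name"]],
    value_fns=[lambda members: [r["id"] for r in members]],
)
print(count_and_sum)
print(by_two_keys)
```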

Limitations#

Here we list the steps that are not yet supported. Some will be supported in the near future, while others will remain unsupported.

To be Supported#

The following steps will be supported in the near future.

path()#

Map the traverser to its path history.

g.V().out().out().path()
g.V().as("a").out().out().select("a").by("name").path()

unfold()#

Unrolls an iterator, iterable or map into a linear form.

g.V().fold().unfold()

local()#

g.V().fold().count(local)
g.V().values('age').fold().sum(local)

Will Not be Supported#

The following steps will remain unsupported.

repeat()#

  • repeat().times()
    In graph pattern scenarios, repeat().times() can be replaced equivalently by the PathExpand-step.

    # = g.V().out("2..3", "knows").endV()
    g.V().repeat(out("knows")).times(2)
    
    # = g.V().out("1..3", "knows").endV()
    g.V().repeat(out("knows")).emit().times(2)
    
    # = g.V().out("2..3", "knows").with('PATH_OPT', 'SIMPLE').endV()
    g.V().repeat(out("knows").simplePath()).times(2)
    
    # = g.V().out("1..3", "knows").with('PATH_OPT', 'SIMPLE').endV()
    g.V().repeat(out("knows").simplePath()).emit().times(2)
    
  • repeat().until()
    It is an imperative syntax, not a declarative one.
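The equivalence between repeat().times() and PathExpand listed above can be checked on a small graph with a plain-Python sketch. Both functions are hypothetical models written for illustration, the `knows` adjacency is invented, and results are compared after sorting because output order is not part of the claim.

```python
# Hypothetical `knows` adjacency: vertex id -> outgoing neighbours.
KNOWS = {1: [2, 3], 2: [3], 3: [4], 4: []}

def repeat_out(starts, times, emit=False):
    # models repeat(out("knows")).times(n), optionally with emit()
    current = list(starts)
    emitted = []
    for _ in range(times):
        current = [n for v in current for n in KNOWS[v]]
        if emit:
            emitted.extend(current)
    return emitted if emit else current

def path_expand_end_v(starts, lower, upper):
    # models out("lower..upper", "knows").endV(): end vertices of all
    # paths whose hop count lies in the half-open range [lower, upper)
    current = list(starts)
    ends = []
    for hops in range(1, upper):
        current = [n for v in current for n in KNOWS[v]]
        if hops >= lower:
            ends.extend(current)
    return ends

V = sorted(KNOWS)
exactly_two = sorted(repeat_out(V, 2))                # repeat(...).times(2)
one_or_two = sorted(repeat_out(V, 2, emit=True))      # repeat(...).emit().times(2)
print(exactly_two == sorted(path_expand_end_v(V, 2, 3)))
print(one_or_two == sorted(path_expand_end_v(V, 1, 3)))
```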

properties()#

The properties()-step retrieves and then unfolds properties from a graph element. The valueMap()-step returns all the properties of each graph element in map form, which is much clearer than the output of the properties()-step, because the latter mixes up the properties of all graph elements in the same output.

sideEffect#

The SideEffect-step requires global variables to be maintained during actual execution, which is hard to implement in distributed scenarios. For example:

  • group("a")

  • groupCount("a")

  • aggregate("a")

  • sack()

branch#

Currently, we only support the operations of merging multiple streams into one. The following splitting operations are unsupported:

  • branch()

  • choose()