Graph Transformations¶
We introduce a series of method that can append more labels to a existed grpah, and do projection over existed graph. We will also show how to make a complex property graph compatible with algorithms that can only run on simple graph. Finally, we show how to add the query result of algorithm back to graph as a property on vertex.
More specically, Graph
provides two methods for append labels, and one method for
projection.
def add_vertices(self, vertices, label="_", properties=[], vid_field=0):
pass
def add_edges(self, edges, label="_", properties=[], src_label=None, dst_label=None, src_field=0, dst_field=1):
pass
def project(self, vertices, edges):
pass
We have already seem add_vertices and add_edges in loading graphs, we use them to build a graph iteratively.
Further, we can use them to attach more vertex labels and edge labels to a existed graph. But this won’t modify the source graph, instead, it will return a new graph, which is based on the source graph.
Attach new labels¶
Take LDBC-SNB Property Graph as an example,We now load a subset of labels, as the source graph.
import graphscope
from pathlib import Path
from graphscope.framework.loader import Loader
sess = graphscope.session()
graph = sess.g(directed=directed)
graph = graph.add_vertices(Loader("person_0_0.csv", delimiter="|"), "person")
graph = graph.add_edges(Loader("person_knows_person_0_0.csv", delimiter="|"),
"knows", src_label="person", dst_label="person"
)
# graph has 1 vertex label "person"
print(graph.schema)
Now we have an loaded graph, let’s attach some new labels to it.
graph1 = graph.add_vertices(Loader("comment_0_0.csv", delimiter="|"), "comment")
# Now graph1 has 2 vertex labels "person" and "comment"
print(graph1.schema)
graph2 = graph1.add_edges(Loader("comment_replyOf_comment_0_0.csv", delimiter="|"),
"replyOf", src_label="comment", dst_label="comment"
)
# graph2 has 2 edge labels "knows" and "replyOf"
print(graph2.schema)
We can see each operation of add will produce a new graph. In implementation detail, their common labels will share the common memory, so it won’t copy the source graph.
Projection¶
In some scenario, we need to extract a subgraph from a complex graph. We do that by project.
def project(
self,
vertices: Mapping[str, Union[List[str], None]],
edges: Union[Mapping[str, Union[List[str], None]], None]
):
pass
The parameter definition means it’s a dict, the key is the label name, the value is a list of str, which is the name of properties. Specifically, if the value is None, it means select all properties.
A graph that produced by project should just like a normal property graph, and can be projected further.
Here’s some examples.
sub_graph = graph2.project(vertices={"person": ["firstName", "lastName"]}, edges={"knows": None})
# contains 1 vertex label "person", and 1 edge label "knows", with selected properties.
print(sub_graph.schema)
sub_graph2 = sub_graph.project(vertices={"person": []}, edges={"knows": ["creationDate"]})
# No properties on the vertex, and 1 property on the edge.
print(sub_graph2.schema)
Transform to simple graph implicitly¶
When an algorithm that only works on simple graph query a property graph, the property graph will be converted to a simple graph implicitly. If such transformation cannot be performed (the vertex label num and edge label num is not one, or has more than 1 property on vertex/edge), an exception will be raised.
from graphscope import wcc
ret = wcc(sub_graph2)
# wcc(graph2) # Error! More than 1 vertex label / edge label
# wcc(sub_graph) # Error! More than 1 property.
Add results back to graph as a property¶
The result ret produced in previous step can be add to a graph as a property of vertex.
Note the result can not only be added to the graph it directly queried on, but also the graph which produced the queried graph by project, as long as the vertex label that will be mutated is the same between the two graphs.
new_graph = sub_graph2.add_column(ret, selector={'cc': 'r'})
new_graph = sub_graph.add_column(ret, selector={'cc': 'r'})
new_graph = graph.add_column(ret, selector={'cc': 'r'})