GraphScope's Journey on
Graph Processing

Background

Exponential Data Growth at Alibaba

It was nearly 10 years ago, business was expanding exponentially and, therefore, data was growing rapidly. Many of them naturally could be modeled as graphs, and the need for graph computing emerged in many areas, such as

E-commerce transactions
Navigation routes and POIs
Video creator ratings

Early Adoptions

Initial graph problems were addressed in isolation.

2013 ~ 2020

Getting Insight of Graph Data

Graph Traversal for Exploration

Graph traversal - a process of visiting the nodes in a graph - is a key primitive in many online and interactive graph applications, e.g.,

De-facto

Gremlin-based Graph Traversal

Gremlin is the de-facto standard language that allows high-level and declarative programming for various graph operations.

Example

Cycle detection: to find a cycle occurs in a graph where a path loops back on itself to the originating vertex.


g.V().has('name','tom').as('a').repeat(out().simplePath())
 .times(LENGTH).where(out().as('a')).path()

MaxGraph and GAIA,
Systems Scaling Gremlin Queries

Architecture of MaxGraph. In which, GAIA is an optimization component for both compilation and execution based on MaxGraph.

Example

`Gremlin` Query Compilation

As shown in the cycle-detection example, a Gremlin query can be arbitrary composition of iterative and nested operations.

g.V().has('firstname','Tom').as('a')
  .repeat(out().simplePath()).times(k)
  .where(out().eq('a')).path()

The gremlin query is compiled to a dataflow plan, some operators are assigned SCOPE with context for parallel processing

Graph Analytics,
Whole Graph Computation

Example

Entity Resolution: Identify and link different representations of the same real-world entity. It is nontrivial, challenges are:

Iterative and intensive computation
Large scale and heterogeneous data, dynamic changes (PBs of data with billions of records across hundreds data sources with TBs of update daily)
Multi-domain connections/relations between possible entities
Noises and errors, incomplete/missing data

Previous solutions?

Limitations of the Vertex-Centric Model

Paradigm of Pregel and GAS, representatives of "Think-like-a-vertex" model.

We used to work on an in-house vertex-centric graph system ODPSGraph, to parallelize the entity resolution. However, ever-growing challenges were emerged over the years.

Difficult to program, sometimes requiring a complete redesign algorithm.
Constant struggles over ad-hoc trade-offs. For many algorithms, to parallelize them efficiently in vertex-centric model often means losing precision or quality.
Time-consuming and huge cost. It often took several senior engineers months of effort to bring an algorithm online.
Subpar performance.

Alternative to Vertex-Centric?

We answer YES!

PIE Model and GRAPE System

We presented PIE and GRAPE on SIGMOD'2017, and open sourced it at https://github.com/alibaba/libgrape-lite

Given a query Q and a graph G, to compute Q(G), users only need to provide 3 functions.

PEval: a (sequential) algorithm for Q, for partial evaluation
IncEval: a (sequential) incremental algorithm for Q
Assemble: a function often just taking a union of partial results.

SIGMOD'2017

Best Paper Award

VLDB'2017

Best Demo Award

SIGMOD'2018

Research Highlight

In 2018

GNN Emerges

Example

GNN-based Recommendation

Graph-Learn, a GNN Framework

Presented on VLDB'2019, and open sourced at
https://github.com/alibaba/graph-learn
It has been successfully applied to many scenarios inside and outside Alibaba.

Specialized Graph Pattern Matching, Mining...

Specialized graph applications were also widely adopted. We list a few of our studies...

[VLDB'2018] Real-time Constrained Cycle Detection in Large Dynamic Graphs.
[VLDB'2020] Maximum Biclique Search at Billion Scale.
...

VLDB'2020

Best Paper (Runner-up)

A Problem.. Graph Systems Became Silos

Consolidating our efforts

The Emergence of GraphScope

2018 ~ 2024

Real-world Graph Applications Often Involve Multiple Types of Graph Computations

Example

A simplified workflow for fraud-detection in E-commerce platform:

Construct a property graph from raw data using SQL;
Extract a subgraph using Gremlin;
A label-propagation algorithm for identifying fraudulent entities;
Graph sampling to conduct k-hop sampling by weight;
Train a GNN model using TensorFlow

GraphScope: A Unified Engine for Big Graph Processing

We presented GraphScope on VLDB'2021, and open sourced at
https://github.com/alibaba/graphscope

A simple and unified programming interface;
A distributed dataflow runtime that enables a separate optimization for each graph operation in one carefully designed coherent framework.
An in-memory data store that automatically manages the representation, transformation, and movement of intermediate data.
Embracing Python, which allowing us to seamlessly combine GraphScope with other data processing systems

Why Python?

Interactive computing, WYSIWYG

Jupyter Notebook
Analysis, model, extract, visualize…

Rich ecosystem, end-2-end, 1-stop

Ability to process various data and tasks, e.g.,

Json, text, tensor, SQL, image, video…
scientific/graph computing, ML…

High-level data and operation abstraction

Tensor、Dataframe、Graph、Scalar、Objects …
Operations defined on high-level

Ease of use: NetworkX compatible

pip install graphscope

Compatible graph operations and algorithms API with NetworkX

Example

How to interoperate with others PyData systems?

Vineyard: In-memory Immutable Data Management

We presented Vineyard on SIGMOD'2023, and open sourced it at
https://github.com/v6d-io/v6d, Vineyard is a CNCF Sandbox Project.

Why do we need vineyard?

many big data systems work with (partitioned) immutable in-memory data, however,
it takes huge effort just to implement I/O adaptors, data partition/chunking strategies, fault-tolerance mechanisms, data access RPC, scale in/out.
sharing intermediate results between systems relies on external FS, although I/O cost is usually very high.
no efficient way to dynamically sync immutable data with external mutable data sources.

Revisit the case study with Vineyard

Sharing Data across PyData ecosystem

Vineyard provides:

Out-of-box high-level abstraction (e.g. Edge/Vertex Iterator over graphs instead of k/v lookup) and mid-to low-level APIs for advanced developers to tailor special needs.
zero-copy local data access through shared memory
Transparent (or very similar to accessing local data) remote data access support wide range of data sources/formats, data partition strategies
(fast) sync with mutable data sources
Fault-tolerant
...

Are we done yet?

Next Iteration

GraphScope Flex
The Next Stage of GraphScope's Evolution

2023 ~ Current

Practical challenges of GraphScope

A One-Size-Fits-All Approach is Inadequate

Example

The figure illustrate a simplified view of the graph system in real-life world. It features

Various business scenarios
Diverse graph workloads
Many graphs in different formats

The variety comes from

1. Various Graph Models and Organizations

Even a single dataset can be modeled in different ways, depending on its specific needs.

Cited from SIGMOD'2024 Panel: The Future of Graph Analytics

2. Diverse Graph Computing Workloads

and Programming Interfaces

For graph querying

Cypher
Gremlin
ISO/GQL

For graph analytics

Think-like-a-vertex: Pregel, Gather-Apply-Scatter(GAS) …
Think-like-a-graph: Pregel++, Blogel, PIE…
Other DSL: FLASH, GraphIt…

For graph learning

Mostly in Python
PyG, DGL, …

3. Performance Considerations

High request throughput? vs. low latency with data-parallel execution?
In-memory vs. out-of-core?
Efficient processing vs. resources utilization
High availability?
…

The diverse landscape of graph computing motivated GraphScope

Evolving to GraphScope Flex

Employs a LEGO-like modularity;
Comprises many components, each like a LEGO brick;
A CLI utility flexbuild, to select components and build artifacts for deployments.

Architecture of GraphScope Flex

How to deal with the diversity of graph storage?

Understanding the Complexity of Graph Storage Abstraction

Graph storages can be diverse. The requirements of computing engine accessing the data are different as well.

Diversed feature requests of graph storage.

GRIN: Unified Graph Retrieval Interface

Open sourced at https://github.com/graphscope/GRIN

GRIN is a proposed standard graph retrieval interface in GraphScope. The goal is to simplify the integrations between different computing engines and storage engines from M * N to M + N.

Storages without/with GRIN.

One of the storage backends supports GRIN

GraphAr: An Open Source File Format for Archiving and Exchanging Graph Data

Open sourced as an Apache Incubating Project
https://github.com/apache/GraphAr

GraphAr (short for“Graph Archive”) is a project that aims to make it easier for diverse applications and systems (in-memory and out-of-core storages, databases, graph computing systems, and interactive graph query frameworks) to build and access graph data conveniently and efficiently

Data sharing without/with GraphAr.

For graph querying

Querying Stack in GraphScope Flex

Supports both Gremlin and Cypher in front-ends;
Provides a unified Intermediate Representation(IR) Abstraction, parsing Gremlin/Cypher… to unified IR;
And IR based Optimizer, and two engines,
Gaia, using a dataflow model, for OLAP-like jobs;
Hiactor, using the actor model, for OLTP-like jobs.

Example

Real-life Application: Real-time Fraud Detection

Problem: To identify suspicious transactions in e-commerce by checking each order against known frauds.

Fraud detection and its Cypher solution.

The problem can be tackled by a deployment of GraphScope Flex with these bricks.

Performance

No.1 in LDBC SNB Benchmark

For graph analytics

Analytical Stack in GraphScope Flex

Supports multiple interfaces
- Python, NetworkX compatible APIs
- Java SDK, Pregel(Giraph)/GraphX compatible APIs
- C++ SDK, GRAPE API
100+ Built-in algorithms, out-of-the-box ready to use.
- PageRank, Centralities…
- Reachability and shortest paths…
- Community detection, LPA, Louvain…
At its core, is the GRAPE engine.
- Supports distributed graph computing engine with auto-parallelization
- Integrated Ingress for auto-incrementalization
- Integrated FLASH model with great expressive capability
- Supports GPU acceleration

Example

Real-life Application: Equity Analysis

Problem: To identify the dominant shareholders responsible for steering a company, i.e., holds more than 51% shares.

Person C holds more than 51% of Company 1, via Company 2, (0.8*0.6 = 0.48) and Company 3, (0.8*0.3*0.7 = 0.168).

This problem is tackled by the GraphScope Flex analytical stack, with an analytical algorithm implemented based on label propagation..

Interested in more details?

GraphScope Flex
Technical Preview is Available!

https://github.com/alibaba/graphscope

Future Works

Enhancing the core capabilities
- GQL support
- New storage backends
- GraphAr with Data Lakes
- HTAP processing on graphs
- Applications in real-life scenarios
- ...
Inter-operations with other systems
- Graph-specific ETL to streamline the integration of applications across different graph models derived from the same dataset.
- In scenarios blending graph tasks with SQL-like operations, a unified compiler across multiple engines can significantly enhance work- flow interoperability and expand the scope of graph computations.
- ...

Our ultimate goal

Make Graph Computing Easy and Available for Everyone!

Welcome join forces with us!

Github

Blog

Papers

Playground

References

GraphScope Flex: LEGO-like Graph Computing Stack, SIGMOD2024, Tao He, Shuxian Hu, Longbin Lai, Dongze Li, Neng Li, Xue Li, Lexiao Liu, Xiaojian Luo, Binqing Lyu, Ke Meng, Sijie Shen, Li Su, Lei Wang, Jingbo Xu, Wenyuan Yu, Weibin Zeng, Lei Zhang, Siyuan Zhang, Jingren Zhou, Xiaoli Zhou, Diwen Zhu.
Dynamic Graph Sampling Service for Real-time GNN Inference at Scale, EuroSys 2023. Jie Sun, Li Su, Wenting Shen, Zichao Zhang, Zuocheng Shi, Jingbo Xu, Yong Li, Wenyuan Yu, Zeke Wang, Fei Wu, Jingren Zhou.
Bridging the Gap between Relational OLTP and Graph-based OLAP, USENIX ATC 2023. Sijie Shen, Zihang Yao, Lin Shi, Lei Wang, Longbin Lai, Qian Tao, Li Su, Rong Chen, Wenyuan Yu, Haibo Chen, Binyu Zang, Jingren Zhou.
GLogS: Interactive Graph Pattern Matching Query At Large Scale, USENIX ATC 2023. Longbin Lai, Yufan Yang, Zhibin Wang, Yuxuan Liu, Haotian Ma, Sijie Shen, Bingqing Lyu, Xiaoli Zhou, Wenyuan Yu, Zhengping Qian, Chen Tian, Sheng Zhong, Yeh-Ching Chung, Jingren Zhou.
Legion: Automatically Pushing the Envelope of Multi-GPU System for Billion-Scale GNN Training, USENIX ATC 2023. Jie Sun, Li Su, Zuocheng Shi, Wenting Shen, Zeke Wang, Lei Wang, Jie Zhang, Wenyuan Yu, Yong Li, Jingren Zhou, Fei Wu.
Vineyard: Optimizing Data Sharing in Data-Intensive Analytics, SIGMOD 2023. Wenyuan Yu, Tao He, Lei Wang, Ke Meng, Ye Cao, Diwen Zhu, Sanhong Li, Jingren Zhou.
Efficient Multi-GPU Graph Processing with Remote Work Stealing, ICDE 2023. Ke Meng, Liang Geng, Xue Li, Qian Tao, Wenyuan Yu, Jingren Zhou.
FLASH: A Framework for Programming Distributed Graph Processing Algorithms, ICDE 2023. Xue Li, Ke Meng, Lu Qin, Longbin Lai, Wenyuan Yu, Zhengping Qian, Xuemin Lin, Jingren Zhou.
GNNLab: A Factored System for Sample-based GNN Training over GPUs, EuroSys 2022. Jianbang Yang, Dahai Tang, Xiaoniu Song, Lei Wang, Qiang Yin, Rong Chen, Wenyuan Yu, and Jingren Zhou.
GraphScope: A Unified Engine For Big Graph Processing, VLDB 2021. Wenfei Fan, Tao He, Longbin Lai, Xue Li, Yong Li, Zhao Li, Zhengping Qian, Chao Tian, Lei Wang, Jingbo Xu, Youyang Yao, Qiang Yin, Wenyuan Yu, Jingren Zhou, Diwen Zhu, and Rong Zhu.
GraphScope: A One-Stop Large Graph Processing System. VLDB, demo, 2021. Jingbo Xu, Zhanning Bai, Wenfei Fan, Longbin Lai, Xue Li, Zhao Li, Zhengping Qian, Lei Wang, Yanyan Wang, Wenyuan Yu, and Jingren Zhou.
Automating Incremental Graph Processing with Flexible Memoization. VLDB 2021. Shufeng Gong, Chao Tian, Qiang Yin, Wenyuan Yu, Yanfeng Zhang, Liang Geng, Song Yu, Ge Yu, and Jingren Zhou.
GAIA: A System for Interactive Analysis on Distributed Graphs Using a High-Level Language, NSDI 2021. Zhengping Qian, Chenqiang Min, Longbin Lai, Yong Fang, Gaofeng Li, Youyang Yao, Bingqing Lyu, Xiaoli Zhou, Zhimin Chen, and Jingren Zhou.
FlexGraph: A Flexible and Efficient Distributed Framework for GNN Training. EuroSys 2021. Lei Wang, Qiang Yin, Chao Tian, Jianbang Yang, Rong Chen, Wenyuan Yu, Zihang Yao, and Jingren Zhou.
Maximum Biclique Search at Billion Scale, VLDB 2020. Best Paper Runner-up Bingqing Lyu, Lu Qin, Xuemin Lin, Ying Zhang, Zhengping Qian, and Jingren Zhou.
Adaptive Asynchronous Parallelization of Graph Algorithms. TODS. 45(2): 6:1-6:45, 2020. Wenfei Fan, Ping Lu, Wenyuan Yu, Jingbo Xu, Qiang Yin, Xiaojian Luo, Jingren Zhou, and Ruochun Jin.
Parallelizing Sequential Graph Computations TODS 43(4): 18:1-18:39, 2018. Wenfei Fan, Wenyuan Yu, Jingbo Xu, Jingren Zhou, Xiaojian Luo, Qiang Yin, Ping Lu, Yang Cao and Ruiqi Xu
Real-time Constrained Cycle Detection in Large Dynamic Graphs. VLDB, 2018. Xiafei Qiu, Wubin Cen, Zhengping Qian, You Peng, Ying Zhang, Xuemin Lin, and Jingren Zhou.
GRAPE: Parallelizing Sequential Graph Computations. VLDB, demo, 2017. Wenfei Fan, Jingbo Xu , Yinghui Wu, Wenyuan Yu, Jiaxin Jiang

Copyrights

Some images are designed by Freepik (https://freepik.com).
Some fonts and icons from FontAwesome (https://fontawesome.com).

GraphScope's Journey on Graph Processing

Background

Exponential Data Growth at Alibaba

Early Adoptions

Initial graph problems were addressed in isolation.

2013 ~ 2020

Getting Insight of Graph Data

Graph Traversal for Exploration

De-facto

Gremlin-based Graph Traversal

Example

MaxGraph and GAIA, Systems Scaling Gremlin Queries

Example

Gremlin Query Compilation

Graph Analytics, Whole Graph Computation

Example

Previous solutions?

Limitations of the Vertex-Centric Model

Alternative to Vertex-Centric?

We answer YES!

PIE Model and GRAPE System

In 2018

GNN Emerges

Example

Graph-Learn, a GNN Framework

Specialized Graph Pattern Matching, Mining...

A Problem.. Graph Systems Became Silos

Consolidating our efforts

The Emergence of GraphScope

2018 ~ 2024

Real-world Graph Applications Often Involve Multiple Types of Graph Computations

Example

GraphScope: A Unified Engine for Big Graph Processing

Why Python?

Interactive computing, WYSIWYG

Rich ecosystem, end-2-end, 1-stop

High-level data and operation abstraction

Ease of use: NetworkX compatible

Example

How to interoperate with others PyData systems?

Vineyard: In-memory Immutable Data Management

Revisit the case study with Vineyard

Sharing Data across PyData ecosystem

Are we done yet?

Next Iteration

GraphScope Flex The Next Stage of GraphScope's Evolution

2023 ~ Current

Practical challenges of GraphScope

A One-Size-Fits-All Approach is Inadequate

Example

The variety comes from

1. Various Graph Models and Organizations

2. Diverse Graph Computing Workloads

and Programming Interfaces

3. Performance Considerations

The diverse landscape of graph computing motivated GraphScope

Evolving to GraphScope Flex

How to deal with the diversity of graph storage?

Understanding the Complexity of Graph Storage Abstraction

GRIN: Unified Graph Retrieval Interface

One of the storage backends supports GRIN

GraphAr: An Open Source File Format for Archiving and Exchanging Graph Data

For graph querying

Querying Stack in GraphScope Flex

Example

Real-life Application: Real-time Fraud Detection

Performance

No.1 in LDBC SNB Benchmark

For graph analytics

Analytical Stack in GraphScope Flex

Example

Real-life Application: Equity Analysis

Interested in more details?

GraphScope FlexTechnical Preview is Available!

Future Works

Our ultimate goal

Make Graph Computing Easy and Available for Everyone!

Read More and Useful Links

Github

Blog

GraphScope's Journey on
Graph Processing

MaxGraph and GAIA,
Systems Scaling Gremlin Queries

`Gremlin` Query Compilation

Graph Analytics,
Whole Graph Computation

GraphScope Flex
The Next Stage of GraphScope's Evolution

GraphScope Flex
Technical Preview is Available!