Nearly 10 years ago, business was expanding exponentially and data was growing rapidly along with it. Much of that data could naturally be modeled as graphs, and the need for graph computing emerged in many areas, such as
Graph traversal - a process of visiting the nodes in a graph - is a key primitive in many online and interactive graph applications, e.g.,
Gremlin is the de facto standard language for high-level, declarative programming of various graph operations.
Cycle detection: find a cycle in a graph, i.e., a path that loops back to its originating vertex.
g.V().has('name','tom').as('a').repeat(out().simplePath()).times(LENGTH).where(out().as('a')).path()
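To make the semantics of this query concrete, here is a minimal sketch of the same search in plain Python. It assumes the graph is a small in-memory adjacency list; the function name `find_cycles` and the sample data are hypothetical and are not part of Gremlin or GraphScope.

```python
from typing import Dict, List

Graph = Dict[str, List[str]]  # adjacency list: vertex -> out-neighbors


def find_cycles(graph: Graph, start: str, length: int) -> List[List[str]]:
    """Return simple paths of `length` hops from `start` whose last vertex
    has an out-edge back to `start` (each one closes a cycle)."""
    results: List[List[str]] = []

    def extend(path: List[str]) -> None:
        if len(path) - 1 == length:
            # Mirrors where(out().as('a')): is there an edge back to the start?
            if start in graph.get(path[-1], []):
                results.append(list(path))
            return
        for nxt in graph.get(path[-1], []):
            if nxt not in path:  # mirrors simplePath(): never revisit a vertex
                path.append(nxt)
                extend(path)
                path.pop()

    extend([start])
    return results


# Hypothetical sample graph: tom -> ann -> bob -> tom, plus bob -> ann.
example = {"tom": ["ann"], "ann": ["bob"], "bob": ["tom", "ann"]}
cycles = find_cycles(example, "tom", 2)
```

The recursion depth plays the role of `times(LENGTH)`, so the search cost grows quickly with the path length on dense graphs.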
Gremlin
Query Compilation
As shown in the cycle-detection example, a Gremlin query can be an arbitrary composition of iterative and nested operations.
g.V().has('firstname','Tom').as('a').repeat(out().simplePath()).times(k).where(out().as('a')).path()
Entity Resolution: identify and link different representations of the same real-world entity. It is nontrivial; the challenges are:
We used to rely on ODPSGraph, an in-house vertex-centric graph system, to parallelize entity resolution. However, more and more challenges emerged over the years.
We presented PIE and GRAPE at SIGMOD'2017 and open sourced the system at https://github.com/alibaba/libgrape-lite
Given a query Q and a graph G, to compute Q(G), users only need to provide 3 functions.
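In the PIE model as published, the three functions are PEval (partial evaluation), IncEval (incremental evaluation), and Assemble. The sketch below illustrates the idea for single-source shortest paths on a graph split into fragments. It is a single-process simulation under simplifying assumptions (non-negative weights, every improved distance is broadcast rather than only border-vertex updates); the Python names and the driver `run_sssp` are hypothetical and are not the libgrape-lite API.

```python
import heapq
from typing import Dict, List, Tuple

Edge = Tuple[str, str, float]   # (source vertex, target vertex, weight)
Dist = Dict[str, float]         # vertex -> best distance known so far
INF = float("inf")


def _relax(edges: List[Edge], dist: Dist, seeds: List[str]) -> None:
    """Dijkstra-style relaxation over one fragment's edges, starting from seeds."""
    adj: Dict[str, List[Tuple[str, float]]] = {}
    for u, v, w in edges:
        adj.setdefault(u, []).append((v, w))
    heap = [(dist[s], s) for s in seeds]
    heapq.heapify(heap)
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, INF):
            continue  # stale heap entry
        for v, w in adj.get(u, []):
            if d + w < dist.get(v, INF):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))


def peval(edges: List[Edge], source: str) -> Dist:
    """PEval: sequential algorithm run on one fragment in isolation."""
    dist = {source: 0.0}
    _relax(edges, dist, [source])
    return dist


def inc_eval(edges: List[Edge], dist: Dist, messages: Dist) -> None:
    """IncEval: apply improved distances from other fragments, then re-relax."""
    seeds = []
    for v, d in messages.items():
        if d < dist.get(v, INF):
            dist[v] = d
            seeds.append(v)
    _relax(edges, dist, seeds)


def assemble(partials: List[Dist]) -> Dist:
    """Assemble: merge partial results, keeping the minimum per vertex."""
    result: Dist = {}
    for dist in partials:
        for v, d in dist.items():
            result[v] = min(d, result.get(v, INF))
    return result


def run_sssp(fragments: List[List[Edge]], source: str) -> Dist:
    """Hypothetical driver: PEval once, then IncEval rounds until a fixpoint."""
    partials = [peval(edges, source) for edges in fragments]
    while True:
        best = assemble(partials)
        updated = False
        for edges, dist in zip(fragments, partials):
            messages = {v: d for v, d in best.items() if d < dist.get(v, INF)}
            if messages:
                inc_eval(edges, dist, messages)
                updated = True
        if not updated:
            return best


# Two fragments of one graph; the shortest a -> c route crosses both.
fragments = [
    [("a", "b", 1.0), ("b", "c", 5.0)],
    [("b", "d", 1.0), ("d", "c", 1.0)],
]
distances = run_sssp(fragments, "a")
```

The point of the model is that `peval` and `inc_eval` are ordinary sequential algorithms; the parallelization lives entirely in the driver loop.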
SIGMOD'2017
Best Paper Award
VLDB'2017
Best Demo Award
SIGMOD'2018
Research Highlight
GNN-based Recommendation
Presented at VLDB'2019 and open sourced at
https://github.com/alibaba/graph-learn
It has been successfully applied to many scenarios inside and outside Alibaba.
Specialized graph applications were also widely adopted. We list a few of our studies...
VLDB'2020
Best Paper (Runner-up)
A simplified workflow for fraud detection on an e-commerce platform:
We presented GraphScope at VLDB'2021 and open sourced it at
https://github.com/alibaba/graphscope
pip install graphscope
Graph operation and algorithm APIs compatible with NetworkX
We presented Vineyard at SIGMOD'2023 and open sourced it at
https://github.com/v6d-io/v6d. Vineyard is a CNCF Sandbox Project.
Why do we need Vineyard?
Vineyard provides:
The figure illustrates a simplified view of a real-world graph system. It features
Even a single dataset can be modeled in different ways, depending on the specific needs of the workload.
For graph querying
For graph analytics
For graph learning
Graph storage can be diverse, and computing engines differ in how they need to access the data.
Open sourced at https://github.com/graphscope/GRIN
GRIN is a proposed standard graph retrieval interface in GraphScope. The goal is to reduce the integrations needed between M computing engines and N storage engines from M * N to M + N.
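The M + N idea can be sketched with a shared interface: each storage implements it once, and each engine is written once against it. The sketch below is only an illustration of that pattern; the class and function names (`GraphReader`, `AdjacencyListStore`, `EdgeListStore`, `out_degrees`, `reachable`) are hypothetical and do not reflect GRIN's actual API.

```python
from abc import ABC, abstractmethod
from collections import deque
from typing import Dict, Iterable, List, Set, Tuple


class GraphReader(ABC):
    """Hypothetical common retrieval interface shared by all storages."""

    @abstractmethod
    def vertices(self) -> Iterable[str]: ...

    @abstractmethod
    def out_neighbors(self, v: str) -> Iterable[str]: ...


class AdjacencyListStore(GraphReader):
    """Storage 1: vertex -> list of out-neighbors."""

    def __init__(self, adj: Dict[str, List[str]]) -> None:
        self._adj = adj

    def vertices(self) -> Iterable[str]:
        return list(self._adj)

    def out_neighbors(self, v: str) -> Iterable[str]:
        return self._adj.get(v, [])


class EdgeListStore(GraphReader):
    """Storage 2: flat list of (source, target) pairs."""

    def __init__(self, edges: List[Tuple[str, str]]) -> None:
        self._edges = edges

    def vertices(self) -> Iterable[str]:
        seen: List[str] = []
        for u, v in self._edges:
            for x in (u, v):
                if x not in seen:
                    seen.append(x)
        return seen

    def out_neighbors(self, v: str) -> Iterable[str]:
        return [dst for src, dst in self._edges if src == v]


def out_degrees(g: GraphReader) -> Dict[str, int]:
    """Engine 1: a tiny analytics task, written once against the interface."""
    return {v: len(list(g.out_neighbors(v))) for v in g.vertices()}


def reachable(g: GraphReader, start: str) -> Set[str]:
    """Engine 2: a tiny traversal task (BFS), also storage-agnostic."""
    seen, queue = {start}, deque([start])
    while queue:
        for n in g.out_neighbors(queue.popleft()):
            if n not in seen:
                seen.add(n)
                queue.append(n)
    return seen


adj_store = AdjacencyListStore({"a": ["b", "c"], "b": ["c"], "c": []})
edge_store = EdgeListStore([("a", "b"), ("a", "c"), ("b", "c")])
```

Adding a third storage here means writing one more `GraphReader` subclass; neither engine changes.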
Open sourced as an Apache Incubating Project
https://github.com/apache/GraphAr
GraphAr (short for “Graph Archive”) is a project that aims to make it easier for diverse applications and systems (in-memory and out-of-core storages, databases, graph computing systems, and interactive graph query frameworks) to build and access graph data conveniently and efficiently.
Problem: To identify suspicious transactions in e-commerce by checking each order against known frauds.
The problem can be tackled by a deployment of GraphScope Flex with these bricks.
Problem: To identify the dominant shareholders responsible for steering a company, i.e., those who hold more than 51% of its shares.
This problem is tackled by the GraphScope Flex analytical stack, with an analytical algorithm implemented based on label propagation.
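As a rough illustration of why propagation is needed, the sketch below assumes that an owner controls a company when its direct shares plus the shares held by companies it already controls exceed the threshold. This definition, the function `controlled_by`, and the sample holdings are assumptions for illustration; this is a simple fixpoint loop, not the GraphScope Flex algorithm.

```python
from typing import Dict, Set, Tuple

# (holder, company) -> fraction of the company's shares held
Holdings = Dict[Tuple[str, str], float]


def controlled_by(holdings: Holdings, owner: str, threshold: float = 0.51) -> Set[str]:
    """Companies that `owner` controls directly or through controlled companies."""
    controlled: Set[str] = set()
    companies = {company for _, company in holdings}
    changed = True
    while changed:  # repeat until no new company is added (fixpoint)
        changed = False
        for company in companies - controlled:
            if company == owner:
                continue
            # Shares held by the owner itself or by companies it already controls.
            total = sum(
                fraction
                for (holder, target), fraction in holdings.items()
                if target == company and (holder == owner or holder in controlled)
            )
            if total > threshold:
                controlled.add(company)
                changed = True
    return controlled


# Hypothetical data: p controls A directly, B via p + A combined, and C via B.
holdings = {
    ("p", "A"): 0.60,
    ("A", "B"): 0.30,
    ("p", "B"): 0.25,
    ("q", "B"): 0.45,
    ("B", "C"): 0.52,
}
p_controls = controlled_by(holdings, "p")
```

Control of B only becomes visible after A is known to be controlled, which is why a single pass over direct holdings is not enough.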
You are welcome to join forces with us!