How to Run and Develop GraphScope Locally
In this post, we will detail two ways to install GraphScope locally: 1) directly install the published binary package through pip; 2) compile and build the latest version of GraphScope from source code.
Install GraphScope via pip
The Python release package of GraphScope contains the necessary components and dependencies for the client and server runtime. Before installing GraphScope, make sure that the current environment meets the following prerequisites:
- Linux system: Ubuntu 20.04 or higher, CentOS 7 or higher
- Pre-installed GCC 7.1+
- macOS 11+ (Intel) or macOS 12+ (Apple silicon)
- Python >= 3.8 and pip >= 19.3
- Windows users may want to install Ubuntu on WSL2 to use this package
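The prerequisites above can be checked before running pip. The following is a minimal sketch using only the Python standard library; the helper names (`parse_version`, `meets_minimum`, `check_prerequisites`) are made up for illustration and are not part of GraphScope. It assumes `gcc` is on the PATH on Linux and only covers the version checks, not the OS release.

```python
import platform
import shutil
import subprocess
import sys


def parse_version(text):
    """Turn '19.3.1' into (19, 3, 1); stop at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(found, minimum):
    """Compare two dotted version strings numerically."""
    return parse_version(found) >= parse_version(minimum)


def command_output(args):
    """Return the stripped stdout of a command, or None if it cannot be run."""
    if shutil.which(args[0]) is None:
        return None
    result = subprocess.run(
        args, stdout=subprocess.PIPE, stderr=subprocess.PIPE, universal_newlines=True
    )
    return result.stdout.strip() if result.returncode == 0 else None


def check_prerequisites():
    """Map each version requirement from the list above to True or False."""
    report = {"python >= 3.8": meets_minimum(platform.python_version(), "3.8")}
    # `pip --version` prints e.g. "pip 23.0.1 from ... (python 3.10)".
    pip_out = command_output([sys.executable, "-m", "pip", "--version"])
    pip_version = pip_out.split()[1] if pip_out else "0"
    report["pip >= 19.3"] = meets_minimum(pip_version, "19.3")
    if platform.system() == "Linux":
        gcc = command_output(["gcc", "-dumpfullversion", "-dumpversion"])
        report["gcc >= 7.1"] = meets_minimum(gcc or "0", "7.1")
    return report


if __name__ == "__main__":
    for requirement, ok in check_prerequisites().items():
        print(("OK      " if ok else "MISSING ") + requirement)
```

Any line marked MISSING points at a requirement to fix before installing.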
Install the latest version of GraphScope via pip:
$ pip3 install graphscope --upgrade
You can check if GraphScope has been installed correctly with the following Python command:
>>> import graphscope
>>> graphscope.__version__
'0.21.0'
If the version number of GraphScope is displayed correctly as shown above, then you can start using GraphScope. Playground and Tutorials provide some getting started examples, which will be helpful for your GraphScope journey.
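The same check can be scripted without importing the package, which is handy in setup scripts. This is a minimal sketch that relies on the standard `importlib.metadata` module (available from Python 3.8, matching the prerequisite above). It assumes the distribution name is `graphscope`, as used in the pip command; `installed_version` is a hypothetical helper name.

```python
from importlib import metadata


def installed_version(package="graphscope"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("graphscope is not installed in this environment")
    else:
        print("graphscope " + version)
```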
Build GraphScope from Source Code
GraphScope comprises three engines that target different business scenarios, a coordinator that brings them together, and a client that users connect through to perform various tasks.
In this section, taking the analytical engine (GAE) as an example, we introduce how to build and test a single engine, and then demonstrate an end-to-end (e2e) test in which multiple engines work together.
Dev Environment
Compiling GraphScope requires third-party tools and dependencies such as Vineyard and libgrape-lite, as well as toolchains for several languages, such as g++, Maven, and rustc. Some dependencies can be installed directly with a package manager, while others must be built from source. To make life easier, we provide a Docker image based on CentOS 7 with all required dependencies installed.
sudo docker pull registry.cn-hongkong.aliyuncs.com/graphscope/graphscope-dev:latest
Please refer to Dev Environment for more ways to set up a dev environment.
Build Analytical Engine
Developers just need to clone the latest code from the repository, make their changes, and build GraphScope with the gs command-line utility:
# set docker container shared memory: 10G
sudo docker run --shm-size 10240m -it registry.cn-hongkong.aliyuncs.com/graphscope/graphscope-dev:latest /bin/bash
git clone https://github.com/alibaba/GraphScope.git
# building
cd GraphScope && ./gs make analytical
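The `--shm-size 10240m` flag above sizes the container's shared memory. To confirm from inside the container that the setting took effect, a minimal sketch is shown below. It assumes shared memory is mounted at `/dev/shm` (the usual location in a Docker container); the helper names and the 10240 MiB threshold are illustrative, taken from the flag above rather than from any GraphScope requirement.

```python
import shutil


def shm_total_mib(path="/dev/shm"):
    """Total size of the filesystem mounted at `path`, in MiB."""
    return shutil.disk_usage(path).total // (1024 * 1024)


def has_enough_shm(required_mib=10240, path="/dev/shm"):
    """True if the mount at `path` is at least `required_mib` MiB."""
    return shm_total_mib(path) >= required_mib


if __name__ == "__main__":
    print("shared memory: %d MiB" % shm_total_mib())
    if not has_enough_shm():
        print("smaller than expected; re-run the container with --shm-size")
```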
You can find the built artifact at analytical_engine/build/grape_engine. Alongside grape_engine are the shared libraries, plus a number of test binaries if you chose to build the tests.
You can then install it to a specified location with:
./gs make analytical-install --install-prefix /opt/graphscope
Test Analytical Engine
You can test the new artifacts with a single command. Here we first set the working directory, GRAPHSCOPE_HOME, to the local repo:
export GRAPHSCOPE_HOME=`pwd`
# Here `pwd` is the root path of the GraphScope repository
For more about GRAPHSCOPE_HOME, see "Run tests". Then run the tests:
./gs test analytical
This downloads the test dataset to /tmp/gstest (if it does not already exist), runs multiple algorithms against various graphs, and compares the results with the ground truth.
To build and test the other two engines, you can follow the GIE Doc and GLE Doc.
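If a test run fails early, it can help to confirm the two pieces of state described above: the GRAPHSCOPE_HOME variable and the dataset directory. The following is a small sketch under stated assumptions: `describe_test_env` is a hypothetical helper, not part of the gs utility, and it only checks that the paths exist, not that their contents are complete.

```python
import os
from pathlib import Path


def describe_test_env(environ=None, dataset_dir="/tmp/gstest"):
    """Summarize the state the analytical tests rely on."""
    environ = os.environ if environ is None else environ
    home = environ.get("GRAPHSCOPE_HOME")
    return {
        "GRAPHSCOPE_HOME": home,
        # False when the variable is unset, empty, or not a directory.
        "home_is_dir": bool(home) and Path(home).is_dir(),
        # False before the first run, since the dataset is downloaded on demand.
        "dataset_present": Path(dataset_dir).is_dir(),
    }


if __name__ == "__main__":
    for key, value in describe_test_env().items():
        print("%s: %s" % (key, value))
```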
E2E Test
Build All Targets for GraphScope
With the gs command-line utility, you can build all targets for GraphScope with a single command:
./gs make install
Run tests
Run a set of test cases that involve all three engines:
export GRAPHSCOPE_HOME=/opt/graphscope
./gs test e2e --local
Conclusion
In this post, we introduced two ways to install GraphScope locally. To process large-scale data more effectively, you can also deploy and try GraphScope on Kubernetes clusters, with Vineyard as the distributed memory manager. For more details, please refer to the official documentation. We will also cover this in subsequent articles.