Frequently Asked Questions
If you don’t find an answer to your question here, feel free to file an issue or post it to Discussions.
What are the minimum resources and system requirements to run GraphScope?
To use the GraphScope Python interface, Python >= 3.7 and pip >= 19.0 are required. The GraphScope engine can be deployed in standalone mode or distributed mode. For standalone deployment, the minimum requirement is a 4-core CPU and 8 GB of memory.
GraphScope is tested and supported on the following systems:
For distributed deployment, a cluster managed by Kubernetes is required. GraphScope has been tested on Kubernetes v1.12.0 and later.
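The standalone requirements above can be checked with a few lines of standard-library Python. This is only a sketch: the function names are hypothetical, the pip version is not checked, and the memory lookup is best-effort (it returns None on platforms without os.sysconf, and a machine sold as "8 GB" may report slightly less than 8 GiB).

```python
import os
import sys

# Thresholds taken from the requirements stated above.
MIN_PYTHON = (3, 7)
MIN_CPU_CORES = 4
MIN_MEMORY_GIB = 8


def meets_standalone_minimum(python_version, cpu_cores, memory_gib):
    """Return a list of unmet requirements (empty if everything passes)."""
    problems = []
    if tuple(python_version[:2]) < MIN_PYTHON:
        problems.append("Python >= 3.7 is required")
    if cpu_cores is None or cpu_cores < MIN_CPU_CORES:
        problems.append("at least 4 CPU cores are required")
    # Unknown memory (None) is not reported as a problem.
    if memory_gib is not None and memory_gib < MIN_MEMORY_GIB:
        problems.append("at least 8 GiB of memory is required")
    return problems


def detect_memory_gib():
    """Best-effort physical memory lookup; returns None where unsupported."""
    try:
        pages = os.sysconf("SC_PHYS_PAGES")
        page_size = os.sysconf("SC_PAGE_SIZE")
    except (AttributeError, ValueError, OSError):
        return None
    return pages * page_size / 2**30


if __name__ == "__main__":
    issues = meets_standalone_minimum(
        sys.version_info, os.cpu_count(), detect_memory_gib()
    )
    print("OK" if not issues else "\n".join(issues))
```

Passing the measured values in as arguments keeps the check itself independent of the platform-specific detection.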
Is Kubernetes essential to use GraphScope?
No. GraphScope can run in standalone mode on a single machine. The pre-compiled GraphScope package is distributed as a Python package and can be installed with pip: pip3 install graphscope.
How to debug or get detailed information when using GraphScope?
By default, GraphScope runs in silent mode, following the convention of Python applications. Verbose logging can be enabled through the show_log option of graphscope.set_option.
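A minimal sketch of enabling verbose logging, assuming the graphscope.set_option call with the show_log flag provided by the Python package. The wrapper name enable_verbose_logging is hypothetical and only keeps the import lazy:

```python
def enable_verbose_logging():
    """Turn on GraphScope's verbose log output for the current process."""
    # Imported lazily so this sketch can be loaded without graphscope installed.
    import graphscope

    # Assumption: show_log is the option that switches on verbose output.
    graphscope.set_option(show_log=True)
```

Call enable_verbose_logging() before creating a session so that the launch steps are logged as well.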
If you are running GraphScope in k8s, you can use kubectl describe or kubectl logs to check the status and logs of the cluster. If the disk storage is accessible (locally or via Pods), you may also find logs in /tmp/gs/runtime/logs.
Why do I find more Pods than expected with the command kubectl get pod?
You may need to delete failed Pods manually with kubectl delete pod <pod-names>. This case has been observed when using GraphScope with helm: if the role and rolebinding are not set correctly, the command helm uninstall <release-name> may not correctly recycle the allocated resources. For more details, please refer to Helm Support.
Is GraphScope a graph database?
GraphScope is not a graph database. However, it includes a persistent storage component called graphscope-store that can serve as a database.
What’s the compatibility of Gremlin in GraphScope?
GraphScope supports most querying operators in Gremlin. You may check the compatibility in this link.
The system seems to get stuck. What are the possible reasons?
If GraphScope seems to get stuck, possible causes include:
In the session launching stage, it is most often waiting for Pods to become ready. The delay may be caused by a poor network connection while pulling the image, or by a failure to acquire the resources requested to launch a session.
In the graph loading stage, it is time consuming to load and build a large graph.
When running a user-defined or built-in analytical algorithm, it may take time to compile the algorithm over the loaded graph.
Why do I get a No such file or directory error when loading a graph?
This mostly occurs when GraphScope is deployed in a Kubernetes cluster, where the file must be visible to the engine Pod of GraphScope. You may need to mount a volume to the Pods or use a cloud storage provider.
Specifically, if your cluster is deployed with kind, you may need to set up extra-mounts to mount your local directory to the kind nodes.
What’s the relationship between k8s_vineyard_mem, vineyard_shared_mem, and k8s_engine_mem?
k8s_vineyard_mem: The memory allocated for the vineyard container. It stores the metadata of blobs managed by vineyard, such as the shape, id, name, and so forth. Because the metadata is much smaller than the datasets, the default configuration is sufficient in most cases. It is equivalent to vineyard.resources.memory.limits in the graphscope helm charts.
vineyard_shared_mem: The memory into which the data is loaded. Its value needs to be adjusted according to the size of the datasets. We found that 5 times the size of the datasets on disk is usually a reasonable value. It is equivalent to vineyard.shared_mem in the graphscope helm charts.
k8s_engine_mem: The memory of the engine Pods. It can simply be set equal to the value of vineyard_shared_mem. It is equivalent to engines.resources.memory.limits in the graphscope helm charts.
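The sizing rule above can be written down as a small helper. This is a sketch under assumptions: suggest_memory_settings is a hypothetical name, values are rounded up to whole GiB, and the "<n>Gi" string format is assumed rather than taken from the text. k8s_vineyard_mem is left out because the default is described as sufficient in most cases.

```python
import math

GIB = 2**30


def suggest_memory_settings(dataset_bytes, factor=5):
    """Suggest memory values from the on-disk dataset size.

    factor=5 is the rule of thumb quoted above; the "<n>Gi" string
    format is an assumption about how the values are written.
    """
    if dataset_bytes <= 0:
        raise ValueError("dataset_bytes must be positive")
    shared_gib = max(1, math.ceil(dataset_bytes * factor / GIB))
    value = f"{shared_gib}Gi"
    return {
        "vineyard_shared_mem": value,  # where the graph data is loaded
        "k8s_engine_mem": value,       # can simply match vineyard_shared_mem
    }


# A 2 GiB dataset on disk suggests 10Gi for both settings.
print(suggest_memory_settings(2 * GIB))
```

Treat the result as a starting point and adjust it after observing actual memory usage.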
Failed to install GraphScope on Apple M1 with Python 3.8?
grpcio failed: You can try to use the system openssl by running export GRPC_PYTHON_BUILD_SYSTEM_OPENSSL=True. See the grpc issue for more details.
scipy failed: You can follow this to build scipy from source, or try pip3 install --pre -i https://pypi.anaconda.org/scipy-wheels-nightly/simple scipy to work around this problem.
If you encounter errors like ERROR: Dependency “OpenBLAS” not found, tried pkgconfig, framework and cmake while installing scipy on macOS, try:
export CMAKE_PREFIX_PATH=$CMAKE_PREFIX_PATH:$(brew --prefix openblas)
and then run pip3 install scipy again.
How to resolve the Permission denied error when allocating PV on NFS volumes?
ENV: graphscope-store is installed with helm, and NFS supplies the PV.
Check: First run kubectl logs graphscope-store-zookeeper-0 and check whether the log shows mkdir: cannot create directory '/bitnami/zookeeper/data': Permission denied.
Reason: Normally, the NFS directories we create are owned by root with permission 755 (depending on your specific environment), but the default user of graphscope-store is graphscope (1001), so these Pods have no permission to write to NFS.
Solution: There are two ways to solve this.
The brutal one is to run chmod 777 on all related PV directories. This is quick, but not recommended in a production environment.
The elegant one is to create the graphscope user and user group first, and then grant graphscope access permission to the related NFS directories.
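To see why a directory owned by root with mode 755 rejects writes from user 1001, and why either solution fixes it, the classic Unix owner/group/other check can be sketched as below. The function name can_write is hypothetical, the group id 1001 is an assumption (the text only gives the user id), and the check ignores root's override, supplementary groups, ACLs, and NFS export options.

```python
import stat


def can_write(mode, owner_uid, owner_gid, uid, gid):
    """Apply the owner/group/other permission check for write access."""
    if uid == owner_uid:
        return bool(mode & stat.S_IWUSR)
    if gid == owner_gid:
        return bool(mode & stat.S_IWGRP)
    return bool(mode & stat.S_IWOTH)


# Directory owned by root with mode 755, accessed by graphscope (1001).
print(can_write(0o755, 0, 0, 1001, 1001))        # False: only the "other" bits apply
# After chmod 777 (the brutal solution).
print(can_write(0o777, 0, 0, 1001, 1001))        # True
# After giving ownership to graphscope (the elegant solution).
print(can_write(0o755, 1001, 1001, 1001, 1001))  # True
```

The elegant solution keeps the directory closed to every other user, which is why it is the better fit for production.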
Timeout Exception raised when launching a GraphScope instance on a Kubernetes cluster?
Pulling the image can take a few minutes the first time a GraphScope instance is launched, so the Timeout Exception may be caused by a poor network connection. You can increase the value of the timeout_seconds parameter of the session to suit your environment.
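A minimal sketch of raising the timeout, assuming timeout_seconds is passed as a keyword argument to graphscope.session. The wrapper name launch_session_with_timeout and the value 1200 are illustrative only:

```python
def launch_session_with_timeout(timeout_seconds=1200, **session_kwargs):
    """Create a GraphScope session that waits longer for Pods to become ready."""
    # Imported lazily so this sketch can be loaded without graphscope installed.
    import graphscope

    # 1200 seconds is an arbitrary example value; choose one that covers
    # the image pull time on your network.
    return graphscope.session(timeout_seconds=timeout_seconds, **session_kwargs)
```

Once the image is cached on the nodes, later launches should not need the larger value.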
Failed to run GraphScope (either on a single machine or in a Docker container) due to a failed connection to building blocks like etcd?
This may happen when your machine is in an enterprise network that requires proxy configuration to access the network properly, which can lead to wrong address resolution and port occupancy. You can try adding addresses like 0.0.0.0 to your NO_PROXY environment variable (be aware of the prefix/suffix policy of no_proxy).
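The variable can also be extended from Python before GraphScope is started, so that child processes inherit it. This is a sketch: extend_no_proxy is a hypothetical helper, and setting both the upper-case and lower-case spellings is a precaution rather than something the text above requires.

```python
import os


def extend_no_proxy(addresses, env=None):
    """Append addresses to NO_PROXY (and no_proxy) without duplicating entries."""
    env = os.environ if env is None else env
    current = env.get("NO_PROXY") or env.get("no_proxy") or ""
    entries = [e.strip() for e in current.split(",") if e.strip()]
    for addr in addresses:
        if addr not in entries:
            entries.append(addr)
    value = ",".join(entries)
    # Tools differ in which spelling they read, so set both.
    env["NO_PROXY"] = value
    env["no_proxy"] = value
    return value


if __name__ == "__main__":
    # 0.0.0.0 comes from the text above; extend the list as your setup needs.
    print(extend_no_proxy(["0.0.0.0"]))
```

Passing a plain dict as env makes it possible to preview the resulting value without touching the real environment.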
How to print debug info in GAE Cython SDK Algorithms?
The Python 3 print function is a convenient way to show useful debug info. Call print with the parameter flush=True so that the stream is forcibly flushed.
For more details, please refer to the Python documentation.
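A small illustration of flushed debug output in plain Python. The helper name debug and the "[debug]" prefix are hypothetical; inside an algorithm, the print(..., flush=True) call itself is the relevant part.

```python
import sys


def debug(*values, stream=None):
    """Print a debug line and flush immediately so it is not held in a buffer."""
    target = sys.stdout if stream is None else stream
    print("[debug]", *values, file=target, flush=True)


debug("superstep", 3)  # prints "[debug] superstep 3"
```

Without flush=True, output written to a pipe or log file may only appear once the buffer fills or the process exits.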
I do have many other questions…
Please feel free to contact us. You may reach us through Issues, ask questions in Discussions, or drop a message in Slack or DingTalk. We are happy to answer your questions promptly.