A Generic Benchmark Tool

We provide a benchmarking tool to evaluate the performance of the Interactive Engine. The tool acts as multiple clients that send queries (Gremlin or Cypher) to the server through the corresponding endpoint exposed by the engine. It reports performance metrics such as latency and throughput, along with the query results.

The tool has recently been extended to compare different systems across a variety of benchmark workloads, so that both query correctness and performance can be assessed side by side.

Benchmark Tool Overview

Here are some key features of the benchmark tool:

Multiple Query Languages. The tool supports several graph query languages, including Gremlin and Cypher, so each system can be configured with the language it supports.
Different Graph Systems. It supports comparison among multiple graph systems, such as GraphScope GIE and KuzuDB. More systems will be integrated in the future.
Versatile Workloads. The tool supports various workloads, including LDBC IC and BI, LSQB, and JOB.
Results Evaluation. It enables correctness validation and performance benchmarking for detailed comparisons.

Benchmark Tool Usage

The benchmark tool is provided here. The benchmark program sends mixed queries to the server: it reads query templates from queries and fills in their parameters using the values in substitution_parameters. The program uses a round-robin strategy to iterate over all enabled queries and their corresponding parameters.
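The round-robin dispatch described above can be sketched as follows. This is an illustrative Python sketch, not the tool's actual implementation: the `$name` placeholder syntax, the `round_robin` function, and the sample queries are all assumptions made for the example.

```python
from itertools import cycle, islice
from string import Template


def round_robin(templates, parameters, total):
    """Yield `total` (query_name, query_text) pairs, cycling over the enabled queries.

    templates:  {query_name: template text with $placeholders} (hypothetical syntax)
    parameters: {query_name: list of parameter dicts}; a missing or empty list
                means the query takes no parameters.
    """
    # Each query keeps its own cursor into its parameter list.
    cursors = {name: cycle(parameters.get(name) or [{}]) for name in templates}
    for name in islice(cycle(templates), total):
        params = next(cursors[name])
        yield name, Template(templates[name]).substitute(params)


# Hypothetical templates and parameters, for illustration only.
templates = {
    "q1": "MATCH (p:PERSON {id: $personId}) RETURN p.firstName",
    "q2": "MATCH (p:PERSON) RETURN count(p)",
}
parameters = {"q1": [{"personId": 1}, {"personId": 2}]}

for name, text in round_robin(templates, parameters, 5):
    print(name, text)
```

Each query advances through its own parameter list independently, so a query with few parameter sets simply wraps around while the others continue.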

Repository contents

- bin
    - bench.sh                          // script for running benchmark for queries
    - collect.sh                        // script for collecting benchmark results
- config
    - interactive-benchmark.properties  // configurations for running benchmark
- data
    - substitution_parameters           // query parameter files used to fill the query templates
    - expected_results                  // expected query results for the queries being run
- queries                               // query templates, including LDBC queries, LSQB queries, JOB queries, customized queries, etc.
- dbs                                   // other graph systems for comparison; currently, KuzuDB is supported
- example                               // an example to compare GraphScope GIE and Kuzu
- src                                   // source code of benchmark program

Note: queries with the prefix ldbc_query implement the official LDBC interactive complex reads, queries with the prefix bi_query implement the official LDBC business intelligence workload, queries with the prefix lsqb_query implement LDBC's Labelled Subgraph Query Benchmark (LSQB), and queries with the prefix job implement the JOB benchmark. Gremlin query files should use the suffix .gremlin, and Cypher query files should use the suffix .cypher. The parameters for the LDBC queries (scale factor 1) are generated by the official LDBC tools.
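The naming convention in the note can be expressed as a small lookup. The sketch below is illustrative only: the `classify` helper and the example file names are hypothetical, and only the prefixes and suffixes come from the note.

```python
from pathlib import PurePath

# Prefix -> workload, as listed in the note.
WORKLOAD_PREFIXES = {
    "ldbc_query": "LDBC interactive complex reads",
    "bi_query": "LDBC business intelligence",
    "lsqb_query": "LSQB",
    "job": "JOB",
}
# Suffix -> query language, as listed in the note.
LANGUAGE_SUFFIXES = {".gremlin": "gremlin", ".cypher": "cypher"}


def classify(filename):
    """Return (workload, language) for a query file name; None marks an unknown part."""
    path = PurePath(filename)
    workload = next(
        (w for prefix, w in WORKLOAD_PREFIXES.items() if path.name.startswith(prefix)),
        None,
    )
    language = LANGUAGE_SUFFIXES.get(path.suffix)
    return workload, language


# Hypothetical file names, for illustration only.
print(classify("ldbc_query_2.cypher"))
print(classify("bi_query_1.gremlin"))
```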

Building the benchmark

Build the benchmark program using Maven:

mvn clean package

All the binaries and queries are packed into target/benchmark-0.0.1-SNAPSHOT-dist.tar.gz. You can deploy the package to any machine that can connect to the Gremlin endpoint (which should be specified in interactive-benchmark.properties).

Running the benchmark

Extract the built target/benchmark-0.0.1-SNAPSHOT-dist.tar.gz and run the benchmark:

cd target
tar -xvf gaia-benchmark-0.0.1-SNAPSHOT-dist.tar.gz
cd gaia-benchmark-0.0.1-SNAPSHOT
./bin/bench.sh                             # run the benchmark program. You can also modify running configurations in config/interactive-benchmark.properties

With the example configuration file example/job_benchmark.properties, which compares GraphScope GIE and KuzuDB on the JOB Benchmark, the output looks like the following:

Start to benchmark system: GIE
QueryName[13a], Parameter[{}], ResultCount[1], ExecuteTimeMS[3638].
QueryName[32a], Parameter[{}], ResultCount[1], ExecuteTimeMS[266].
QueryName[9a], Parameter[{}], ResultCount[1], ExecuteTimeMS[3669].
QueryName[5c], Parameter[{}], ResultCount[1], ExecuteTimeMS[8603].
QueryName[3a], Parameter[{}], ResultCount[1], ExecuteTimeMS[613].
...
System: GIE; query count: 35; execute time(ms): xxx qps: xxx

Start to benchmark system: KuzuDb
QueryName[13a], Parameter[{}], ResultCount[1], ExecuteTimeMS[7068].
QueryName[32a], Parameter[{}], ResultCount[1], ExecuteTimeMS[253].
QueryName[9a], Parameter[{}], ResultCount[1], ExecuteTimeMS[5122].
QueryName[5c], Parameter[{}], ResultCount[1], ExecuteTimeMS[13623].
QueryName[3a], Parameter[{}], ResultCount[1], ExecuteTimeMS[4676].
...
System: KuzuDB; query count: 35; execute time(ms): xxx qps: xxx

Collecting the results

./bin/collect.sh                      # run the result collection program to collect the results and generate a performance comparison table

Based on the benchmark results above, the collected data and the final performance comparison table (times in milliseconds) look like the following:

QueryName	GIE Avg	GIE P50	GIE P90	GIE P95	GIE P99	GIE Count	KuzuDb Avg	KuzuDb P50	KuzuDb P90	KuzuDb P95	KuzuDb P99	KuzuDb Count
3a	613.00	613	613	613	613	1	4676.00	4676	4676	4676	4676	1
5c	8603.00	8603	8603	8603	8603	1	13623.00	13623	13623	13623	13623	1
9a	3669.00	3669	3669	3669	3669	1	5122.00	5122	5122	5122	5122	1
13a	3638.00	3638	3638	3638	3638	1	7068.00	7068	7068	7068	7068	1
32a	266.00	266	266	266	266	1	253.00	253	253	253	253	1
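To show how such a table can be derived from the benchmark log, here is a minimal Python sketch that parses the `QueryName[...] ... ExecuteTimeMS[...]` lines and computes the same columns. It is not the implementation of collect.sh: the `summarize` and `percentile` helpers are hypothetical, and the nearest-rank percentile method is an assumption, since the method the tool actually uses is not stated here.

```python
import math
import re
from collections import defaultdict

# Matches the per-query log lines shown in "Running the benchmark".
LINE = re.compile(r"QueryName\[(?P<name>[^\]]+)\].*ExecuteTimeMS\[(?P<ms>\d+)\]")


def percentile(sorted_values, p):
    """Nearest-rank percentile (an assumption; the tool may use another method)."""
    rank = max(1, math.ceil(p * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def summarize(log_lines):
    """Group ExecuteTimeMS by query name and compute Avg, P50..P99, and Count."""
    times = defaultdict(list)
    for line in log_lines:
        match = LINE.search(line)
        if match:
            times[match["name"]].append(int(match["ms"]))
    summary = {}
    for name, values in times.items():
        values.sort()
        summary[name] = {
            "Avg": sum(values) / len(values),
            **{f"P{p}": percentile(values, p) for p in (50, 90, 95, 99)},
            "Count": len(values),
        }
    return summary


log = [
    "QueryName[13a], Parameter[{}], ResultCount[1], ExecuteTimeMS[3638].",
    "QueryName[32a], Parameter[{}], ResultCount[1], ExecuteTimeMS[266].",
]
for name, stats in summarize(log).items():
    print(name, stats)
```

With a single run per query, as in the table above, the average and every percentile equal that one measurement.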

A more detailed end-to-end example is provided here.

Configurations

All detailed configurations can be found in config/interactive-benchmark.properties.

Below we highlight some key settings.

Configure Compared Systems

The tool can compare multiple graph systems. For instance, to compare GIE and Kuzu, configure the interactive-benchmark.properties file as follows. The benchmark tool then sends queries to both GIE and Kuzu, and collects and analyzes their results.

# The configuration for the compared systems.
# Currently, the supported systems include GIE and KuzuDb.
# For each system, starting from system.1 to system.n, the following configurations are needed:
# name: the name of the system, e.g., GIE, KuzuDb.
# client: the client of the system, e.g., for GIE, it can be cypher, gremlin; for KuzuDB, it should be kuzu.
# endpoint(optional): the endpoint of the system if the system provides a service endpoint, e.g., for GIE gremlin, it is 127.0.0.1:8182 by default.
# path(optional): the path of the database of the system if the system is a local database that needs to be accessed by path, e.g., for KuzuDb, it can be /path_to_db/example_db.
# Either endpoint or path needs to be provided, depending on the access method of the system.
system.1.name = GIE
system.1.client = cypher
system.1.endpoint = 127.0.0.1:7687
system.1.path =
system.2.name = KuzuDb
system.2.client = kuzu
system.2.endpoint =
system.2.path = ./job_db
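The `system.N.*` convention, including the rule that each system needs either an endpoint or a path, can be illustrated with a short Python sketch. This is not how the benchmark program itself loads its configuration; `parse_properties` and `load_systems` are hypothetical helpers that only handle simple `key = value` lines.

```python
def parse_properties(text):
    """Parse simple `key = value` lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def load_systems(props):
    """Collect system.1 .. system.n and check that each has an endpoint or a path."""
    systems = []
    index = 1
    while f"system.{index}.name" in props:
        prefix = f"system.{index}."
        system = {
            field: props.get(prefix + field, "")
            for field in ("name", "client", "endpoint", "path")
        }
        if not (system["endpoint"] or system["path"]):
            raise ValueError(f"{system['name']}: set either endpoint or path")
        systems.append(system)
        index += 1
    return systems


config = """
system.1.name = GIE
system.1.client = cypher
system.1.endpoint = 127.0.0.1:7687
system.1.path =
system.2.name = KuzuDb
system.2.client = kuzu
system.2.endpoint =
system.2.path = ./job_db
"""
for system in load_systems(parse_properties(config)):
    print(system)
```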

Configure Workloads

Currently, the commonly used benchmark workloads ic, bi, lsqb, and job are provided. Users can also add their own benchmarking queries to queries and the corresponding substitution parameters to substitution_parameters. Note that the file names of user-defined query templates should start with the prefix custom_query or custom_constant_query. The difference between the two is that custom_constant_query templates have no corresponding parameters.

Taking the JOB benchmark as an example, the related configuration is as follows:

# The configuration for the benchmarking workloads.
# the directory of query templates
query.dir = ./queries/cypher_queries/job
# the directory of query parameters. If the queries do not have parameters, leave it empty.
query.parameters.dir = 
# query file suffix, e.g., cypher (ldbc_query.cypher), gremlin (ldbc_query.gremlin), txt (ldbc_query.txt), etc.
query.file.suffix=cypher
# specify which kind of queries are sent.
# if query.all.enable is true, the benchmark will send all the queries in the query.dir.
query.all.enable=true
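As an illustration of how `query.dir` and `query.file.suffix` work together, the Python sketch below collects every template in a directory that carries the configured suffix. The `list_query_templates` helper is hypothetical, and deriving the query name from the file name without its suffix is an assumption, not documented behavior of the tool.

```python
import tempfile
from pathlib import Path


def list_query_templates(query_dir, suffix):
    """Return {query_name: template_text} for files in query_dir ending in .<suffix>."""
    templates = {}
    for path in sorted(Path(query_dir).glob(f"*.{suffix}")):
        # Assumption: the query name is the file name without its suffix.
        templates[path.stem] = path.read_text()
    return templates


# Demo with a throwaway directory standing in for a real query directory.
with tempfile.TemporaryDirectory() as query_dir:
    Path(query_dir, "3a.cypher").write_text("MATCH (n) RETURN count(n)")
    Path(query_dir, "3a.gremlin").write_text("g.V().count()")
    print(list_query_templates(query_dir, "cypher"))
```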

Configure Results Collection

By default, benchmark results are written to the interactive-benchmark.log and interactive-benchmark-report.md files, as shown in the sections “Running the benchmark” and “Collecting the results” above. If you also want to check query correctness under the current workloads, provide the corresponding configuration:

# the path of the expected query results, which is optional. if provided, the benchmarking results will be compared with the expected results.
query.expected.path = ./data/expected_results/job_expected.json

The benchmark tool will automatically execute the queries and compare the results for correctness.
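A correctness check of this kind can be sketched in Python as follows. The layout of the expected-results file is not described here, so the JSON shape used below (query name mapped to a list of result rows) is purely hypothetical, as are the `check_results` helper and the choice to ignore row order.

```python
import json


def check_results(expected_json, actual):
    """Compare actual results with expected ones, ignoring row order.

    expected_json: JSON text shaped as {query_name: [row, ...]} (hypothetical layout)
    actual:        {query_name: [row, ...]} collected from a benchmark run
    Returns {query_name: True/False} for every query that has an expected entry.
    """
    expected = json.loads(expected_json)

    def canonical(rows):
        # Serialize each row so rows can be sorted regardless of their type.
        return sorted(json.dumps(row, sort_keys=True) for row in rows)

    return {
        name: canonical(actual.get(name, [])) == canonical(rows)
        for name, rows in expected.items()
    }


# Made-up data, for illustration only.
expected_json = '{"q1": [["a", 1], ["b", 2]], "q2": [[42]]}'
actual = {"q1": [["b", 2], ["a", 1]], "q2": [[41]]}
print(check_results(expected_json, actual))
```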