A Generic Benchmark Tool

We provide a benchmarking tool to evaluate the performance of the Interactive Engine. The tool acts as multiple clients that send queries (Gremlin or Cypher) to the server through the corresponding endpoint exposed by the engine. It reports the query results along with performance metrics such as latency and throughput.

Notably, the tool has recently been extended to compare different systems across a variety of benchmark workloads, covering both query correctness and performance.

Benchmark Tool Overview

Here are some key features of the benchmark tool:

  • Multiple Query Languages. The tool supports several graph query languages, including Gremlin and Cypher, so each system can be configured with the language it supports.

  • Different Graph Systems. It supports comparison among multiple graph systems, such as GraphScope GIE and KuzuDB. More systems will be integrated in the future.

  • Versatile Workload. The tool supports various workloads, including LDBC IC and BI, LSQB, and JOB.

  • Results Evaluation. It enables correctness validation and performance benchmarking for detailed comparisons.

Benchmark Tool Usage

The benchmark tool is provided here. The benchmark program sends mixed queries to the server: it reads query templates from queries and fills in their parameters using the values in substitution_parameters. The program uses a round-robin strategy to iterate over all enabled queries with their corresponding parameters.
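The round-robin iteration described above can be sketched roughly as follows. This is a minimal illustration, not the tool's actual implementation: the function name round_robin_queries, the $name placeholder syntax (handled here by Python's string.Template), and the in-memory dictionaries are all assumptions made for the example.

```python
from itertools import cycle, islice
from string import Template


def round_robin_queries(templates, parameters, total):
    """Yield `total` (name, query_text) pairs, cycling over the enabled queries.

    templates:  dict mapping query name -> template text with $placeholders
                (the placeholder syntax is an assumption of this sketch)
    parameters: dict mapping query name -> list of parameter dicts
    """
    # One cursor per query, so each query walks through its own parameter list.
    # Queries without parameters get a single empty parameter set.
    cursors = {name: cycle(parameters.get(name) or [{}]) for name in templates}
    # Visit the queries in a fixed order, wrapping around until `total` is reached.
    for name in islice(cycle(templates), total):
        params = next(cursors[name])
        yield name, Template(templates[name]).substitute(params)
```

For example, with two templates and two parameter sets for the first one, five iterations visit the queries in the order q1, q2, q1, q2, q1, and the third visit to q1 wraps around to its first parameter set.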

Repository contents

- bin
    - bench.sh                          // script for running benchmark for queries
    - collect.sh                        // script for collecting benchmark results
- config
    - interactive-benchmark.properties  // configurations for running benchmark
- data
    - substitution_parameters           // query parameter files used to fill the query templates
    - expected_results                  // expected query results for the running queries
- queries                               // query templates including LDBC queries, LSQB queries, JOB queries, customized queries, etc.
- dbs                                   // Other graph systems for comparison. Currently, KuzuDB is supported.
- example                               // an example to compare GraphScope GIE and Kuzu
- src                                   // source code of benchmark program

Note: queries with the prefix ldbc_query implement the official LDBC interactive complex reads, queries with the prefix bi_query implement the official LDBC business intelligence workload, queries with the prefix lsqb_query implement LDBC’s labelled subgraph query benchmark, and queries with the prefix job implement the JOB Benchmark. Gremlin queries should use the suffix .gremlin, and Cypher queries the suffix .cypher. The corresponding parameters (factor 1) for the LDBC queries are generated by the official LDBC tools.
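The naming convention in the note can be expressed as a small lookup. The sketch below is illustrative only: the function classify_query_file and the example file names are hypothetical, and the benchmark program may resolve workloads differently.

```python
from pathlib import Path

# Prefixes and suffixes as listed in the note above.
WORKLOAD_PREFIXES = {
    "ldbc_query": "LDBC interactive complex reads",
    "bi_query": "LDBC business intelligence",
    "lsqb_query": "LSQB",
    "job": "JOB",
}
LANGUAGE_SUFFIXES = {".gremlin": "gremlin", ".cypher": "cypher"}


def classify_query_file(filename):
    """Return (workload, language) for a query file name; None where unknown."""
    path = Path(filename)
    language = LANGUAGE_SUFFIXES.get(path.suffix)
    workload = next(
        (label for prefix, label in WORKLOAD_PREFIXES.items()
         if path.name.startswith(prefix)),
        None,
    )
    return workload, language
```

A hypothetical file named ldbc_query_1.gremlin would be classified as an LDBC interactive complex read written in Gremlin.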

Building the benchmark

Build the benchmark program using Maven:

mvn clean package

All binaries and queries are packed into target/benchmark-0.0.1-SNAPSHOT-dist.tar.gz. You can deploy the package anywhere that can connect to the Gremlin endpoint (which should be specified in interactive-benchmark.properties).

Running the benchmark

Unpack the built target/benchmark-0.0.1-SNAPSHOT-dist.tar.gz and run the benchmark:

cd target
tar -xvf gaia-benchmark-0.0.1-SNAPSHOT-dist.tar.gz
cd gaia-benchmark-0.0.1-SNAPSHOT
./bin/bench.sh                             # run the benchmark program. You can also modify running configurations in config/interactive-benchmark.properties

With the example configuration file example/job_benchmark.properties, which compares GraphScope GIE and KuzuDB on the JOB Benchmark, example results look as follows:

Start to benchmark system: GIE
QueryName[13a], Parameter[{}], ResultCount[1], ExecuteTimeMS[3638].
QueryName[32a], Parameter[{}], ResultCount[1], ExecuteTimeMS[266].
QueryName[9a], Parameter[{}], ResultCount[1], ExecuteTimeMS[3669].
QueryName[5c], Parameter[{}], ResultCount[1], ExecuteTimeMS[8603].
QueryName[3a], Parameter[{}], ResultCount[1], ExecuteTimeMS[613].
...
System: GIE; query count: 35; execute time(ms): xxx qps: xxx

Start to benchmark system: KuzuDb
QueryName[13a], Parameter[{}], ResultCount[1], ExecuteTimeMS[7068].
QueryName[32a], Parameter[{}], ResultCount[1], ExecuteTimeMS[253].
QueryName[9a], Parameter[{}], ResultCount[1], ExecuteTimeMS[5122].
QueryName[5c], Parameter[{}], ResultCount[1], ExecuteTimeMS[13623].
QueryName[3a], Parameter[{}], ResultCount[1], ExecuteTimeMS[4676].
...
System: KuzuDB; query count: 35; execute time(ms): xxx qps: xxx
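To post-process output in the format shown above, the per-query lines can be parsed with a regular expression. This is a minimal sketch that assumes the log has exactly the line layout of the sample; parse_log is a hypothetical helper and is not part of the tool (bin/collect.sh is the supported way to collect results).

```python
import re

# Matches lines such as:
# QueryName[13a], Parameter[{}], ResultCount[1], ExecuteTimeMS[3638].
LINE_PATTERN = re.compile(
    r"QueryName\[(?P<name>[^\]]+)\], "
    r"Parameter\[(?P<params>.*)\], "
    r"ResultCount\[(?P<count>\d+)\], "
    r"ExecuteTimeMS\[(?P<time_ms>\d+)\]"
)


def parse_log(text):
    """Return {system: [(query_name, result_count, time_ms), ...]}."""
    results = {}
    system = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("Start to benchmark system:"):
            # A new system section begins; later query lines belong to it.
            system = line.split(":", 1)[1].strip()
            results.setdefault(system, [])
            continue
        match = LINE_PATTERN.search(line)
        if match and system is not None:
            results[system].append(
                (match["name"], int(match["count"]), int(match["time_ms"]))
            )
    return results
```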

Collecting the results

./bin/collect.sh                      # run the result collection program to collect the results and generate a performance comparison table

Based on the benchmark results, the collected data is summarized in the following performance comparison table (latencies in milliseconds):

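| QueryName | GIE Avg | GIE P50 | GIE P90 | GIE P95 | GIE P99 | GIE Count | KuzuDb Avg | KuzuDb P50 | KuzuDb P90 | KuzuDb P95 | KuzuDb P99 | KuzuDb Count |
|-----------|---------|---------|---------|---------|---------|-----------|------------|------------|------------|------------|------------|--------------|
| 3a        | 613.00  | 613     | 613     | 613     | 613     | 1         | 4676.00    | 4676       | 4676       | 4676       | 4676       | 1            |
| 5c        | 8603.00 | 8603    | 8603    | 8603    | 8603    | 1         | 13623.00   | 13623      | 13623      | 13623      | 13623      | 1            |
| 9a        | 3669.00 | 3669    | 3669    | 3669    | 3669    | 1         | 5122.00    | 5122       | 5122       | 5122       | 5122       | 1            |
| 13a       | 3638.00 | 3638    | 3638    | 3638    | 3638    | 1         | 7068.00    | 7068       | 7068       | 7068       | 7068       | 1            |
| 32a       | 266.00  | 266     | 266     | 266     | 266     | 1         | 253.00     | 253        | 253        | 253        | 253        | 1            |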
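The columns of the comparison table (Avg, P50, P90, P95, P99, Count) can be derived from a list of per-query latencies. The sketch below uses the nearest-rank percentile definition, which is an assumption: the percentile method used by collect.sh is not documented here, and summarize is a hypothetical helper.

```python
def nearest_rank(sorted_values, percent):
    """Nearest-rank percentile: the smallest value with at least
    `percent` percent of the samples at or below it."""
    # Integer ceiling of percent * n / 100, avoiding floating-point rounding.
    rank = max(1, (percent * len(sorted_values) + 99) // 100)
    return sorted_values[rank - 1]


def summarize(times_ms):
    """Summarize a non-empty list of latencies (in milliseconds)."""
    ordered = sorted(times_ms)
    return {
        "Avg": sum(ordered) / len(ordered),
        "P50": nearest_rank(ordered, 50),
        "P90": nearest_rank(ordered, 90),
        "P95": nearest_rank(ordered, 95),
        "P99": nearest_rank(ordered, 99),
        "Count": len(ordered),
    }
```

With a single sample per query, as in the table above, every percentile equals that one measurement, which is why all columns in a row show the same value.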

A more detailed end-to-end example is provided here.

Configurations

All detailed configurations can be found in config/interactive-benchmark.properties.

Below we highlight some key settings.

Configure Compared Systems

The tool supports comparisons between graph systems. For instance, to compare GIE and Kuzu, configure the interactive-benchmark.properties file as follows. The benchmark tool then sends queries to both GIE and Kuzu, and gathers and analyzes their results.

# The configuration for the compared systems.
# Currently, the supported systems include GIE and KuzuDb.
# For each system, starting from system.1 to system.n, the following configurations are needed:
# name: the name of the system, e.g., GIE, KuzuDb.
# client: the client of the system, e.g., for GIE, it can be cypher or gremlin; for KuzuDb, it should be kuzu.
# endpoint (optional): the endpoint of the system if the system provides a service endpoint, e.g., for GIE gremlin, it is 127.0.0.1:8182 by default.
# path (optional): the path of the database if the system is a local database that is accessed by path, e.g., for KuzuDb, it can be /path_to_db/example_db.
# Either endpoint or path must be provided, depending on how the system is accessed.
system.1.name = GIE
system.1.client = cypher
system.1.endpoint = 127.0.0.1:7687
system.1.path =
system.2.name = KuzuDb
system.2.client = kuzu
system.2.endpoint =
system.2.path = ./job_db
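The system.N.* keys above can be grouped per system, and the rule that either endpoint or path must be set can be checked. The following is a minimal sketch with a simplified properties parser; parse_properties and load_systems are hypothetical names, and the benchmark program itself may read and validate the file differently.

```python
import re


def parse_properties(text):
    """Parse simple `key = value` lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def load_systems(props):
    """Group system.N.* keys into one dict per system, ordered by N."""
    systems = {}
    for key, value in props.items():
        match = re.fullmatch(r"system\.(\d+)\.(name|client|endpoint|path)", key)
        if match:
            systems.setdefault(int(match[1]), {})[match[2]] = value
    ordered = [systems[index] for index in sorted(systems)]
    for system in ordered:
        # Either endpoint or path must be provided for each system.
        if not system.get("endpoint") and not system.get("path"):
            raise ValueError(f"system {system.get('name')!r} needs an endpoint or a path")
    return ordered
```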

Configure Workloads

Currently, we provide commonly used benchmark workloads, including ic, bi, lsqb, and job. Users can also add their own benchmark queries to queries and the corresponding substitution parameters to substitution_parameters. Note that the file names of user-defined query templates should start with the prefix custom_query or custom_constant_query. The difference is that custom_constant_query templates have no corresponding parameters.

Taking the JOB benchmark as an example, the related configuration is as follows:

# The configuration for the benchmarking workloads.
# the directory of query templates
query.dir = ./queries/cypher_queries/job
# the directory of query parameters. If the queries do not have parameters, leave it empty.
query.parameters.dir = 
# query file suffix, e.g., cypher (ldbc_query.cypher), gremlin (ldbc_query.gremlin), txt (ldbc_query.txt), etc.
query.file.suffix=cypher
# specify which kind of queries are sent.
# if query.all.enable is true, the benchmark will send all the queries in the query.dir.
query.all.enable=true
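As a rough illustration of how query.dir and query.file.suffix work together when query.all.enable is true, the sketch below loads every template with the configured suffix from a directory. The helper load_query_templates is hypothetical, and the assumption that the query name is the file name without its suffix is made for this example only.

```python
from pathlib import Path


def load_query_templates(query_dir, suffix):
    """Return {query_name: template_text} for all files ending in `.suffix`.

    query_dir and suffix correspond to the query.dir and query.file.suffix
    settings; files with any other suffix are ignored.
    """
    templates = {}
    for path in sorted(Path(query_dir).glob(f"*.{suffix}")):
        # Assumption: the query name is the file name without its suffix.
        templates[path.stem] = path.read_text(encoding="utf-8")
    return templates
```

With the configuration above, a call such as load_query_templates("./queries/cypher_queries/job", "cypher") would pick up only the Cypher templates in the JOB directory.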

Configure Results Collection

By default, benchmark results are written to the interactive-benchmark.log and interactive-benchmark-report.md files, as illustrated in the sections “Running the benchmark” and “Collecting the results” above. To additionally check query correctness under the current workload, provide the following configuration:

# the path of the expected query results (optional). If provided, the benchmark results will be compared with the expected results.
query.expected.path = ./data/expected_results/job_expected.json

The benchmark tool will automatically execute the queries and compare the results for correctness.
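Conceptually, the correctness check compares each query's actual result with its expected result. The sketch below is only an illustration under a strong assumption: it treats the expected-results file as a JSON object that maps query names to expected results, which may not match the real layout of job_expected.json. The function check_results is hypothetical.

```python
import json


def check_results(actual, expected_json):
    """Compare actual results with expected ones.

    actual:        dict mapping query name -> result (any JSON-compatible value)
    expected_json: JSON text, assumed to map query name -> expected result
    Returns a dict mapping query name -> "match", "mismatch",
    or "no expected result".
    """
    expected = json.loads(expected_json)
    report = {}
    for name, result in actual.items():
        if name not in expected:
            report[name] = "no expected result"
        elif result == expected[name]:
            report[name] = "match"
        else:
            report[name] = "mismatch"
    return report
```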