Tutorial: Develop your Algorithm in C++ with PIE Model#

In this tutorial, we will demonstrate how to develop an algorithm using the PIE model in C++ programming language.

Prerequisites:#

  • Familiarity with the PIE programming model

  • Install GraphScope by using the command pip3 install graphscope

Then, We will guide you through developing a simple PIE algorithm that computes the degree for each vertex in the graph. The full API can be found in Analytical Engine API Doc and libgrape-lite API.

Step 1: Define the context class#

First, we create a context class that inherits from grape::VertexDataContext. This class will store and manage algorithm-specific data and parameters.

template <typename FRAG_T>
class MyAppContext : public grape::VertexDataContext<FRAG_T, uint64_t> {
  using oid_t = typename FRAG_T::oid_t;
  using vid_t = typename FRAG_T::vid_t;
  using vertex_t = typename FRAG_T::vertex_t;

 public:
  explicit MyAppContext(const FRAG_T& fragment)
      : grape::VertexDataContext<FRAG_T, uint64_t>(fragment, true),
        result(this->data()) {}

  void Init(grape::ParallelMessageManager& messages, int param1) {
    this->step = 0;
    this->param1 = param1;
    result.SetValue(0);
  }

  int step = 0;
  int param1 = 0;
  typename FRAG_T::template vertex_array_t<uint64_t>& result;
};

As shown in the code, the MyAppContext class defines two member variables called step and param1 to store the current superstep and algorithm-specific parameter, respectively. And we also define a member variable named result with uint64_t type to store the degree for each vertex in the fragment. The Init method is used to initialize the context of the computation. In current example, we initialize the step and param1 variables to zero and the algorithm-specific parameter. We also set the result to zero for each vertex.

Step 2: Define the Algorithm class#

Next, we define the MyApp class, which is responsible for implementing the algorithm by using the MyAppContext class.

template <typename FRAG_T>
class MyApp : public grape::ParallelAppBase<FRAG_T, MyAppContext<FRAG_T>>,
              public grape::ParallelEngine,
              public grape::Communicator {
 public:
  INSTALL_PARALLEL_WORKER(MyApp<FRAG_T>, MyAppContext<FRAG_T>, FRAG_T)
  static constexpr grape::MessageStrategy message_strategy =
      grape::MessageStrategy::kSyncOnOuterVertex;
  static constexpr grape::LoadStrategy load_strategy =
      grape::LoadStrategy::kBothOutIn;
  using vertex_t = typename fragment_t::vertex_t;

  void PEval(const fragment_t& fragment, context_t& context,
             message_manager_t& messages) {
    messages.InitChannels(thread_num());
    // We put all compute logic in IncEval phase, thus do nothing but force continue.
    messages.ForceContinue();
  }

  void IncEval(const fragment_t& fragment, context_t& context,
               message_manager_t& messages) {
    // superstep
    ++context.step;

    // Process received messages sent by other fragment here.
    {
      messages.ParallelProcess<fragment_t, double>(
          thread_num(), fragment,
          [&context](int tid, vertex_t u, const double& msg) {
            // Implement your logic here
          });
    }

    // Compute the degree for each vertex, set the result in context
    auto inner_vertices = fragment.InnerVertices();
    ForEach(inner_vertices.begin(), inner_vertices.end(),
            [&context, &fragment](int tid, vertex_t u) {
              context.result[u] =
                  static_cast<uint64_t>(fragment.GetOutgoingAdjList(u).Size() +
                                        fragment.GetIncomingAdjList(u).Size());
            });
  }
};

The MyApp class inherits from the grape::ParallelAppBase, which provides the basic functionality for implementing a parallel graph algorithm. It also inherits from the grape::ParallelEngine and grape::Communicator classes, which provide the communication and parallel processing capabilities. The MyApp class defines two static constexpr variables called message_strategy and load_strategy, these variables specify the message strategy and load strategy used in the computation. For more information please refer to the libgrape-lite Doc.

The PEval method is used to implement the partial evaluation phase of the computation. In current example, we initialize the communication channels and do nothing else, instead, we put the computing logic into IncEval method.

The IncEval method is used to implement the incremental evaluation phase of the computation. In this method, we increment the superstep counter in the context, process received messages sent by other fragments, and compute the degree for each vertex in the graph. The ForEach method is used to iterate over the inner vertices of the fragment, and the GetOutgoingAdjList and GetIncomingAdjList methods are used to get the outgoing and incoming adjacency lists of each vertex. The result for each vertex then set in the context.

Step 3: Package the Algorithm#

To make the algorithm runnable on the GraphScope, we need to package it as a .gar file. The package should include the above C++ files and a configuration file named .gs_conf.yaml with the following content:

app:
- algo: my_app
  type: cpp_pie
  class_name: gs::MyApp
  src: my_app.h
  context_type: vertex_data
  compatible_graph:
    - gs::ArrowProjectedFragment
    - gs::DynamicProjectedFragment
    - vineyard::ArrowFragment

the codebase structure is as follows:

.
├── my_app.h ➝ algorithm logics
└── my_app_context.h ➝ context with auxiliary data for the algorithm
└── .gs_conf.yaml ➝ configuration file 

then, we package the algorithm by command: zip -jr 'my_app.gar' '*.h' ''.gs_conf.yaml'

Step 4: Run the .gar file on GraphScope#

Use the following Python code to run the algorithm on GraphScope.

import graphscope

from graphscope.framework.app import load_app
from graphscope.dataset import load_p2p_network

sess = graphscope.session()
simple_graph = load_p2p_network(sess)._project_to_simple()

my_app = load_app('<path_to_your_gar_resource>')
result = my_app(simple_graph, 10)  # pass 10 as param1 defined in 'MyAppContext.h'

print(result.to_numpy('r'))

GraphScope C++ SDK with GitHub Template#

To help you develop your algorithms more efficiently, we provide a C++ template library to help you get started with your algorithm development. It includes examples and best practices for implementing PIE algorithms in C++.