Erik Rigtorp

Generating std::ostream &operator<< for C++ enums and structs using libClang

In this article I describe how to automatically generate implementations of std::ostream &operator<< for enums and structs from C++ source code using libClang. I use source code generation, specifically source-to-source translation, to eliminate the need to write the implementations by hand.

C++ doesn’t support full static reflection (also called compile-time reflection), so we are forced to write boilerplate code to print every member of a struct. Fortunately, it’s surprisingly easy to use libClang to parse C++ code into its abstract syntax tree (AST) and use that tree to generate code for printing every member.

Given code like this:

enum class foo { a, b };
struct bar { foo x; int y; };

I’ll show how to automatically generate code like this:

std::ostream &operator<<(std::ostream &os, foo v) {
  switch(v) {
    case foo::a: os << "a"; break;
    case foo::b: os << "b"; break;
  }
  return os;
}
std::ostream &operator<<(std::ostream &os, const bar &v) {
  os << "x=" << v.x;
  os << ", y=" << v.y;
  return os;
}
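
As a quick check of what these operators produce, the sketch below pairs them with the definitions above and streams values into a std::ostringstream. The helper to_string_via_ostream and the demo function are illustrative names and not part of the original tool.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

enum class foo { a, b };
struct bar { foo x; int y; };

// Generated code, as shown above.
std::ostream &operator<<(std::ostream &os, foo v) {
  switch (v) {
    case foo::a: os << "a"; break;
    case foo::b: os << "b"; break;
  }
  return os;
}
std::ostream &operator<<(std::ostream &os, const bar &v) {
  os << "x=" << v.x;
  os << ", y=" << v.y;
  return os;
}

// Hypothetical helper: formats any streamable value into a string.
template <typename T>
std::string to_string_via_ostream(const T &value) {
  std::ostringstream os;
  os << value;
  return os.str();
}

// Illustrative usage: the struct printer reuses the enum printer for x.
void demo() {
  assert(to_string_via_ostream(foo::a) == "a");
  assert(to_string_via_ostream(bar{foo::b, 42}) == "x=b, y=42");
}
```

Because the struct printer simply streams each member, it picks up the enum printer automatically through overload resolution.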

First we define AST matchers for enums and structs:

auto EnumMatcher = enumDecl(isExpansionInMainFile()).bind("enum");

auto RecordMatcher =
    recordDecl(isExpansionInMainFile(), unless(isImplicit())).bind("record");

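We then define a MatchCallback that prints the std::ostream &operator<< implementations. Note that the struct printer below also wraps the fields in the record name, so bar prints as bar(x=..., y=...):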

class Printer : public MatchFinder::MatchCallback {
public:
  void run(const MatchFinder::MatchResult &Result) override {
    if (const auto *Enum = Result.Nodes.getNodeAs<EnumDecl>("enum")) {
      // Enum->dump();
      // Name the parameter v so that it does not shadow the enum type name.
      outs() << "std::ostream &operator<<(std::ostream &os, "
             << Enum->getName() << " v) {\n"
             << "  switch (v) {\n";
      for (const EnumConstantDecl *EnumConstant : Enum->enumerators()) {
        // Emit a break after each case so the generated switch does not fall through.
        outs() << "  case " << EnumConstant->getQualifiedNameAsString()
               << ": os << \"" << EnumConstant->getName() << "\"; break;\n";
      }
      outs() << "  }\n  return os;\n}\n";
    }
    if (const auto *Record = Result.Nodes.getNodeAs<RecordDecl>("record")) {
      // Record->dump();
      outs() << "std::ostream & operator<<(std::ostream &os, const "
             << Record->getName() << " &v) {\n"
             << "  os << \"" << Record->getName() << "(\";\n";
      for (const FieldDecl *Field : Record->fields()) {
        bool IsFirst = Field == *Record->field_begin();
        outs() << "  os << \"" << (IsFirst ? "" : ", ") << Field->getName()
               << "=\" << v." << Field->getName() << ";\n";
      }
      outs() << "  os << \")\";\n  return os;\n}\n";
    }
  }
};
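
The string-building part of the callback can be tried out without Clang. The following is a simplified sketch that takes plain strings in place of EnumDecl and RecordDecl nodes and emits the output format shown at the top of the article. The function names generate_enum_printer and generate_struct_printer are hypothetical and do not appear in genostream.cpp.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the enum branch: `name` plays the role of the
// enum's name and `enumerators` the role of Enum->enumerators().
std::string generate_enum_printer(const std::string &name,
                                  const std::vector<std::string> &enumerators) {
  std::ostringstream out;
  out << "std::ostream &operator<<(std::ostream &os, " << name << " v) {\n"
      << "  switch (v) {\n";
  for (const std::string &e : enumerators) {
    // Each case ends with break so the generated switch does not fall through.
    out << "    case " << name << "::" << e << ": os << \"" << e
        << "\"; break;\n";
  }
  out << "  }\n  return os;\n}\n";
  return out.str();
}

// Hypothetical stand-in for the record branch: `fields` plays the role of
// Record->fields().
std::string generate_struct_printer(const std::string &name,
                                    const std::vector<std::string> &fields) {
  std::ostringstream out;
  out << "std::ostream &operator<<(std::ostream &os, const " << name
      << " &v) {\n";
  bool is_first = true;
  for (const std::string &f : fields) {
    // Every field after the first is prefixed with ", " in the output.
    out << "  os << \"" << (is_first ? "" : ", ") << f << "=\" << v." << f
        << ";\n";
    is_first = false;
  }
  out << "  return os;\n}\n";
  return out.str();
}

// Illustrative check: one case label is emitted per enumerator.
void demo() {
  const std::string code = generate_enum_printer("foo", {"a", "b"});
  assert(code.find("case foo::a: os << \"a\"; break;") != std::string::npos);
  assert(code.find("case foo::b: os << \"b\"; break;") != std::string::npos);
}
```

Keeping the text generation separate from the AST traversal like this also makes it possible to unit test the output format without a compilation database.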

Finally we define the main function that drives libTooling to parse the input source and invoke the MatchCallback for matching AST elements:

static cl::OptionCategory GenOstreamCategory("genostream options");
static cl::extrahelp CommonHelp(CommonOptionsParser::HelpMessage);

int main(int argc, const char **argv) {
  CommonOptionsParser OptionsParser(argc, argv, GenOstreamCategory);
  ClangTool Tool(OptionsParser.getCompilations(),
                 OptionsParser.getSourcePathList());

  Printer Printer;
  MatchFinder Finder;
  Finder.addMatcher(EnumMatcher, &Printer);
  Finder.addMatcher(RecordMatcher, &Printer);

  return Tool.run(newFrontendActionFactory(&Finder).get());
}

Download the full source code for genostream.cpp.

Building libTooling tools can be a bit tricky, but this works on Fedora 32:

$ g++ -std=c++17 -Wall genostream.cpp -o genostream -lclang-cpp -lLLVM

See the libTooling documentation for more information on how to compile libTooling tools.

Now we can run it on some source file:

$ genostream -p build src/foo.cpp

build should be a directory containing a compile_commands.json that covers src/foo.cpp. If the tool fails to find stddef.h or similar headers, either move the binary to the same directory as clang or specify the clang resource directory on the command line with --extra-arg="-resource-dir /usr/lib64/clang/10.0.1/". See https://clang.llvm.org/docs/LibTooling.html#builtin-includes.
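
For reference, a minimal compile_commands.json could look like the fragment below. The directory, flags, and output name are placeholders chosen for illustration; only the three keys come from the JSON compilation database format. With CMake, the file can be generated by configuring with -DCMAKE_EXPORT_COMPILE_COMMANDS=ON.

```json
[
  {
    "directory": "/home/user/project/build",
    "command": "g++ -std=c++17 -Iinclude -c ../src/foo.cpp -o foo.o",
    "file": "../src/foo.cpp"
  }
]
```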

This approach can also be extended to generate other serialization and deserialization functions, for example to_json/from_json functions for JSON for Modern C++ or serialize methods for Cereal and Boost.Serialization.
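
As a rough illustration of such an extension, the sketch below emits a to_json overload in the style that JSON for Modern C++ documents for user-defined types. It works on plain strings instead of AST nodes, and the name generate_to_json is hypothetical. How members of enum type are serialized is left to the library's defaults or to a separately generated overload.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical generator: emits the text of a to_json overload for a struct
// with the given field names. The emitted code assumes the nlohmann::json
// type is available where the generated text is compiled.
std::string generate_to_json(const std::string &name,
                             const std::vector<std::string> &fields) {
  std::ostringstream out;
  out << "void to_json(nlohmann::json &j, const " << name << " &v) {\n"
      << "  j = nlohmann::json{";
  bool is_first = true;
  for (const std::string &f : fields) {
    // One {"key", v.key} pair per field, separated by commas.
    out << (is_first ? "" : ", ") << "{\"" << f << "\", v." << f << "}";
    is_first = false;
  }
  out << "};\n}\n";
  return out.str();
}

// Illustrative check: each field appears as a key/value pair.
void demo() {
  const std::string code = generate_to_json("bar", {"x", "y"});
  assert(code.find("{\"x\", v.x}, {\"y\", v.y}") != std::string::npos);
}
```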

I also have an older implementation of the same tool in Python: genostream.py. The Python libClang bindings are more limited than the C++ API, so if you need complicated AST matchers it’s best to use the C++ API.

References