Overview
What is gRPC anyway?
I'm not attempting a general introduction to gRPC. See the links at the end of this page if you need to catch up ;)
So, what is gRPC? In my opinion, it's an internal project at Google that has been released to the public as open source, but without proper documentation for advanced use. It's relatively simple to create a proof-of-concept project in C++ or Python (the two languages I have used with gRPC recently), but when you wander into the "Bay of Asynchronous Requests", it's not in any way obvious what to do. Let's be clear about this: using gRPC with C++ for production-grade code is hard. It takes time to understand how Google's engineers envision that the generated server interfaces and client-side stubs will be used. It takes time to set up a project with CMake to correctly generate the C++ files. It takes considerable effort if you need to automatically build the protobuf and gRPC libraries and utilities yourself. I'll not touch the latter in this blog series, but will use the libraries and utilities that are available from my Linux distributions (I use Debian and Kubuntu).
From a C++ perspective, the async APIs provide a solid foundation for implementing formidable servers to handle gRPC clients. Google has prioritized high performance over type safety and C++ best practices. That's an understandable approach. I assume that they use gRPC in their massive server infrastructure. Electric power is expensive, and instructions saved in very hot paths in their server applications will add up to significant savings. Using the async interfaces is, however, hard. This is especially true for the legacy async interface that most people are probably still using. Each RPC (Remote Procedure Call) requires lots of boilerplate code. Again, this makes sense for Google, which has lots of programmers and probably a large number of experts on gRPC. I would not be surprised to learn about internal tools that can generate most, if not all, of the boilerplate code.
What are the cons and pros with the legacy async interface?
Positives
- gRPC appears to me as significantly faster and more robust than any HTTP REST solution for C++ that I am familiar with.
- It's possible to scale a server to handle a really large number of requests per second. (I have yet to explore its actual limits in requests per second with various message sizes and numbers of clients).
- The transport layer is abstracted away to a degree that you don't normally even think about it.
- You can use streams in both directions, something that is not available with HTTP REST. (Some popular services use websockets as a substitute for a real solution. Personally, I firmly believe that websockets were created by Satan himself on a particularly bad day).
- You don't pay the overhead of sending the attribute name with every object that goes over the wire, the way you do with JSON or XML. gRPC builds on protobuf, which identifies fields by number in a compact binary encoding and is much more efficient.
- It's easy to evolve a gRPC interface by adding new methods and expanding the data types. Done right, you can allow old and new applications to coexist, which is a requirement for highly available cloud applications.
- gRPC itself provides excellent performance and is well suited to use in C++ applications.
- You get full control of your threads. If you need to handle lots and lots of simultaneous long-lived streaming clients, you can put partitions of clients on separate threads - for example a few thousand clients per thread.
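To make the point about attribute names a bit more concrete, here is a minimal sketch that hand-encodes a single integer field the way protobuf's wire format does: a tag holding the field number and wire type, followed by the value as a varint. The helper names (`append_varint`, `encode_varint_field`) and the message in the comment are hypothetical and only for illustration. In real code you would use the classes generated by `protoc`, not something like this.

```cpp
#include <cstdint>
#include <vector>

// Append an unsigned integer using protobuf's base-128 varint encoding:
// 7 payload bits per byte, with the high bit set on every byte except the last.
void append_varint(std::vector<std::uint8_t>& out, std::uint64_t value) {
    while (value >= 0x80) {
        out.push_back(static_cast<std::uint8_t>((value & 0x7F) | 0x80));
        value >>= 7;
    }
    out.push_back(static_cast<std::uint8_t>(value));
}

// Encode one varint-typed field: the tag is (field_number << 3) | wire_type,
// where wire type 0 means "varint". The field *name* never goes on the wire.
// Hypothetical schema for illustration: message Reading { uint64 sensor_id = 1; }
std::vector<std::uint8_t> encode_varint_field(std::uint32_t field_number,
                                              std::uint64_t value) {
    std::vector<std::uint8_t> out;
    append_varint(out, static_cast<std::uint64_t>(field_number) << 3); // wire type 0
    append_varint(out, value);
    return out;
}
```

With these helpers, `encode_varint_field(1, 150)` produces the three bytes `08 96 01`. The equivalent JSON text `{"sensor_id":150}` is 17 bytes, most of which is the attribute name and punctuation.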
Negatives
- Very complex to use efficiently. This probably doesn't matter for large, experienced teams with sufficient funding - but it hurts smaller teams and lone coders.
- Does not follow C++ best practices. This makes it harder to write bug-free, safe code.
- Hard to find good documentation for advanced use.
- It takes significant time to implement each RPC. (This is also true for HTTP REST requests, if you need to maintain swagger documentation and handle security properly).
- The gRPC libraries and generated code create some bloat. The smallest x64 Linux server binary I have been able to create - implementing just one unary RPC call, compiled for release with g++ 13 and then stripped - ended up at 457 KB. The smallest client binary was 372 KB. This makes it less than ideal for farms of tiny microservices running under, for example, Kubernetes.
What are the cons and pros with the newer async callback interface?
Positives
- All of the positives above, except control over the threads.
- Significantly simpler to use than the legacy async interface ;)
- Here, not only the transport layer - but also the "executor layer" is abstracted away. You get actual, relevant events to work with.
- The user no longer has to consider how to best use multiple threads and queues to get reasonable performance. The gRPC library takes care of all these details.
- "Time to market" for a project adding a gRPC interface is shortened by maybe 80%.
- The users must understand the constraints, but they don't need to be experts in gRPC in order to use it.
Negatives
- No support for C++ coroutines. The callback interface looks suspiciously close to what a Java interface would look like ;)
- It provides developers with too many opportunities to mess up with resource leaks, race conditions and performance issues. Using the interface correctly is still hard.
- The interface is inconsistent, and it uses pointers where C++ best practices would suggest references.
Tools and requirements
I will be using CMake and C++20. I test my code with g++ 12 and 13, and clang 15. In addition to protobuf and gRPC, I will use a recent version of the Boost libraries. For logging, I use a header-only log library, logfault, that is handled automatically by CMake.
CMake and gRPC
CMake can deal with the code generation from .proto files required to use gRPC in your project. I have found it useful to create a static library for the protobuf and gRPC details, and just add that to the dependencies of my other libraries and/or executables.
Example:
project(proto LANGUAGES CXX)

INCLUDE(FindProtobuf)
FIND_PACKAGE(Protobuf REQUIRED)
find_package(gRPC CONFIG REQUIRED)
set(GRPC_LIB gRPC::grpc gRPC::grpc++)

file(GLOB PROTO_FILES "${PROJECT_SOURCE_DIR}/*.proto")
message(" using PROTO_FILES: ${PROTO_FILES}")

add_library(${PROJECT_NAME} STATIC ${PROTO_FILES})
target_link_libraries(${PROJECT_NAME}
    PUBLIC
    $<BUILD_INTERFACE:protobuf::libprotobuf>
    $<BUILD_INTERFACE:${GRPC_LIB}>
)

target_include_directories(proto PUBLIC ${CMAKE_CURRENT_BINARY_DIR})
get_target_property(grpc_cpp_plugin_location gRPC::grpc_cpp_plugin LOCATION)
protobuf_generate(TARGET ${PROJECT_NAME} LANGUAGE cpp)
protobuf_generate(TARGET ${PROJECT_NAME} LANGUAGE grpc GENERATE_EXTENSIONS .grpc.pb.h .grpc.pb.cc PLUGIN "protoc-gen-grpc=${grpc_cpp_plugin_location}")
The latest version of this example is here:
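Then, when you make something that uses the gRPC stuff, just mention proto in that project's CMakeLists.txt:
# Note: add_dependencies only controls build order. To pick up the generated
# headers and libraries, the target must also link proto via target_link_libraries.
add_dependencies(${PROJECT_NAME}
    proto
    ...
    )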
gRPC general information
Some things to read, if you are unfamiliar with gRPC.
Official info and tutorials from the gRPC team:
- Google's introduction
- A simple client and server
- An async server and client
- Proposals - like RFCs for gRPC
- C++ callback-based asynchronous API
Other relevant links:
- Lessons learnt from writing asynchronous streaming gRPC services in C++. Explains a problem with the event queue where the tags are put on the top of the queue, instead of at the end.
- asio-grpc: This library aims at integrating gRPC and asio continuations. It looks like a nice idea, but the number of source-code files and commits to the project says a bit about the complexity required to accomplish this.