Optimizing ROS2 Applications with Streaming Executor: A Performance Analysis
The recent ROS Meetup at the Robert Bosch Center for Research and Advance Development in Renningen brought together robotics enthusiasts and ROS developers eager to explore the latest developments in the Robot Operating System 2 (ROS2). Among the presentations, one highlight was an in-depth discussion of ROS2 executors, led by Pablo Ghiglino from Klepsydra Technologies. This article covers the key insights and findings from Klepsydra’s presentation on the streaming executor and its advantages in various scenarios.

Understanding ROS2 Executors
Before diving into the streaming executor, it’s essential to understand the role of executors in ROS2 applications. Executors manage the flow of data by coordinating and scheduling the callbacks of subscriptions, services, and timers across the nodes they serve. Rather than maintaining their own queues of messages and callbacks, ROS2 executors consume messages directly from the middleware (DDS) queues. They then dispatch these messages for execution to one or more threads.

Klepsydra’s presentation covered three well-known types of ROS2 executors:
- Single Threaded Executor: This executor operates with a single thread that queries the middleware and executes callbacks sequentially. It periodically scans the application’s structure to update nodes, subscriptions, services, and more. While simple, it may not be the most suitable choice for resource-intensive workloads.
- Static Single Threaded Executor: The static single-threaded executor scans the application structure only once instead of periodically. It therefore assumes that all nodes, callback groups, timers, and subscriptions are created before the spin function is called. Despite its simplicity, this executor has proven effective for lightweight node work.
- Multi-Threaded Executor: The multi-threaded executor creates multiple threads to execute callbacks in parallel, optimizing performance for demanding workloads. However, managing threads can be complex and challenging.
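
To make the difference between sequential and parallel dispatch concrete, here is a minimal toy model in plain Python. It does not use the rclcpp or rclpy executor APIs. The function names (`spin_single_threaded`, `spin_multi_threaded`) and the simulated 50 ms of node work are assumptions made purely for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def make_callback(name, work_s, log):
    """Return a callback that simulates `work_s` seconds of node work."""
    def callback(msg):
        time.sleep(work_s)  # stand-in for real processing
        log.append((name, msg))
    return callback


def spin_single_threaded(ready):
    """Run each ready (callback, msg) pair one after another on the calling thread."""
    for callback, msg in ready:
        callback(msg)


def spin_multi_threaded(ready, num_threads=4):
    """Hand the ready (callback, msg) pairs to a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        futures = [pool.submit(callback, msg) for callback, msg in ready]
        for future in futures:
            future.result()  # wait for completion and surface any exception


def timed(spin, ready):
    """Return how long `spin` takes to process all ready pairs, in seconds."""
    start = time.perf_counter()
    spin(ready)
    return time.perf_counter() - start


log = []
ready = [(make_callback(f"cb{i}", 0.05, log), i) for i in range(4)]
sequential_s = timed(spin_single_threaded, ready)
parallel_s = timed(spin_multi_threaded, ready)
print(f"sequential: {sequential_s:.2f} s, parallel: {parallel_s:.2f} s")
```

In this sketch the sequential version takes roughly the sum of all callback durations, while the pooled version overlaps them. The price is the thread coordination that the multi-threaded executor has to manage.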

The Highlight: The Streaming Executor
Klepsydra’s presentation highlighted the unique streaming executor, which offers distinct advantages over its counterparts. Here’s how it works:
- Publisher-Subscriber Pairing: In the streaming executor, a publisher-subscriber pair is created for each topic required by a ROS2 node. These pairs are internally identified by the node name and the topic name. This design ensures that even if two different nodes publish to the same topic, they are managed independently, enhancing efficiency.
- Event Loop Management: The streaming executor manages all publisher-subscriber pairs associated with topics belonging to the same node using a shared event loop. As a result, subscribers are efficiently handled by the thread associated with their respective event loop. This approach eliminates the need for complex multi-threading management and works seamlessly on both single-core and multi-core systems.
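
The two ideas above can be sketched in a few lines of plain Python. This is a toy model under stated assumptions, not Klepsydra's implementation or API: the class names `EventLoop` and `StreamingExecutorSketch`, the `deliver` method, and the node and topic names are all hypothetical. It only illustrates pairs keyed by (node name, topic name) and one event loop thread per node.

```python
import queue
import threading


class EventLoop:
    """One worker thread that drains a queue of (callback, message) events."""

    def __init__(self, name):
        self._events = queue.Queue()
        self._thread = threading.Thread(target=self._run, name=name, daemon=True)
        self._thread.start()

    def post(self, callback, msg):
        self._events.put((callback, msg))

    def stop(self):
        self._events.put(None)  # sentinel, handled after all pending events
        self._thread.join()

    def _run(self):
        while True:
            event = self._events.get()
            if event is None:
                return
            callback, msg = event
            callback(msg)


class StreamingExecutorSketch:
    """Hypothetical model: one pair per (node, topic), one event loop per node."""

    def __init__(self):
        self._loops = {}  # node name -> EventLoop
        self._pairs = {}  # (node name, topic name) -> subscriber callback

    def add_subscription(self, node, topic, callback):
        if node not in self._loops:
            self._loops[node] = EventLoop(node)
        self._pairs[(node, topic)] = callback

    def deliver(self, topic, msg):
        """Forward an incoming message to every pair registered for the topic."""
        for (node, pair_topic), callback in self._pairs.items():
            if pair_topic == topic:
                self._loops[node].post(callback, msg)

    def shutdown(self):
        for loop in self._loops.values():
            loop.stop()


seen = []


def record(tag):
    """Build a callback that logs its tag, the executing thread, and the message."""
    def callback(msg):
        seen.append((tag, threading.current_thread().name, msg))
    return callback


executor = StreamingExecutorSketch()
executor.add_subscription("planner", "/lidar", record("planner:/lidar"))
executor.add_subscription("planner", "/camera", record("planner:/camera"))
executor.add_subscription("logger", "/lidar", record("logger:/lidar"))
for i in range(3):
    executor.deliver("/lidar", i)
    executor.deliver("/camera", i)
executor.shutdown()
```

Both `planner` subscriptions run on the single `planner` thread, so the callbacks of one node never race each other, while `logger` processes the same `/lidar` messages independently on its own loop.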
Real-World Results: Performance Comparison

Klepsydra’s presentation compared the executors under different workloads. The streaming executor came out ahead in specific scenarios, though not in all of them.
- Small Node Work: For lightweight workloads, the static single-threaded executor emerged as the top performer, showcasing that simplicity can translate into impressive results.
- Scaling with Workload: As the workload increased, the streaming executor demonstrated remarkable performance, closely followed by the static single-threaded executor. The streaming executor’s consistency in delivering optimal results was particularly noteworthy, as it showcased the power of stable application topology.
Performance Benchmark Scenario
- The benchmark was based on the Autoware reference system, which emulates a realistic driving application.
- All measurements were taken on a Raspberry Pi 4B with 4 GB of RAM, running ROS2 Galactic on Ubuntu 20.04 at a constant CPU frequency of 1.50 GHz.
- The reference system was used in a compatible setup, without CPU isolation.
Processors tested:
- Raspberry Pi 4 (reference processor for the RTWG)
- Unibap’s iXIO (NASA and Blue Origin Testbed)
- Teledyne e2v LS1046
If you are interested in how these performance measurements were made, check this section.
Originally published at https://www.linkedin.com.