Comments (15)
I think that we can take some ideas from Connext "zero copy transfer over shared memory":
- https://community.rti.com/static/documentation/connext-micro/3.0.0/doc/html/usersmanual/zerocopy.html
- https://community.rti.com/static/documentation/connext-dds/current/doc/manuals/connext_dds/html_files/RTI_ConnextDDS_CoreLibraries_UsersManual/Content/UsersManual/SendingLDZeroCopy.htm
That's actually interprocess communication over shared memory, but something similar can be replicated using a buffer instead of a piece of shared memory.
The basic idea is that you ask the publisher for a new message instead of allocating a unique_ptr:

    auto msg = publisher->get_new_message();
    if (msg != nullptr) {
      msg->data = "asd";
      publisher->publish(msg);
    }

Currently, message lifetime can be extended to be longer than the scope of the callback (in cpp). That would not be possible if we go ahead with something like this (or at least, it will be really hard to implement that feature).
The implementation could live in rcl or rmw, I'm not sure what would be better.
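A minimal sketch of the "ask the publisher for a message" idea, assuming a fixed pool owned by the publisher. `PoolPublisher`, `StringMsg`, `release()` and `delivered()` are hypothetical names for illustration and are not part of rclcpp, rcl or rmw. It also shows why a message cannot outlive its slot: once the slot is released, the publisher is free to hand it out again.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical message type, for illustration only.
struct StringMsg {
  std::string data;
};

// Hypothetical publisher that owns a fixed pool of N messages.
// This is NOT the rclcpp API; names and behavior are assumptions.
template <typename MsgT, std::size_t N>
class PoolPublisher {
public:
  // Hand out a free slot, or nullptr when every slot is on loan.
  MsgT * get_new_message() {
    for (std::size_t i = 0; i < N; ++i) {
      if (!in_use_[i]) {
        in_use_[i] = true;
        return &slots_[i];
      }
    }
    return nullptr;
  }

  // "Publish" by recording the pointer; the payload is never copied.
  void publish(MsgT * msg) {
    assert(owns(msg));
    delivered_.push_back(msg);
  }

  // Called once all readers are done, so the slot can be reused.
  void release(MsgT * msg) {
    assert(owns(msg));
    in_use_[static_cast<std::size_t>(msg - slots_.data())] = false;
  }

  const std::vector<MsgT *> & delivered() const { return delivered_; }

private:
  // True if msg points into this publisher's pool.
  bool owns(const MsgT * msg) const {
    return msg >= slots_.data() && msg < slots_.data() + N;
  }

  std::array<MsgT, N> slots_{};
  std::array<bool, N> in_use_{};  // value-initialized to false
  std::vector<MsgT *> delivered_;
};
```

With `PoolPublisher<StringMsg, 2>`, the third call to `get_new_message()` returns `nullptr` until one of the two outstanding messages has been released.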
from design.
@ivanpauno I don't think publisher->get_new_message() should ever return nullptr. I'd prefer a more asynchronous way to fetch a message, or potentially blocking on that call instead. I'm not very fond of the blocking call idea, but maybe an asynchronous trigger could be set up?
Maybe it could be set up so that we can std::invoke a callback in the publish() function? This isn't great though, since this would need to be done in rcl, which means it would be wasting cycles checking whether there are std::bind-ed callbacks on non-shared-memory platforms.
I'm not seeing a way to make this happen in anything above rmw, except of course when there are multiple nodes inside the same process.
Sorry for the rambles, very interested in this idea.
Just sharing my thoughts:
> The implementation could live in rcl or rmw, I'm not sure what would be better.
I believe it is better implemented in rmw, not rcl:
- It sounds like rmw's responsibility to take care of transport.
- It provides a consistent, compatible API to the frontend, with the details concealed by rmw.
- It lets us take advantage of, and compare, each rmw implementation.
Collecting some relevant parts of the previous discussion here for easier review, and to feed the design:
Re: location of implementation @gbiggs wrote
This is a tangential comment, but I wonder if we could achieve the same zero-copies-when-same-process result by reducing the number of copies required for going into and out of the rmw layer to zero and using a DDS implementation that also supports zero copies (ignoring that there may not be any and that the standard API may not support this, both of which are solvable issues). One of the reasons for using DDS is to push all the communication issues down into an expert-vendor-supplied library, after all.
Re: location of implementation @raghaprasad wrote
How about moving the intra_process_management into an rmw ?
This rmw could handle only intra_process communication and delegate inter-process communication to any of the chosen DDS rmw implementations.
Support for zero copies is an important objective, but it's not the only one. It has been observed that creating DDS participants is pretty resource heavy in terms of net memory required (at least for FastRTPS & OpenSplice) and the discovery process is CPU intensive (due to multicast).
This new rmw could drastically simplify the discovery process and most certainly reduce the memory footprint by needing only one participant per process to support inter_process communication.
Re: smart-ptr messages @gbiggs wrote
But it is possible to do the rmw and rcl APIs and implementations such that they manage their raw pointers properly and provide a smart_ptr interface-compatible object in rclcpp. I'm not saying it would be easy, but this is how the STL is designed to be used and it would be the most powerful solution.
Re: implementation @ivanpauno wrote
I would like to see something mimicking Connext's Zero Copy Transfer Over Shared Memory semantics (by default Connext uses shared memory, but it doesn't use zero copy transfer, which has specific semantics). Basically, instead of creating a unique pointer and then publishing it:

    auto msg = std::make_unique<MSG_TYPE>();
    /* Fill the message here */
    publisher->publish(std::move(msg));

you ask the publisher for a piece of memory, fill it, and then publish:

    auto msg = publisher->new_message();
    /* Fill the message here */
    // I'm using move semantics because the message will be undefined after calling publish.
    // But how we wrap the msg for this is an implementation detail.
    publisher->publish(std::move(msg));

For DDS vendors that have implemented zero copy transport, this could just wrap it.
For others, we could have a default implementation that's used in those cases. That implementation would not use shared memory, which allows INTERprocess zero copy transport, but just a preallocated buffer in each publisher, which allows INTRAprocess zero copy transport. This implementation is a good start for later doing something like this (if we want to do it).
I also think this idea will look idiomatic in other languages (for example, in Python), and performance should be quite similar.
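A rough sketch of what such a default intra-process implementation might look like, assuming the loan is modeled as a `std::unique_ptr` with a custom deleter that returns the slot to the publisher's preallocated buffer. `LoaningPublisher`, `StringMsg`, `subscribe()` and `available()` are hypothetical names, not an existing ROS 2 API.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical message type, for illustration only.
struct StringMsg {
  std::string data;
};

// Hypothetical publisher with a preallocated buffer (not the rclcpp API).
template <typename MsgT>
class LoaningPublisher {
public:
  // Deleter that returns the slot to the pool instead of freeing memory.
  struct Returner {
    LoaningPublisher * owner;
    void operator()(MsgT * msg) const {
      if (owner != nullptr) { owner->free_.push_back(msg); }
    }
  };
  using Loan = std::unique_ptr<MsgT, Returner>;
  using Callback = std::function<void(const MsgT &)>;

  explicit LoaningPublisher(std::size_t capacity) : storage_(capacity) {
    for (auto & m : storage_) { free_.push_back(&m); }
  }
  // Loans point back at this object, so it must not be copied.
  LoaningPublisher(const LoaningPublisher &) = delete;
  LoaningPublisher & operator=(const LoaningPublisher &) = delete;

  // Lend a slot from the buffer; an empty loan means the pool is exhausted.
  Loan new_message() {
    if (free_.empty()) { return Loan(nullptr, Returner{this}); }
    MsgT * msg = free_.back();
    free_.pop_back();
    return Loan(msg, Returner{this});
  }

  void subscribe(Callback cb) { callbacks_.push_back(std::move(cb)); }

  // Takes ownership of the loan. Callbacks see the same object (no copy),
  // and the slot goes back to the pool when the loan is destroyed.
  void publish(Loan msg) {
    assert(msg != nullptr);
    for (auto & cb : callbacks_) { cb(*msg); }
  }

  std::size_t available() const { return free_.size(); }

private:
  std::vector<MsgT> storage_;  // never resized, so slot addresses stay stable
  std::vector<MsgT *> free_;
  std::vector<Callback> callbacks_;
};
```

Because callbacks only receive a `const MsgT &`, a subscriber cannot keep the message alive past `publish()`, which is the lifetime restriction mentioned earlier in the thread. A reused slot still holds its previous contents, so in this sketch the caller has to overwrite every field.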
A question: do we want to have intra-process communication always optimized in ROS2, regardless of choice of RMW?
If yes, and we want it always available, what about this idea?
- write an independent, full implementation of the RMW API: rmw_intraprocess
- instantiate both:
  - rmw_intraprocess, for use by nodes within the same process
  - the cross-process rmw implementation chosen via the environment
- have the rcl or rmw layer route API calls to the appropriate one of the two co-existing RMWs, based on whether the communication is within the process
Or, this is a possible outcome, should we just expect that intraprocess communications should be the job of the choice of RMW implementation, and just push development to add this to our RMW impl of choice, e.g. FastRTPS or CycloneDDS or wherever?
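To make the routing idea concrete, here is a small sketch under heavy assumptions: the real RMW API is a much larger C interface, and `Transport`, `RecordingTransport` and `RoutingLayer` are invented names. Note that one topic can have both local and remote subscribers, so a single publish may have to go through both implementations.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical, heavily reduced stand-in for an rmw implementation.
struct Transport {
  virtual ~Transport() = default;
  virtual void publish(const std::string & topic, const std::string & payload) = 0;
};

// Test double that just records what it was asked to send.
struct RecordingTransport : Transport {
  std::vector<std::string> sent;
  void publish(const std::string & topic, const std::string & payload) override {
    sent.push_back(topic + ":" + payload);
  }
};

// Hypothetical layer that picks the transport(s) per publish call.
class RoutingLayer {
public:
  RoutingLayer(Transport & intra, Transport & inter)
  : intra_(intra), inter_(inter) {}

  void add_local_subscription(const std::string & topic) { local_.insert(topic); }
  void add_remote_subscription(const std::string & topic) { remote_.insert(topic); }

  void publish(const std::string & topic, const std::string & payload) {
    assert(!topic.empty());
    // Same-process subscribers are served by the intra-process transport.
    if (local_.count(topic) != 0) { intra_.publish(topic, payload); }
    // Subscribers in other processes still need the regular transport.
    if (remote_.count(topic) != 0) { inter_.publish(topic, payload); }
  }

private:
  Transport & intra_;
  Transport & inter_;
  std::set<std::string> local_;
  std::set<std::string> remote_;
};
```

In this sketch the subscription sets are filled by hand; in a real system that information would have to come from discovery.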
> How about moving the intra_process_management into an rmw ?
> This rmw could handle only intra_process communication and delegate inter-process communication to any of the chosen DDS rmw implementations.
> Support for zero copies is an important objective, but it's not the only one. It has been observed that creating DDS participants is pretty resource heavy in terms of net memory required (at least for FastRTPS & OpenSplice) and the discovery process is CPU intensive (due to multicast).
> This new rmw could drastically simplify the discovery process and most certainly reduce the memory footprint by needing only one participant per process to support inter_process communication.
The overhead described here is addressed by the proposal in #250 and isn't related to intra process communication. Even with intra process communication every node / participant has to perform discovery and comes with that overhead.
> @ivanpauno I don't think publisher->get_new_message() should ever return nullptr. I'd prefer a more asynchronous way to fetch a message, or potentially blocking on that call instead. I'm not very fond of the blocking call idea, but maybe an asynchronous trigger could be set up?
I guess it's possible to never return nullptr (probably with locking behavior). I only added the check because I'm not sure yet what the implementation would look like.
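As an illustration of the "never return nullptr" option, here is a sketch of a pool whose `get_new_message()` blocks on a condition variable until a slot is free, with a bounded-wait variant for callers that cannot block indefinitely. `BlockingPool`, `try_get_new_message_for()` and `release()` are hypothetical names, not an existing ROS 2 API.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical message pool with blocking acquisition (illustration only).
template <typename MsgT>
class BlockingPool {
public:
  explicit BlockingPool(std::size_t capacity) : storage_(capacity) {
    for (auto & m : storage_) { free_.push_back(&m); }
  }
  BlockingPool(const BlockingPool &) = delete;
  BlockingPool & operator=(const BlockingPool &) = delete;

  // Blocks until a slot is free, so it never returns nullptr.
  MsgT * get_new_message() {
    std::unique_lock<std::mutex> lock(mutex_);
    not_empty_.wait(lock, [this] { return !free_.empty(); });
    return take_locked();
  }

  // Bounded wait: returns nullptr if no slot became free in time.
  template <typename Rep, typename Period>
  MsgT * try_get_new_message_for(std::chrono::duration<Rep, Period> timeout) {
    std::unique_lock<std::mutex> lock(mutex_);
    if (!not_empty_.wait_for(lock, timeout, [this] { return !free_.empty(); })) {
      return nullptr;
    }
    return take_locked();
  }

  // Return a slot to the pool and wake one waiting caller.
  void release(MsgT * msg) {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      free_.push_back(msg);
    }
    not_empty_.notify_one();
  }

private:
  // Precondition: mutex_ is held and free_ is not empty.
  MsgT * take_locked() {
    MsgT * msg = free_.back();
    free_.pop_back();
    return msg;
  }

  std::mutex mutex_;
  std::condition_variable not_empty_;
  std::vector<MsgT> storage_;  // never resized, so slot addresses stay stable
  std::vector<MsgT *> free_;
};
```

The trade-off raised earlier in the thread still applies: the blocking call stalls the publishing thread whenever subscribers are slow to release messages.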
> i believe that it is better to be implemented in rmw, not rcl.
> - it sounds rmw responsibility to take care of transportation. (rmw)
> - provide consistent/compatible API to frontend, concealed by rmw.
> - taking advantage/comparison of each rmw implementation.
I agree, especially with the first and last points.
Each time I think about the intraprocess communication problem, I'm more convinced that it's a problem that should be addressed by the underlying middleware (FastRTPS, Connext, OpenSplice, etc.), and that we should only wrap their zero copy transfer API. Of course, that's probably out of our scope and we have to provide a solution on top of the middleware. But that has the cost of re-implementing a lot of things (supporting a lot of different QoS features, etc.).
> Or, this is a possible outcome, should we just expect that intraprocess communications should be the job of the choice of RMW implementation, and just push development to add this to our RMW impl of choice, e.g. FastRTPS or CycloneDDS or wherever?
👍
I initially posted this as a topic on answers.ros.org (see https://answers.ros.org/question/333180/ros2-micro-ros-intra-process/) but was advised by the moderator to move it to Discourse... I think the core of my concern touches your discussion.
(My context: ROS2 inside a machine controller)
Looking at your proposals for intra-process communication, I fail to see whether you also take into account the multi-priority requirements such (often embedded) environments typically have.
I currently see fragmented solution elements or approaches:
- From Micro-ROS: Multiple executors could be hosted in the same process/node, each having their own queue for messages (or in fact their handlers) of the corresponding priority (based on their handlers' callback group priority).
- From ROS2: ROS2 does not create its own queuing mechanism, but instead relies on the queues already available in the DDS middleware.
- From ROS2 (close to this topic): use_intra_process_comms() … if true, messages will go through a special intra-process communication code path. So potentially excluding DDS. Then how will they get queued / priority managed?
- (RTI) DDS has a Transport_Priority_QoS defined per DataWriter, which is then to be kept in sync with the callback group priority?
Is there any documented vision on how your intra-process-communication would co-exist with multi-priority queuing/handling?
Johan
> I initially posted this as a topic on answers.ros.org (see https://answers.ros.org/question/333180/ros2-micro-ros-intra-process/) but was advised by the moderator to move it to Discourse...
I did, but this is not the embedded category on ROS Discourse.
Any updates on this roughly a year later?
> Any updates on this roughly a year later?
Not that I know of.
The problem isn't trivial, and AFAIK nobody is assigned to work on it.
Hi @ivanpauno, is there any work on this problem? If not, do you need help? Would love to dive into it.
Cheers
AFAIK, nobody is working on this right now.
I'm not sure if there's a plan to work on the topic soon.
I'm not sure, but does the Cyclone+iceoryx combo do this automatically for C++ nodes in the same process?
> I'm not sure, but does the Cyclone+iceoryx combo do this automatically for C++ nodes in the same process?
Not zero copy; zero copy requires a different API.