rmw_fastrtps's Introduction

ROS 2 Middleware Implementation for eProsima's Fast DDS

rmw_fastrtps is a ROS 2 middleware implementation, providing an interface between ROS 2 and eProsima's Fast DDS middleware.

Getting started

This implementation is available in all ROS 2 distributions, both from binaries and from sources. You can specify Fast DDS as your ROS 2 middleware layer in two different ways:

  1. Exporting the RMW_IMPLEMENTATION environment variable:
    export RMW_IMPLEMENTATION=rmw_fastrtps_cpp
  2. When launching your ROS 2 application:
    RMW_IMPLEMENTATION=rmw_fastrtps_cpp ros2 run <your_package> <your application>

Two different RMW implementations

rmw_fastrtps actually provides not one but two different ROS 2 middleware implementations, both of them using Fast DDS as middleware layer: rmw_fastrtps_cpp and rmw_fastrtps_dynamic_cpp (note that directory rmw_fastrtps_shared_cpp just contains the code that the two implementations share, and does not constitute a layer on its own).

The main difference between the two is that rmw_fastrtps_dynamic_cpp uses introspection typesupport at run time to decide on the serialization/deserialization mechanism. On the other hand, rmw_fastrtps_cpp uses its own typesupport, which generates the mapping for each message type at build time.

Mind that the default ROS 2 RMW implementation is rmw_fastrtps_cpp. You can however set it to rmw_fastrtps_dynamic_cpp using the environment variable RMW_IMPLEMENTATION as described above.

Advanced usage

ROS 2 only allows for the configuration of certain middleware features. For example, see ROS 2 QoS policies. In addition to ROS 2 QoS policies, rmw_fastrtps sets the following Fast DDS configurable parameters:

  • History memory policy: PREALLOCATED_WITH_REALLOC_MEMORY_MODE
  • Publication mode: SYNCHRONOUS_PUBLISH_MODE
  • Data Sharing: OFF

However, rmw_fastrtps offers the possibility to further configure Fast DDS:

Change publication mode

Fast DDS features two different publication modes: synchronous and asynchronous. To learn more about the implications of choosing one mode over the other, please refer to DDS: Asynchronous vs Synchronous Publishing.

rmw_fastrtps offers an easy way to change the Fast DDS publication mode without defining an XML file: the environment variable RMW_FASTRTPS_PUBLICATION_MODE. The admissible values are:

  • ASYNCHRONOUS: asynchronous publication mode. Setting this mode implies that when the publisher invokes the write operation, the data is copied into a queue, a notification about the addition to the queue is performed, and control of the thread is returned to the user before the data is actually sent. A background thread (asynchronous thread) is in turn in charge of consuming the queue and sending the data to every matched reader.
  • SYNCHRONOUS: synchronous publication mode. Setting this mode implies that the data is sent directly within the context of the user thread. This entails that any blocking call occurring during the write operation would block the user thread, thus preventing the application from continuing its operation. It is important to note that this mode typically yields higher throughput rates at lower latencies, since the notification and context switching between threads is not present.
  • AUTO: let Fast DDS select the publication mode. This implies using the publication mode set in the XML file or, failing that, the default value set in Fast DDS (which currently is set to SYNCHRONOUS).

If RMW_FASTRTPS_PUBLICATION_MODE is not set, then both rmw_fastrtps_cpp and rmw_fastrtps_dynamic_cpp behave as if it were set to SYNCHRONOUS.
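The selection rules above can be modeled in a few lines of Python. This is only a sketch of the behavior described in this section, not the rmw_fastrtps source: the function name `effective_publication_mode` and the `xml_mode` parameter are hypothetical, raising on an inadmissible value is an assumption, and the interaction with RMW_FASTRTPS_USE_QOS_FROM_XML is deliberately left out.

```python
from typing import Mapping, Optional

# Default publication mode of Fast DDS, as stated in the AUTO entry above.
FAST_DDS_DEFAULT_MODE = "SYNCHRONOUS"


def effective_publication_mode(
    env: Mapping[str, str], xml_mode: Optional[str] = None
) -> str:
    """Model which publication mode ends up being used.

    env      -- environment variables, e.g. os.environ
    xml_mode -- hypothetical: the mode found in the Fast DDS XML file, if any
    """
    # An unset variable behaves as if it were set to SYNCHRONOUS.
    value = env.get("RMW_FASTRTPS_PUBLICATION_MODE", "SYNCHRONOUS")
    if value in ("SYNCHRONOUS", "ASYNCHRONOUS"):
        return value
    if value == "AUTO":
        # AUTO defers to the XML file, then to the Fast DDS default.
        return xml_mode or FAST_DDS_DEFAULT_MODE
    # Assumption: this sketch rejects anything else outright.
    raise ValueError(f"inadmissible value: {value!r}")
```

For example, `effective_publication_mode({"RMW_FASTRTPS_PUBLICATION_MODE": "AUTO"}, xml_mode="ASYNCHRONOUS")` yields the mode from the XML file, while an empty environment yields SYNCHRONOUS.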

Full QoS configuration

Fast DDS QoS policies can be fully configured through a combination of the rmw QoS profile API, and the Fast DDS XML file's QoS elements. Configuration depends on the environment variable RMW_FASTRTPS_USE_QOS_FROM_XML.

  1. ROS 2 QoS contained in rmw_qos_profile_t are always honored, unless set to *_SYSTEM_DEFAULT. In that case, XML values, or Fast DDS default values in the absence of XML ones, are applied. Setting any QoS in rmw_qos_profile_t to something other than *_SYSTEM_DEFAULT entails that specifying it via XML files has no effect, since they do not override what was used to create the publisher, subscription, service, or client.
  2. In order to modify the history memory policy or publication mode using XML files, environment variable RMW_FASTRTPS_USE_QOS_FROM_XML must be set to 1 (it is set to 0 by default). This tells rmw_fastrtps that it should override both the history memory policy and the publication mode using the XML. Bear in mind that setting this environment variable without setting either of these policies in the XML results in Fast DDS' default configurations being used.

RMW_FASTRTPS_USE_QOS_FROM_XML | rmw QoS profile | Fast DDS XML QoS | Fast DDS XML history memory policy and publication mode
0 (default) | Use default values | Ignored - overridden by rmw_qos_profile_t | Ignored - overridden by rmw_fastrtps
0 (default) | Set to non system default | Ignored - overridden by rmw_qos_profile_t | Ignored - overridden by rmw_fastrtps
0 (default) | Set to system default | Used | Ignored - overridden by rmw_fastrtps
1 | Use default values | Ignored - overridden by rmw_qos_profile_t | Used
1 | Set to non system default | Ignored - overridden by rmw_qos_profile_t | Used
1 | Set to system default | Used | Used
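
The table above boils down to two independent decisions, which the following Python sketch makes explicit. It is an illustration of the table only; the function name `qos_sources` and the returned labels are hypothetical and do not correspond to any rmw_fastrtps API.

```python
def qos_sources(use_qos_from_xml: bool, rmw_qos_is_system_default: bool) -> dict:
    """Return where each group of settings comes from, per the table above.

    use_qos_from_xml          -- True when RMW_FASTRTPS_USE_QOS_FROM_XML=1
    rmw_qos_is_system_default -- True when the rmw QoS policy is *_SYSTEM_DEFAULT
    """
    return {
        # XML QoS elements only apply to policies left at *_SYSTEM_DEFAULT.
        "qos": "xml" if rmw_qos_is_system_default else "rmw_qos_profile_t",
        # History memory policy and publication mode come from the XML
        # only when RMW_FASTRTPS_USE_QOS_FROM_XML is set to 1.
        "history_memory_policy_and_publication_mode":
            "xml" if use_qos_from_xml else "rmw_fastrtps",
    }
```

Read this way, the environment variable never affects whether the XML QoS elements are used; that depends only on whether the rmw QoS profile was left at system default.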

Note: Setting RMW_FASTRTPS_USE_QOS_FROM_XML to 1 effectively overrides whatever configuration was set with RMW_FASTRTPS_PUBLICATION_MODE. Furthermore, if RMW_FASTRTPS_USE_QOS_FROM_XML is set to 1, and history memory policy or publication mode are not specified in the XML, then the Fast DDS default configurations will be used.

There are two ways of telling a ROS 2 application which XML to use:

  1. Placing your XML file in the running directory under the name DEFAULT_FASTRTPS_PROFILES.xml.
  2. Setting environment variable FASTRTPS_DEFAULT_PROFILES_FILE to contain the path to your XML file (relative to the working directory).

To verify the actual QoS settings using rmw:

// Create a publisher within a node with specific topic, type support, options, and QoS
rmw_publisher_t* rmw_publisher = rmw_create_publisher(node, type_support, topic_name, qos_profile, publisher_options);
// Check the actual QoS set on the publisher
rmw_qos_profile_t qos;
rmw_publisher_get_actual_qos(rmw_publisher, &qos);

Applying different profiles to different entities

rmw_fastrtps allows for the configuration of different entities with different QoS using the same XML file. To do so, rmw_fastrtps locates profiles in the XML based on topic names, abiding by the following rules:

Creating publishers/subscriptions with different profiles

To configure a publisher/subscription, define a <publisher>/<subscriber> profile with attribute profile_name=topic_name, where topic name is the name of the topic prepended by the node namespace (which defaults to "" if not specified), i.e. the node's namespace followed by topic name used to create the publisher/subscription. Mind that topic names always start with / (it is added when creating the topic if not present), and that namespace and topic name are always separated by one /. If such profile is not defined, rmw_fastrtps attempts to load the <publisher>/<subscriber> profile with attribute is_default_profile="true". The following table presents different combinations of node namespaces and user specified topic names, as well as the resulting topic names and the suitable profile names:

User specified topic name | Node namespace | Final topic name | Profile name
chatter | DEFAULT ("") | /chatter | /chatter
chatter | test_namespace | /test_namespace/chatter | /test_namespace/chatter
chatter | /test_namespace | /test_namespace/chatter | /test_namespace/chatter
/chatter | test_namespace | /chatter | /chatter
/chatter | /test_namespace | /chatter | /chatter

IMPORTANT: As shown in the table, node namespaces are NOT prepended to user specified topic names starting with /, a.k.a Fully Qualified Names (FQN). For a complete description of topic name remapping please refer to Remapping Names.
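
The naming rules in the table can be written down as a short Python function. This is a sketch that reproduces the table rows only; the function name `final_topic_name` is hypothetical, and it does not cover remapping or any other name expansion that ROS 2 performs.

```python
def final_topic_name(topic: str, namespace: str = "") -> str:
    """Return the final topic name, which is also the XML profile name."""
    if topic.startswith("/"):
        # Fully Qualified Names ignore the node namespace.
        return topic
    # Normalize the namespace: drop surrounding slashes, then add a single
    # leading "/" unless the namespace is empty.
    ns = namespace.strip("/")
    prefix = f"/{ns}" if ns else ""
    # Namespace and topic name are always separated by exactly one "/".
    return f"{prefix}/{topic}"
```

Calling `final_topic_name("chatter", "test_namespace")` and `final_topic_name("chatter", "/test_namespace")` both give `/test_namespace/chatter`, matching the second and third rows.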

Creating services with different profiles

ROS 2 services contain a subscription for receiving requests, and a publisher to reply to them. rmw_fastrtps allows for configuring each of these endpoints separately in the following manner:

  1. To configure the request subscription, define a <subscriber> profile with attribute profile_name=topic_name, where topic name is the name of the service after mangling. For more information on name mangling, please refer to Topic and Service name mapping to DDS. If such profile is not defined, rmw_fastrtps attempts to load a <subscriber> profile with attribute profile_name="service". If neither of the previous profiles exists, rmw_fastrtps attempts to load the <subscriber> profile with attribute is_default_profile="true".
  2. To configure the reply publisher, define a <publisher> profile with attribute profile_name=topic_name, where topic name is the name of the service after mangling. If such profile is not defined, rmw_fastrtps attempts to load a <publisher> profile with attribute profile_name="service". If neither of the previous profiles exists, rmw_fastrtps attempts to load the <publisher> profile with attribute is_default_profile="true".

Creating clients with different profiles

ROS 2 clients contain a publisher to send requests, and a subscription to receive the service's replies. rmw_fastrtps allows for configuring each of these endpoints separately in the following manner:

  1. To configure the request publisher, define a <publisher> profile with attribute profile_name=topic_name, where topic name is the name of the service after mangling. If such profile is not defined, rmw_fastrtps attempts to load a <publisher> profile with attribute profile_name="client". If neither of the previous profiles exists, rmw_fastrtps attempts to load the <publisher> profile with attribute is_default_profile="true".
  2. To configure the reply subscription, define a <subscriber> profile with attribute profile_name=topic_name, where topic name is the name of the service after mangling. If such profile is not defined, rmw_fastrtps attempts to load a <subscriber> profile with attribute profile_name="client". If neither of the previous profiles exists, rmw_fastrtps attempts to load the <subscriber> profile with attribute is_default_profile="true".
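
The three-step lookup described for services and clients (exact mangled name, then "service" or "client", then the default profile) can be sketched as follows. This is a simplified model, not the rmw_fastrtps implementation: the function name `find_profile` is hypothetical, and profiles are represented as a plain dictionary keyed by profile_name, with one dictionary per element kind (<publisher> or <subscriber>).

```python
from typing import Dict, Optional


def find_profile(
    profiles: Dict[str, dict], topic_name: str, fallback: Optional[str] = None
) -> Optional[dict]:
    """Pick a profile: exact name first, then the fallback name, then the default.

    profiles   -- profiles of one kind, keyed by profile_name
    topic_name -- topic name, or service name after mangling
    fallback   -- "service" or "client"; None for plain publishers/subscriptions
    """
    if topic_name in profiles:
        return profiles[topic_name]
    if fallback is not None and fallback in profiles:
        return profiles[fallback]
    # Last resort: the profile marked is_default_profile="true", if any.
    for profile in profiles.values():
        if profile.get("is_default_profile"):
            return profile
    return None


# Hypothetical <subscriber> profiles, named like those in the example below.
subscriber_profiles = {
    "rq/add_two_intsRequest": {"data_sharing": "AUTOMATIC"},
    "service": {"data_sharing": "AUTOMATIC"},
    "default subscriber profile": {"is_default_profile": True},
}
```

With these profiles, the request subscription of a service other than add_two_ints would fall back to the "service" profile, while a reply subscription of a client would end up with the default profile because no "client" subscriber profile is defined.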

Example

The following example configures Fast DDS to publish synchronously, to have a pre-allocated history that can be expanded whenever it gets filled, and to use Data Sharing if possible.

  1. Create a Fast DDS XML file with:

    <?xml version="1.0" encoding="UTF-8"?>
    <dds xmlns="http://www.eprosima.com/XMLSchemas/fastRTPS_Profiles">
        <profiles>
            <!-- Default publisher profile -->
            <publisher profile_name="default publisher profile" is_default_profile="true">
                <qos>
                    <publishMode>
                        <kind>SYNCHRONOUS</kind>
                    </publishMode>
                    <data_sharing>
                        <kind>AUTOMATIC</kind>
                    </data_sharing>
                </qos>
                <historyMemoryPolicy>PREALLOCATED_WITH_REALLOC</historyMemoryPolicy>
            </publisher>
    
            <!-- Default subscriber profile -->
            <subscriber profile_name="default subscriber profile" is_default_profile="true">
                <qos>
                    <data_sharing>
                        <kind>AUTOMATIC</kind>
                    </data_sharing>
                </qos>
                <historyMemoryPolicy>PREALLOCATED_WITH_REALLOC</historyMemoryPolicy>
            </subscriber>
    
            <!-- Publisher profile for topic helloworld -->
            <publisher profile_name="helloworld">
                <qos>
                    <publishMode>
                        <kind>SYNCHRONOUS</kind>
                    </publishMode>
                    <data_sharing>
                        <kind>AUTOMATIC</kind>
                    </data_sharing>
                </qos>
            </publisher>
    
            <!-- Request subscriber profile for services -->
            <subscriber profile_name="service">
                <qos>
                    <data_sharing>
                        <kind>AUTOMATIC</kind>
                    </data_sharing>
                </qos>
                <historyMemoryPolicy>PREALLOCATED_WITH_REALLOC</historyMemoryPolicy>
            </subscriber>
    
            <!-- Request publisher profile for clients -->
            <publisher profile_name="client">
                <qos>
                    <publishMode>
                        <kind>ASYNCHRONOUS</kind>
                    </publishMode>
                    <data_sharing>
                        <kind>AUTOMATIC</kind>
                    </data_sharing>
                </qos>
            </publisher>
    
            <!-- Request subscriber profile for server of service "add_two_ints" -->
            <subscriber profile_name="rq/add_two_intsRequest">
                <qos>
                    <data_sharing>
                        <kind>AUTOMATIC</kind>
                    </data_sharing>
                </qos>
                <historyMemoryPolicy>PREALLOCATED_WITH_REALLOC</historyMemoryPolicy>
            </subscriber>
    
            <!-- Reply subscriber profile for client of service "add_two_ints" -->
            <subscriber profile_name="rr/add_two_intsReply">
                <qos>
                    <data_sharing>
                        <kind>AUTOMATIC</kind>
                    </data_sharing>
                </qos>
                <historyMemoryPolicy>PREALLOCATED_WITH_REALLOC</historyMemoryPolicy>
            </subscriber>
        </profiles>
    </dds>
  2. Run the talker/listener ROS 2 demo:

    1. In one terminal

      FASTRTPS_DEFAULT_PROFILES_FILE=<path_to_xml_file> RMW_FASTRTPS_USE_QOS_FROM_XML=1 RMW_IMPLEMENTATION=rmw_fastrtps_cpp ros2 run demo_nodes_cpp talker
    2. In another terminal

      FASTRTPS_DEFAULT_PROFILES_FILE=<path_to_xml_file> RMW_FASTRTPS_USE_QOS_FROM_XML=1 RMW_IMPLEMENTATION=rmw_fastrtps_cpp ros2 run demo_nodes_cpp listener
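
Before running the demo, it can help to check which profile names the XML file actually declares, since profiles are matched by name. The following Python sketch lists them using only the standard library. The function name `summarize_profiles` and the shortened `SAMPLE_XML` are hypothetical; the sketch reads the file structure shown in step 1 and makes no claim about how Fast DDS itself parses it.

```python
import xml.etree.ElementTree as ET

# Namespace declared on the root <dds> element of the profiles file.
NS = {"dds": "http://www.eprosima.com/XMLSchemas/fastRTPS_Profiles"}

# Shortened stand-in for the file from step 1 (three profiles only).
SAMPLE_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<dds xmlns="http://www.eprosima.com/XMLSchemas/fastRTPS_Profiles">
    <profiles>
        <publisher profile_name="default publisher profile" is_default_profile="true">
            <qos><publishMode><kind>SYNCHRONOUS</kind></publishMode></qos>
        </publisher>
        <publisher profile_name="client">
            <qos><publishMode><kind>ASYNCHRONOUS</kind></publishMode></qos>
        </publisher>
        <subscriber profile_name="service">
            <historyMemoryPolicy>PREALLOCATED_WITH_REALLOC</historyMemoryPolicy>
        </subscriber>
    </profiles>
</dds>
"""


def summarize_profiles(xml_bytes: bytes) -> list:
    """Return (kind, profile_name, is_default, publish_mode) for each profile."""
    root = ET.fromstring(xml_bytes)
    summary = []
    for kind in ("publisher", "subscriber"):
        for elem in root.findall(f"dds:profiles/dds:{kind}", NS):
            # publish_mode is None when the profile does not set <publishMode>.
            mode = elem.findtext("dds:qos/dds:publishMode/dds:kind", namespaces=NS)
            summary.append((kind, elem.get("profile_name"),
                            elem.get("is_default_profile") == "true", mode))
    return summary
```

To inspect a real file, read it in binary mode and pass the bytes, for instance `summarize_profiles(open(path, "rb").read())`, where `path` is whatever FASTRTPS_DEFAULT_PROFILES_FILE points to.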

Change participant discovery options

ROS 2 allows controlling participant discovery with two environment variables: ROS_AUTOMATIC_DISCOVERY_RANGE and ROS_STATIC_PEERS. Full configuration of participant discovery can also be set with XML files; however, the ROS specific environment variables should be disabled to prevent them from interfering. Set ROS_AUTOMATIC_DISCOVERY_RANGE to the value SYSTEM_DEFAULT to disable both ROS specific environment variables. For more details, see Improved Dynamic Discovery.

Enable Zero Copy Data Sharing

ROS 2 provides Loaned Messages, which allow the user application to loan message memory from the RMW implementation, eliminating the data copy between the ROS 2 application and the RMW implementation. Furthermore, rmw_fastrtps, through Fast DDS, provides both a Shared Memory Transport and a Data Sharing delivery mechanism to speed up intra-host communication. By combining these two features (message loaning and Data Sharing), it is possible to achieve a zero-copy message delivery pipeline, thus bringing significant performance improvements to ROS 2 applications.

By default, both rmw_fastrtps_cpp and rmw_fastrtps_dynamic_cpp use Shared Memory Transport for intra-host communication, along with network based transports (UDPv4) for inter-host message delivery.

In order to achieve Zero Copy message delivery, applications need to both enable the Fast DDS Data Sharing mechanism and use the Loaned Messages API:

  1. To enable Loaned Messages in Iron Irwini or later, the only requirement is for the data type to be Plain Old Data. For Humble Hawksbill, in addition to POD types, enabling Fast DDS Data Sharing is also required.

  2. To enable the Fast DDS Data Sharing delivery mechanism, the following XML profiles need to be loaded, and environment variable RMW_FASTRTPS_USE_QOS_FROM_XML needs to be set to 1 (see Full QoS configuration):

    <?xml version="1.0" encoding="UTF-8" ?>
    <profiles xmlns="http://www.eprosima.com/XMLSchemas/fastRTPS_Profiles">
    
    <!-- Default publisher profile -->
    <publisher profile_name="default publisher profile" is_default_profile="true">
        <qos>
        <data_sharing>
            <kind>AUTOMATIC</kind>
        </data_sharing>
        </qos>
    </publisher>
    
    <!-- Default subscription profile -->
    <subscriber profile_name="default subscription profile" is_default_profile="true">
        <qos>
        <data_sharing>
            <kind>AUTOMATIC</kind>
        </data_sharing>
        </qos>
    </subscriber>
    </profiles>

Large data transfer over lossy network

Out of the box, Fast DDS uses UDPv4 for data communication over the network. Although UDP has its merits for real-time communication, and many applications rely on it, some application requirements call for a more reliable network transmission. Such cases include, but are not limited to, sending large data samples over lossy networks, where TCP's built-in reliability and flow control tend to perform better.

For this reason, Fast DDS provides the possibility to modify its built-in transports via the environment variable FASTDDS_BUILTIN_TRANSPORTS, allowing for easily changing the transport layer to TCP when needed:

export FASTDDS_BUILTIN_TRANSPORTS=LARGE_DATA

This LARGE_DATA mode adds a TCP transport for data communication, restricting the use of the UDP transport to the first part of the discovery process, thus achieving a reliable transmission with automatic discovery capabilities. This will improve the transmission of large data samples over lossy networks.

Note

The environment variable needs to be set on both the publisher and the subscription side.

For more information, please refer to FASTDDS_BUILTIN_TRANSPORTS.
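
When nodes are started from a script rather than a shell, the same setting can be passed through the child process environment. The Python sketch below shows one way to do that; the helper name `large_data_env` is hypothetical, and the `ros2 run` command in the comment is only an example of a process that would need the variable.

```python
import os
from typing import Dict, Mapping


def large_data_env(base: Mapping[str, str]) -> Dict[str, str]:
    """Return a copy of `base` with the LARGE_DATA transport mode selected."""
    env = dict(base)  # copy, so the caller's environment is left untouched
    env["FASTDDS_BUILTIN_TRANSPORTS"] = "LARGE_DATA"
    return env


# Example use (requires a ROS 2 installation, so it is left commented out):
# import subprocess
# subprocess.run(["ros2", "run", "demo_nodes_cpp", "talker"],
#                env=large_data_env(os.environ))
```

The same environment has to be used for the process on the other side of the communication, in line with the note above.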

Quality Declaration files

Quality Declarations for each package in this repository:

Quality Declarations for the external dependencies of these packages can be found in:

rmw_fastrtps's People

Contributors

ahcorde, barry-xu-2018, christophebedard, clalancette, codebot, dhood, dirk-thomas, eduponz, emersonknapp, esteve, fujitatomoya, gaoethan, hidmic, ivanpauno, j-rivero, jacobperron, jacquelinekay, jlbuenolopez, jwang11, karsten1987, lobotuerk, marcoag, miguelcompany, mikaelarguedas, mjcarroll, nuclearsandwich, richiprosima, richiware, sloretz, wjwwood

rmw_fastrtps's Issues

Non-ascii character in package.xml

The "offending" character is actually in the maintainer's name, so I'm not sure how to deal with this:

<maintainer email="[email protected]">Ricardo González</maintainer>

+++ Building 'rmw_fastrtps_cpp'
Running cmake because arguments have changed.
==> '. /home/racko/src/ros2-1.0.0/build/rmw_fastrtps_cpp/cmake__build.sh && /usr/sbin/cmake /home/racko/src/ros2-1.0.0/src/ros2/rmw_fastrtps/rmw_fastrtps_cpp -DBUILD_TESTING=0 -DCMAKE_INSTALL_PREFIX=/home/racko/src/ros2-1.0.0/install' in '/home/racko/src/ros2-1.0.0/build/rmw_fastrtps_cpp'
-- The C compiler identification is GNU 7.2.1
-- The CXX compiler identification is GNU 7.2.1
-- Check for working C compiler: /usr/sbin/cc
-- Check for working C compiler: /usr/sbin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/sbin/c++
-- Check for working CXX compiler: /usr/sbin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found ament_cmake_ros: 0.4.0 (/home/racko/src/ros2-1.0.0/install/share/ament_cmake_ros/cmake)
-- Found PythonInterp: /usr/sbin/python3 (found suitable version "3.6.3", minimum required is "3") 
-- Using PYTHON_EXECUTABLE: /usr/sbin/python3
-- Found rcutils: 0.4.0 (/home/racko/src/ros2-1.0.0/install/share/rcutils/cmake)
-- Found fastrtps_cmake_module: 0.4.0 (/home/racko/src/ros2-1.0.0/install/share/fastrtps_cmake_module/cmake)
-- Found FastRTPS: /home/racko/src/ros2-1.0.0/install/include  
-- Found rmw: 0.4.0 (/home/racko/src/ros2-1.0.0/install/share/rmw/cmake)
-- Found rosidl_typesupport_introspection_c: 0.4.0 (/home/racko/src/ros2-1.0.0/install/share/rosidl_typesupport_introspection_c/cmake)
-- Found rosidl_typesupport_introspection_cpp: 0.4.0 (/home/racko/src/ros2-1.0.0/install/share/rosidl_typesupport_introspection_cpp/cmake)
Error parsing '/home/racko/src/ros2-1.0.0/src/ros2/rmw_fastrtps/rmw_fastrtps_cpp/package.xml':
Traceback (most recent call last):
  File "/home/racko/src/ros2-1.0.0/install/share/ament_cmake_core/cmake/core/package_xml_2_cmake.py", line 145, in <module>
    main()
  File "/home/racko/src/ros2-1.0.0/install/share/ament_cmake_core/cmake/core/package_xml_2_cmake.py", line 56, in main
    raise e
  File "/home/racko/src/ros2-1.0.0/install/share/ament_cmake_core/cmake/core/package_xml_2_cmake.py", line 53, in main
    package = parse_package_string(args.package_xml.read())
  File "/usr/lib/python3.6/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 409: ordinal not in range(128)
CMake Error at /home/racko/src/ros2-1.0.0/install/share/ament_cmake_core/cmake/core/ament_package_xml.cmake:94 (message):
  execute_process(/usr/sbin/python3
  /home/racko/src/ros2-1.0.0/install/share/ament_cmake_core/cmake/core/package_xml_2_cmake.py
  /home/racko/src/ros2-1.0.0/src/ros2/rmw_fastrtps/rmw_fastrtps_cpp/package.xml
  /home/racko/src/ros2-1.0.0/build/rmw_fastrtps_cpp/ament_cmake_core/package.cmake)
  returned error code 1
Call Stack (most recent call first):
  /home/racko/src/ros2-1.0.0/install/share/ament_cmake_core/cmake/core/ament_package_xml.cmake:49 (_ament_package_xml)
  /home/racko/src/ros2-1.0.0/install/share/ament_cmake_core/cmake/core/ament_package.cmake:54 (ament_package_xml)
  CMakeLists.txt:116 (ament_package)


-- Configuring incomplete, errors occurred!
See also "/home/racko/src/ros2-1.0.0/build/rmw_fastrtps_cpp/CMakeFiles/CMakeOutput.log".
<== Command '. /home/racko/src/ros2-1.0.0/build/rmw_fastrtps_cpp/cmake__build.sh && /usr/sbin/cmake /home/racko/src/ros2-1.0.0/src/ros2/rmw_fastrtps/rmw_fastrtps_cpp -DBUILD_TESTING=0 -DCMAKE_INSTALL_PREFIX=/home/racko/src/ros2-1.0.0/install' failed in '/home/racko/src/ros2-1.0.0/build/rmw_fastrtps_cpp' with exit code '1'

<== Command '. /home/racko/src/ros2-1.0.0/build/rmw_fastrtps_cpp/cmake__build.sh && /usr/sbin/cmake /home/racko/src/ros2-1.0.0/src/ros2/rmw_fastrtps/rmw_fastrtps_cpp -DBUILD_TESTING=0 -DCMAKE_INSTALL_PREFIX=/home/racko/src/ros2-1.0.0/install' failed in '/home/racko/src/ros2-1.0.0/build/rmw_fastrtps_cpp' with exit code '1'
==> ERROR: A failure occurred in build().
    Aborting...
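
The traceback shows the file contents being decoded with the 'ascii' codec and failing on byte 0xc3, which is the first byte of the UTF-8 encoding of "á". The Python sketch below reproduces that failure in isolation and shows that decoding the same bytes as UTF-8 succeeds. It is an illustration only: the helper names are hypothetical, and the assumption that the ascii codec was picked up from the build environment's locale is not confirmed by the log.

```python
def first_undecodable_byte(data: bytes, encoding: str):
    """Return the first byte that `encoding` cannot decode, or None."""
    try:
        data.decode(encoding)
    except UnicodeDecodeError as exc:
        return data[exc.start]
    return None


def read_package_xml(path: str) -> str:
    """Read a package.xml as UTF-8 regardless of the locale's default codec."""
    with open(path, encoding="utf-8") as handle:
        return handle.read()


# UTF-8 bytes of a line like the one quoted above ("\u00e1" is "á").
raw = "<maintainer>Ricardo Gonz\u00e1lez</maintainer>".encode("utf-8")
```

Here `first_undecodable_byte(raw, "ascii")` returns 0xC3, the byte named in the error message, while `first_undecodable_byte(raw, "utf-8")` returns None.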

insufficient performance in the QoS demo using default parameters

This ticket focuses on the performance of the QoS demo with default values between a publisher and a subscriber using the same RMW implementation. The cross-vendor results are only mentioned for completeness / context.

To reproduce, run the easiest example from the QoS demo:

  • ros2 run image_tools cam2image -b
  • ros2 run image_tools showimage

Which means:

  • reliable (default)
  • queue depth: 10 (default)
  • publish frequency in Hz: 30 (default)
  • history QoS setting: keep all the samples (default)

The following results show significant differences in the performance depending on which RMW implementation is chosen on the publisher and subscriber side (collected with the default branches on Ubuntu 16.04 on a Lenovo P50). Only the diagonal highlighted with "quality" colors is of interest for now:

Sub \ Pub | FastRTPS | Connext | OpenSplice
FastRTPS | #ffa000 Little stuttering, not smooth | Flawless | Severe increasing lag, multi-seconds within seconds
Connext | One second burst, short hang, repeat | #00ff00 Flawless | One second burst, short hang, repeat
OpenSplice | Flawless | Stuttering | #00ff00 Flawless

When increasing the image size to -x 640 -y 480 the problems become even more apparent:

Sub \ Pub | FastRTPS | Connext | OpenSplice
FastRTPS | #f03c15 Much more stuttering | Hangs for several seconds between bursts | Severe increasing lag, multi-seconds within seconds
Connext | One second burst, short hang, repeat | #00ff00 Flawless | One second burst, short hang, repeat
OpenSplice | Flawless | Smooth but significantly reduced framerate (even on publisher side) | #00ff00 Flawless

The acceptance criteria to resolve this ticket are:

  • without changing the demo itself or the QoS settings
  • the FastRTPS publisher and subscriber should demonstrate a "good" performance which is comparable with the other vendors.

PS: please don't suggest variations of the QoS parameters in this thread but keep this ticket focused on this very specific case. Other cases can be considered separately.

Segfault on raspberry pi

Hi,
I created a test that transfers a rather big message (around 8Mb) via the Fast-RTPS middleware. The data is transferred between two Raspberry Pis running the latest version of ros2 (master branches) and also the master branch of Fast-RTPS. The network in between contains multiple switches that are configured to use RSTP. When I disconnect a cable from a switch, the STP tree is rebuilt by the switches and the data is transferred via another connection. This works in most cases. However, I managed to crash the talker by reconnecting the first network cable. (See segfault attached)

Talker:
https://gist.github.com/firesurfer/1755157a8bb2ada6816301f1ef50c755
Listener:
https://gist.github.com/firesurfer/26bdd1c3ed29b96f71d5c58720d08ca5

Thread 8 "big_packet_test" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x72eff450 (LWP 3236)]
0x766f8794 in eprosima::fastrtps::rtps::ReaderProxy::set_change_to_status(eprosima::fastrtps::rtps::CacheChange_t const*, eprosima::fastrtps::rtps::ChangeForReaderStatus_t)
    () from /home/pi/workspace/ros2_ws/install/lib/libfastrtps.so
(gdb) backtrace 
#0  0x766f8794 in eprosima::fastrtps::rtps::ReaderProxy::set_change_to_status(eprosima::fastrtps::rtps::CacheChange_t const*, eprosima::fastrtps::rtps::ChangeForReaderStatus_t) () from /home/pi/workspace/ros2_ws/install/lib/libfastrtps.so
#1  0x766f3978 in eprosima::fastrtps::rtps::StatefulWriter::send_any_unsent_changes()
    () from /home/pi/workspace/ros2_ws/install/lib/libfastrtps.so
#2  0x766ee630 in eprosima::fastrtps::rtps::AsyncWriterThread::run() ()
   from /home/pi/workspace/ros2_ws/install/lib/libfastrtps.so
#3  0x76a6693c in ?? () from /usr/lib/arm-linux-gnueabihf/libstdc++.so.6
#4  0x76ebfff4 in start_thread (arg=0x72eff450) at pthread_create.c:335
#5  0x768b1b10 in ?? () at ../sysdeps/unix/sysv/linux/arm/clone.S:86
   from /lib/arm-linux-gnueabihf/libc.so.6
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Services not implemented in FastRTPS yet?

Am I right in the assumption that services aren't fully implemented yet in the FastRTPS middleware?

The add_two_ints_client example throws this exception at the moment:

add_two_ints_client
terminate called after throwing an instance of 'rclcpp::exceptions::RCLError'
  what():  rcl_service_server_is_available failed: not implemented, at /home/firesurfer/workspace/ros2_ws/src/eProsima/ROS-RMW-Fast-RTPS-cpp/rmw_fastrtps_cpp/src/functions.cpp:2481
Aborted

After reading #48 I would expect that services are at least somehow implemented?

In case this isn't just a bug, what is currently missing in order to use services with fastRTPS?

Limitation size of String in Service

Following up on the last issue, #60, about the String size limitation in messages.

Based on https://github.com/rosalfred/smarthome_media_msgs/tree/ros2

I made services for:

with MediaItem containing a data field of type String: https://github.com/rosalfred/smarthome_media_msgs/blob/ros2/msg/MediaItem.msg#L6

When I call MediaGetItems and return an array of MediaItem whose data field is set to a String of 1024 characters, everything works perfectly.
If I call MediaGetItem and return a single MediaItem whose data field is set to a String of 256 characters, everything works,

but if I call MediaGetItem with the data field set to 1024 characters, I receive a publisher error...

RuntimeError: Failed to send request: cannot publish data, at https://github.com/ros2/rmw_fastrtps/blob/master/rmw_fastrtps_cpp/src/functions.cpp#L1795, at src/ros2/rcl/rcl/src/rcl/service.c:192

I tested with Python and Java nodes on Linux, and everything works with a simple publisher.

ICMP - Destination unreachable ( Port unreachable)

Hi, as mentioned in #157, I could track down the reason for the ICMP - Destination unreachable message one can observe using wireshark.

How to reproduce:

  1. Start the example publisher on the first computer
  2. Start the example listener on the second computer (they should transmit messages between each other)
  3. Start wireshark
    3.1 you should now see RTPS data and Heartbeats
  4. Kill the listener (not SIGINT but really SIGKILL -> this simulates a crash) (SIGTERM should also do it)
  5. You should now see the remaining application / computer still send RTPS Heartbeats that then get answered by ICMP - Destination unreachable ( Port unreachable)

Tested on a laptop running Debian Testing and a Raspberry Pi3 running the latest ros2 master branches.
We could also reproduce this error on a computer running Ubuntu 16.04.

rmw_count_subscribers / rmw_count_publishers performance improvements

The current implementation of the rmw_count_subscribers and rmw_count_publishers functions has linear runtime complexity, O(N), w.r.t. the number of ROS 2 topics published. This is not a big problem when the calling code has to count the number of ROS 2 subscribers/publishers for a single topic, but it becomes much more of a performance problem when the calling code needs to count the number of ROS 2 subscribers/publishers for each topic seen on ROS 2. In that case, the runtime complexity easily becomes O(N^2).

It should be possible to rewrite these methods to bring the runtime complexity down to either constant or at least O(log N).

Here is a screenshot of the runtime analysis for rmw_count_publishers (counting the subscribers will have the same issues).
image
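
One way to get a constant-time count per topic, on average, is to keep a per-topic counter that is updated whenever an endpoint is discovered or removed, instead of scanning all topics on every query. The Python sketch below illustrates that idea only; the class `EndpointIndex` and its methods are hypothetical and say nothing about how rmw_fastrtps stores its graph information.

```python
from collections import Counter


class EndpointIndex:
    """Keep per-topic publisher counts up to date as endpoints come and go."""

    def __init__(self) -> None:
        self._publishers = Counter()

    def add_publisher(self, topic: str) -> None:
        self._publishers[topic] += 1

    def remove_publisher(self, topic: str) -> None:
        # Ignore removals for topics that have no known publisher.
        if self._publishers[topic] > 0:
            self._publishers[topic] -= 1
            if self._publishers[topic] == 0:
                del self._publishers[topic]

    def count_publishers(self, topic: str) -> int:
        # A dictionary lookup: O(1) on average, independent of topic count.
        return self._publishers[topic]
```

Counting publishers for every one of N topics then costs O(N) overall rather than O(N^2). An equivalent index would be kept for subscribers.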

onNewCacheChangeAdded can be called after node shuts down

Summary: we have "guard condition handle not from this implementation" errors in tests sometimes. it seems to me that the rmw_fastrtps_cpp code is doing the right thing, but it's a race condition in fastrtps. I opened eProsima/Fast-DDS#134 with a fix that works for me locally, this ticket is for local tracking.

The crash can occur when onNewCacheChangeAdded of the custom ReaderListener in ROS 2 (used for listening to info about the graph, e.g. topic names and types) is called after we have removed the listener.

An example of when this is likely to happen is an executable that creates a publisher but is in the process of shutting down before it finally gets advertised. I can reproduce it pretty often with ~/ros2_ws/build_isolated/test_rclcpp/gtest_local_parameters__rmw_fastrtps_cpp --gtest_filter=test_local_parameters__rmw_fastrtps_cpp.set_parameter_if_not_set, and
here's example output from test_publisher__rmw_fastrtps_cpp on the buildfarm:

[ RUN      ] TestPublisherFixture__rmw_fastrtps_cpp.test_publisher_nominal_string
[ERROR] [rmw_fastrtps_cpp]: failed to trigger graph guard condition: guard condition handle not from this implementation, at /home/rosbuild/ci_scripts/ws/src/ros2/rmw_fastrtps/rmw_fastrtps_cpp/src/rmw_trigger_guard_condition.cpp:31 (onNewCacheChangeAdded() at /home/rosbuild/ci_scripts/ws/src/ros2/rmw_fastrtps/rmw_fastrtps_cpp/src/types/writer_info.hpp:105)
-- run_test.py: return code -11

That error output is from when the guard condition is triggered from within onNewCacheChangeAdded, or if you're unlucky you can get:

*** Error in `/home/dhood/ros2_ws/build_isolated/test_rclcpp/gtest_local_parameters__rmw_fastrtps_cpp': double free or corruption (fasttop): 0x00007febcc007dc0 ***
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x777e5)[0x7febe483a7e5]
/lib/x86_64-linux-gnu/libc.so.6(+0x8037a)[0x7febe484337a]
/lib/x86_64-linux-gnu/libc.so.6(cfree+0x4c)[0x7febe484753c]
/home/dhood/ros2_ws/build_isolated/test_rclcpp/gtest_local_parameters__rmw_fastrtps_cpp(_ZN9__gnu_cxx13new_allocatorINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEE10deallocateEPS6_m+0x20)[0x425e64]
/home/dhood/ros2_ws/build_isolated/test_rclcpp/gtest_local_parameters__rmw_fastrtps_cpp(_ZNSt16allocator_traitsISaINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEEE10deallocateERS6_PS5_m+0x2b)[0x424b6e]
/home/dhood/ros2_ws/build_isolated/test_rclcpp/gtest_local_parameters__rmw_fastrtps_cpp(_ZNSt12_Vector_baseINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESaIS5_EE13_M_deallocateEPS5_m+0x32)[0x4238aa]
/home/dhood/ros2_ws/install_isolated/rclcpp/lib/librclcpp.so(_ZNSt6vectorINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESaIS5_EE19_M_emplace_back_auxIIRKS5_EEEvDpOT_+0x121)[0x7febe5447071]
/home/dhood/ros2_ws/install_isolated/rclcpp/lib/librclcpp.so(_ZNSt6vectorINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESaIS5_EE9push_backERKS5_+0x69)[0x7febe54441af]
/home/dhood/ros2_ws/install_isolated/rmw_fastrtps_cpp/lib/librmw_fastrtps_cpp.so(+0x3f9c7)[0x7febe2e1a9c7]
/home/dhood/ros2_ws/install_isolated/fastrtps/lib/libfastrtps.so.1(_ZN8eprosima8fastrtps4rtps20EDPSimpleSUBListener21onNewCacheChangeAddedEPNS1_10RTPSReaderEPKNS1_13CacheChange_tE+0x56f)[0x7febe28eaccb]
/home/dhood/ros2_ws/install_isolated/fastrtps/lib/libfastrtps.so.1(_ZN8eprosima8fastrtps4rtps14StatefulReader15change_receivedEPNS1_13CacheChange_tEPNS1_11WriterProxyERSt11unique_lockISt15recursive_mutexE+0x28c)[0x7febe28597d2]
/home/dhood/ros2_ws/install_isolated/fastrtps/lib/libfastrtps.so.1(_ZN8eprosima8fastrtps4rtps14StatefulReader14processDataMsgEPNS1_13CacheChange_tE+0x3ee)[0x7febe2858944]
/home/dhood/ros2_ws/install_isolated/fastrtps/lib/libfastrtps.so.1(_ZN8eprosima8fastrtps4rtps15MessageReceiver16proc_Submsg_DataEPNS1_12CDRMessage_tEPNS1_18SubmessageHeader_tEPb+0xee7)[0x7febe2872b01]
/home/dhood/ros2_ws/install_isolated/fastrtps/lib/libfastrtps.so.1(_ZN8eprosima8fastrtps4rtps15MessageReceiver13processCDRMsgERKNS1_12GuidPrefix_tEPNS1_9Locator_tEPNS1_12CDRMessage_tE+0x6e5)[0x7febe287104d]
/home/dhood/ros2_ws/install_isolated/fastrtps/lib/libfastrtps.so.1(_ZN8eprosima8fastrtps4rtps19RTPSParticipantImpl22performListenOperationEPNS1_20ReceiverControlBlockENS1_9Locator_tE+0xab)[0x7febe28865cb]
/home/dhood/ros2_ws/install_isolated/fastrtps/lib/libfastrtps.so.1(_ZNKSt12_Mem_fn_baseIMN8eprosima8fastrtps4rtps19RTPSParticipantImplEFvPNS2_20ReceiverControlBlockENS2_9Locator_tEELb1EEclIJS5_S6_EvEEvPS3_DpOT_+0xb1)[0x7febe288be13]
/home/dhood/ros2_ws/install_isolated/fastrtps/lib/libfastrtps.so.1(_ZNSt12_Bind_simpleIFSt7_Mem_fnIMN8eprosima8fastrtps4rtps19RTPSParticipantImplEFvPNS3_20ReceiverControlBlockENS3_9Locator_tEEEPS4_S6_S7_EE9_M_invokeIJLm0ELm1ELm2EEEEvSt12_Index_tupleIJXspT_EEE+0x7b)[0x7febe288bd57]
/home/dhood/ros2_ws/install_isolated/fastrtps/lib/libfastrtps.so.1(_ZNSt12_Bind_simpleIFSt7_Mem_fnIMN8eprosima8fastrtps4rtps19RTPSParticipantImplEFvPNS3_20ReceiverControlBlockENS3_9Locator_tEEEPS4_S6_S7_EEclEv+0x2c)[0x7febe288bbbe]
/home/dhood/ros2_ws/install_isolated/fastrtps/lib/libfastrtps.so.1(_ZNSt6thread5_ImplISt12_Bind_simpleIFSt7_Mem_fnIMN8eprosima8fastrtps4rtps19RTPSParticipantImplEFvPNS5_20ReceiverControlBlockENS5_9Locator_tEEEPS6_S8_S9_EEE6_M_runEv+0x1c)[0x7febe288bb4e]
/usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xb8c80)[0x7febe4e5bc80]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x76ba)[0x7febe57466ba]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7febe48ca3dd]

ROS2 subscriber crashes with NotEnoughMemoryException with empty effort array in JointState messages

Bug report

Required Info:

Steps to reproduce issue

ros2 topic pub /eve/joint_states sensor_msgs/JointState "{name: [a,a,a,a,a,a,a], position: [128.0,128.0,128.0,128.0,128.0,128.0,128.0], velocity: [128.0,8.0,128.0,128.0,128.0,128.0,128.0]}"

ros2 topic echo /eve/joint_states sensor_msgs/JointState

Expected behavior

The ros2 topic echo process doesn't crash

Actual behavior

ros2 topic echo crashes with the following message

terminate called after throwing an instance of 'eprosima::fastcdr::exception::NotEnoughMemoryException'
  what():  Not enough memory in the buffer stream
Aborted (core dumped)

Additional information

It only crashes if the number of elements in JointState.name, JointState.position and JointState.velocity is >= 7 and the number of elements in JointState.effort is 0. It seems to work if there is at least one element in effort.

Why is the maximum size of a STRING limited?

I could not find any reference to this in the ROS2 or IDL documentation.
But the code actually limits the length to about 255 characters (the check below rejects anything longer than 256).

in https://github.com/eProsima/ROS-RMW-Fast-RTPS-cpp/blob/master/rmw_fastrtps_cpp/include/rmw_fastrtps_cpp/TypeSupport_impl.h

case ::rosidl_typesupport_introspection_cpp::ROS_TYPE_STRING:
{
    auto && str = StringHelper<MembersType>::convert_to_std_string(field);

    // Control maximum length.
    if((member->string_upper_bound_ && str.length() > member->string_upper_bound_ + 1) || str.length() > 256)
     {
        throw std::runtime_error("string overcomes the maximum length");
     }
     ser << str;
}
break;

I removed the limitation and could send a string of 1024 characters without any problem.
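A minimal sketch of what a check without the hard-coded limit could look like: the length is only validated when the message definition declares an upper bound. The helper name `check_string_bound` is hypothetical, and treating an upper bound of 0 as "unbounded" is an assumption based on how `string_upper_bound_` is used in the snippet above. The exact comparison (the original compares against the bound plus one) is simplified here.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: `upper_bound == 0` is taken to mean "unbounded",
// mirroring how string_upper_bound_ is used in the snippet above.
inline void check_string_bound(const std::string & str, std::size_t upper_bound) {
  // Only bounded strings are validated; unbounded strings of any length pass.
  if (upper_bound != 0 && str.length() > upper_bound) {
    throw std::runtime_error("string exceeds its declared upper bound");
  }
}
```

With this shape, a 1024-character unbounded string passes, while a bounded string still fails if it is longer than its declared bound.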

Using non-default FastRTPS heartbeat period

Spin-off thread from #81 to capture the discussion on using a FastRTPS heartbeat period other than the default, originally suggested in #36.

This can be configured by un-commenting these lines.

Motivation:
Enhance performance for sending large messages with "reliable" reliability. The heartbeat from the publisher is used to get ACKNACKs from the subscriber. With the default heartbeat period (3s according to this), when something prevents messages/message fragments from being received, the subscriber can't communicate this with the publisher at a high enough frequency, so data is not re-sent often enough to have high quality connections.

Reasons against just uncommenting the lines:

  • As @mikaelarguedas pointed out: This will affect all connections with "reliable" reliability, and has potential to flood the network in large-scale systems.

Options:

  1. Reduce heartbeat period always and accept the overhead.
  2. As @mikaelarguedas suggested: Reduce heartbeat period only for particular QoS profiles, e.g. those that are commonly used for large messages. This could be an existing profile or a new one that is used for large messages with reliable reliability.
  3. Expose the heartbeat as a QoS setting, as it is also configurable in Connext (through the datawriter protocol QoS)
  4. Expose the appropriate symbols to the user so that they can set the heartbeat period themselves.
  5. Wait for eProsima to implement piggyback heartbeats, which will increase the number of heartbeats sent with more data being sent (ETA pending).

In the meantime, the fix for users that need improved performance is to build rmw_fastrtps from source, manually configuring a reduced heartbeat period.

Low bandwidth when using FastRTPS

I'm experiencing problems with most messages that are more than a few kilobytes per second when using FastRTPS.

For example, I just built ROS 2 from source on Ubuntu 16.04 and tried running showimage in one terminal and cam2image in another. I can see images from my webcam, but they are delayed and I get around 1 frame per second.

This also happens with downloaded binaries, but it works fine with opensplice.
The talker/listener demo and other low-bandwidth demos seem to work fine.

Best effort + fragmentation + OS X = no images

@richiprosima @JaimeMartin, we are trying to understand why particular conditions impact the reception of best effort data.

For high-bandwidth data, we have found a difference in performance on different operating systems. For example, with a single publisher publishing images (320x240 image x 1 byte/channel x 3 channels @ 30fps) and receiving them on the same machine with "best effort" reliability, it will work fine on linux, but not at all on OS X (no images are received).

  1. Using a throughput controller, as you have suggested in #81 and #36, will fix the performance on OS X, but we are trying to understand (1) why there is a difference between operating systems and (2) if there is an alternative solution, since enabling the throughput controller will limit the rate at which users can receive data.

  2. Instead of a throughput controller, disabling multicast on subscribers as suggested in this comment improves performance but only slightly.

  3. Surprisingly, performance can be improved even without a throughput controller or disabling multicast, if the publishing frequency is increased from 30fps to 100fps. At 30fps, 0 messages are received, but at 100fps, ~20% of messages are received, at 250fps, ~40% of messages are received...

Do you have an idea of what causes the messages to have low reception rates even with a single sender, and the receiver on the same machine?
@wjwwood has suggested that perhaps "the kernel's udp buffer is getting overwhelmed by the bursty nature without the flow controller" but you may have more insight given your experience.

Do you know why reception rate improves with higher publishing frequency? Thanks in advance.


To reproduce using our image demo (on OS X):

$ cam2image -b -r 0 -f 30
$ showimage -s 0 -r 0

Magic numbers in serialization

https://github.com/eProsima/ROS-RMW-Fast-RTPS-cpp/blob/master/rmw_fastrtps_cpp/src/TypeSupport.cpp#L180

    if(vector.size() > (member->is_upper_bound_ ? member->array_size_ : (typeByDefaultLarge() ? 30 : 101))) { \
        printf("vector overcomes the maximum length\n"); \
        throw std::runtime_error("vector overcomes the maximum length"); \
    } \

https://github.com/eProsima/ROS-RMW-Fast-RTPS-cpp/blob/master/rmw_fastrtps_cpp/src/TypeSupport.cpp#L567

            case ::rosidl_typesupport_introspection_cpp::ROS_TYPE_STRING:
                current_alignment += 4 + eprosima::fastcdr::Cdr::alignment(current_alignment, 4) + (member->string_upper_bound_ ? member->string_upper_bound_ + 1 : 257);
                break;

I'm confused about the reasoning behind some of the "magic numbers" in these conditionals (30, 101, 257).

Messages are dropped when using FastRTPS on raspberry pi 3

We noticed that FastRTPS seems to drop messages when used on a raspberry pi with an x86 computer as sender but also with another raspberry pi as sender.

The used message:
msgs/StorageData

bool notavailable
string uuid
string sendernode
string key
string type
int64 componentid
uint64 unixtime
uint8[] data

The commands used for sending and receiving:
ros2 topic echo /storage_data_topic
ros2 topic pub /storage_data_topic msgs/StorageData '{"uuid": "test", "sendernode": "mynode", "data": [1,2,3,4,5,6,7,8,9,10]}'

Start listen first, then publish.
What can be observed:
It takes at least 4 messages to receive one message when publishing from an x86 computer.
It doesn't start receiving any messages if publishing is done from another Pi. Restarting the listener helps; messages are received afterwards.
In general, there are sometimes long delays between messages and/or messages get dropped.

The phenomenon is even more extreme when used in our own application with the parameters QoS profile. The subscription running on the Pi won't receive any messages of this type (other messages are working more or less fine, though sometimes delayed by 10s). This also happens when listening with ros2 topic echo but sending with our own application.

Used version of ROS2: Current master branches.
x86 Computer: Debian Testing
Raspberry Pi 3: Raspbian Testing
Network topology: Multiple Switches configured with Spanning Tree Protocol (STP). Multiple raspberry pis in network.

Edit: Using Wireshark I could determine that there is often an ICMP Destination unreachable (Port unreachable) message with the destination being either the x86 computer or the Raspberry Pi.

Edit 2: I could determine that this issue depends on the data in the message. Example: Set "sendernode" to any data, then send it. It takes at least three sending cycles until a message is received. Stopping the sending process and restarting it results in the message being received immediately. Stopping it and changing the data a bit results in one or two sending cycles until a message is received. Changing the data a lot, like setting another field, results in at least three sending cycles.

Periodic crash in services test

Following #10, I'm trying to get system_tests.test_rclcpp.test_services_cpp__rmw_fastrtps_cpp to pass reliably, and I can't. I've seen a variety of segfaults, aborts for double-free, and deadlocks. I just spent some time digging through the code and have failed to figure out the problem. If you have everything built, then you can reproduce the problem like so (I've been testing on Linux):

ulimit -c unlimited
cd src/ros2/system_tests/test_rclcpp
while nosetests3 -s ../../../../build/test_rclcpp/test_services_cpp__rmw_fastrtps_cpp.py; do true; done

You should, after a short period of time, see a problem. If there's a crash, you should get a core file. I haven't yet gotten any crashes to happen in gdb or valgrind (most of the time, I just get a deadlock in that situation).

I'm happy to provide more information to help with the investigation.

Aborted due to: Address already in use

Hello,

We have a setup with 4 nodes split across four separate computers. Until now we have been using OpenSplice as the DDS implementation, but with more than two nodes the messages were transmitted in batches. So we decided to switch to FastRTPS (because it is currently actively supported).

But at the beginning the program aborted:

pi:test$ testprog
running in master mode
starting node testnode
terminate called after throwing an instance of 'boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<boost::system::system_error> >'
  what():  set_option: Address already in use
Aborted

The programs run on multiple Raspberry Pi 2 boards with Raspbian upgraded to Stretch. Our communication is done via an additional 5 GHz WLAN USB dongle. From some discussions I gathered that FastRTPS uses only one network interface and picks one of them at random. On the Raspberry Pi we have three:

pi:test$ ifconfig 
eth0: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
        ether b8:27:eb:5d:cb:31  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1  (Local Loopback)
        RX packets 2  bytes 98 (98.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 2  bytes 98 (98.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

wlan0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.0.62  netmask 255.255.255.0  broadcast 192.168.0.255
        inet6 fe80::f4ee:1035:89c8:5ce  prefixlen 64  scopeid 0x20<link>
        ether 80:3f:5d:21:f2:ee  txqueuelen 1000  (Ethernet)
        RX packets 16938  bytes 4609241 (4.3 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 1242  bytes 139618 (136.3 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

A side note: We are using ssh to work on the Raspberries using the same WLAN (wlan0). The eth0 is not in use.

How do we set up/configure the program correctly to get it running?

Thanks in advance.

After update of ros2 workspace -> introduces build error - UnicodeDecodeError

After updating the ros2 workspace (I'm going with the master branches) the rmw_fastrtps_cpp package fails to build with the error:

File "/usr/lib/python3.6/encodings/ascii.py", line 26, in decode
return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 409: ordinal not in range(128)

Operating system: Debian testing
Compiler: clang 5.0.1
Build system: Ninja
Python Version: Python 3.6.5rc1

It also happens when using GCC 7.3 and make

According to Stack Overflow, this might be a Unicode problem with Python, but I couldn't figure out where exactly to fix it in ament:
https://stackoverflow.com/questions/18649512/unicodedecodeerror-ascii-codec-cant-decode-byte-0xe2-in-position-13-ordinal

EDIT:
I guess it is due to a python3 or python3 package upgrade.
In the last few days I had the following package upgrade regarding python:

python3-scour:amd64 (0.36-2, 0.36-3)
python3-pkg-resources:amd64 (38.5.2-1, 39.0.1-1)
python3-setuptools:amd64 (38.5.2-1, 39.0.1-1)
python3-gi-cairo:amd64 (3.26.1-2, 3.28.1-1)
python3-gi:amd64 (3.26.1-2, 3.28.1-1)
python3-nose:amd64 (1.3.7-3, 1.3.7-4)
python3-apt:amd64 (1.4.0beta3+b1, 1.6.0rc2)

The complete build log for that package is:

+++ Building 'rmw_fastrtps_cpp'
==> '. /home/firesurfer/workspace/ros2_ws/build/rmw_fastrtps_cpp/cmake__build.sh && /usr/bin/cmake -DBUILD_TESTING=0 -DAMENT_CMAKE_SYMLINK_INSTALL=1 -DCMAKE_CXX_FLAGS=-fuse-ld=gold -DCMAKE_BUILD_TYPE=Debug -G Ninja -DCMAKE_INSTALL_PREFIX=/home/firesurfer/workspace/ros2_ws/install /home/firesurfer/workspace/ros2_ws/src/ros2/rmw_fastrtps/rmw_fastrtps_cpp' in '/home/firesurfer/workspace/ros2_ws/build/rmw_fastrtps_cpp'
-- Found ament_cmake_ros: 0.4.0 (/home/firesurfer/workspace/ros2_ws/install/share/ament_cmake_ros/cmake)
-- Using PYTHON_EXECUTABLE: /usr/bin/python3
-- Override CMake install command with custom implementation using symlinks instead of copying resources
-- Found rcutils: 0.4.0 (/home/firesurfer/workspace/ros2_ws/install/share/rcutils/cmake)
-- Found fastrtps_cmake_module: 0.4.0 (/home/firesurfer/workspace/ros2_ws/install/share/fastrtps_cmake_module/cmake)
-- Found rmw: 0.4.0 (/home/firesurfer/workspace/ros2_ws/install/share/rmw/cmake)
-- Found rosidl_typesupport_introspection_c: 0.4.0 (/home/firesurfer/workspace/ros2_ws/install/share/rosidl_typesupport_introspection_c/cmake)
-- Found rosidl_typesupport_introspection_cpp: 0.4.0 (/home/firesurfer/workspace/ros2_ws/install/share/rosidl_typesupport_introspection_cpp/cmake)
Error parsing '/home/firesurfer/workspace/ros2_ws/src/ros2/rmw_fastrtps/rmw_fastrtps_cpp/package.xml':
Traceback (most recent call last):
File "/home/firesurfer/workspace/ros2_ws/install/share/ament_cmake_core/cmake/core/package_xml_2_cmake.py", line 145, in
main()
File "/home/firesurfer/workspace/ros2_ws/install/share/ament_cmake_core/cmake/core/package_xml_2_cmake.py", line 56, in main
raise e
File "/home/firesurfer/workspace/ros2_ws/install/share/ament_cmake_core/cmake/core/package_xml_2_cmake.py", line 53, in main
package = parse_package_string(args.package_xml.read())
File "/usr/lib/python3.6/encodings/ascii.py", line 26, in decode
return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 409: ordinal not in range(128)
CMake Error at /home/firesurfer/workspace/ros2_ws/install/share/ament_cmake_core/cmake/core/ament_package_xml.cmake:94 (message):
execute_process(/usr/bin/python3
/home/firesurfer/workspace/ros2_ws/install/share/ament_cmake_core/cmake/core/package_xml_2_cmake.py
/home/firesurfer/workspace/ros2_ws/src/ros2/rmw_fastrtps/rmw_fastrtps_cpp/package.xml
/home/firesurfer/workspace/ros2_ws/build/rmw_fastrtps_cpp/ament_cmake_core/package.cmake)
returned error code 1
Call Stack (most recent call first):
/home/firesurfer/workspace/ros2_ws/install/share/ament_cmake_core/cmake/core/ament_package_xml.cmake:49 (_ament_package_xml)
/home/firesurfer/workspace/ros2_ws/install/share/ament_cmake_core/cmake/core/ament_package.cmake:63 (ament_package_xml)
CMakeLists.txt:119 (ament_package)

-- Configuring incomplete, errors occurred!
See also "/home/firesurfer/workspace/ros2_ws/build/rmw_fastrtps_cpp/CMakeFiles/CMakeOutput.log".

<== Command '. /home/firesurfer/workspace/ros2_ws/build/rmw_fastrtps_cpp/cmake__build.sh && /usr/bin/cmake -DBUILD_TESTING=0 -DAMENT_CMAKE_SYMLINK_INSTALL=1 -DCMAKE_CXX_FLAGS=-fuse-ld=gold -DCMAKE_BUILD_TYPE=Debug -G Ninja -DCMAKE_INSTALL_PREFIX=/home/firesurfer/workspace/ros2_ws/install /home/firesurfer/workspace/ros2_ws/src/ros2/rmw_fastrtps/rmw_fastrtps_cpp' failed in '/home/firesurfer/workspace/ros2_ws/build/rmw_fastrtps_cpp' with exit code '1'
<== Command '. /home/firesurfer/workspace/ros2_ws/build/rmw_fastrtps_cpp/cmake__build.sh && /usr/bin/cmake -DBUILD_TESTING=0 -DAMENT_CMAKE_SYMLINK_INSTALL=1 -DCMAKE_CXX_FLAGS=-fuse-ld=gold -DCMAKE_BUILD_TYPE=Debug -G Ninja -DCMAKE_INSTALL_PREFIX=/home/firesurfer/workspace/ros2_ws/install /home/firesurfer/workspace/ros2_ws/src/ros2/rmw_fastrtps/rmw_fastrtps_cpp' failed in '/home/firesurfer/workspace/ros2_ws/build/rmw_fastrtps_cpp' with exit code '1'

Sending string results into eprosima::fastcdr::exception::NotEnoughMemoryException

When sending this string:
"No environment file found at /etc/systemd/system/iboss_environment, creating own
QIODevice::write (QFile, "/etc/systemd/system/iboss_environment"): device not open"

At the receiving node I run into this exception:

terminate called after throwing an instance of 'eprosima::fastcdr::exception::NotEnoughMemoryException'
  what():  Not enough memory in the buffer stream
Aborted

This doesn't happen if I use the ros2 tool:

 ros2 topic  pub /ros2_log ros2imple_logger/LoggingMessage "{\"level\":3, \"nodename\":\"Scheduling_Node100\", \"message\":\"No environment file found at /etc/systemd/system/iboss_environment, creating own\nQIODevice::write (QFile, \\\"/etc/systemd/system/iboss_environment\\\"): device not open\"}"

After further research I found that using rmw_qos_profile_sensor_data was the cause of this trouble. Using the default profile (which is probably what the ros2 tool does) doesn't trigger this error.

Enabling FastRTPS' ThroughputController

This was originally discussed in #36.
It can be enabled by un-commenting these lines.

Motivation:

  • Enhance performance for sending large messages with "best effort" reliability. Otherwise all message fragments are sent at the same time, something (yet to be determined) prevents data from being received, and as it's best effort the fragments are not resent.

Reasons against:

  • This can limit the throughput of systems.

Options:

  1. Enable only if best effort is being used. If users want higher throughput than the controller they will have to use reliable reliability.
  2. Enable always and accept the limited throughput.
  3. Expose a flow controller QoS setting, as it's also available in Connext (see this page).
  4. Expose the appropriate symbols to the user so that they can set the throughput controller themselves.

If we decide to go for option 1 or 2, it may be useful to know that the image_tools demo sending 320x240 images at 30fps produces ~8 million bytes per second. A throughput controller of ~30,000 bytes per 10ms (3 million bytes/s) is sufficient.
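The figures above can be checked with a short calculation. The sketch below assumes 3 channels at 1 byte per channel, as in the related best-effort issue; the raw pixel payload then comes to about 6.9 million bytes per second, so the ~8 million figure presumably includes message and protocol overhead (an assumption, since no breakdown is given). The constant and function names are hypothetical.

```cpp
#include <cstdint>

// Raw payload of one 320x240 frame with 3 channels of 1 byte each.
constexpr std::uint64_t kBytesPerFrame = 320ull * 240ull * 3ull;

// Raw payload per second at 30 fps, before any message or RTPS overhead.
constexpr std::uint64_t kRawBytesPerSecond = kBytesPerFrame * 30ull;

// Converts a throughput controller budget of `bytes` per `period_ms`
// milliseconds into bytes per second.
constexpr std::uint64_t controller_bytes_per_second(
  std::uint64_t bytes, std::uint64_t period_ms)
{
  return bytes * 1000ull / period_ms;
}

static_assert(kBytesPerFrame == 230400, "one frame is 230,400 bytes");
static_assert(kRawBytesPerSecond == 6912000, "30 fps is 6,912,000 bytes/s");
static_assert(controller_bytes_per_second(30000, 10) == 3000000,
  "30,000 bytes per 10 ms is 3,000,000 bytes/s");
```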

Repeated service calls fail (at least for parameters)

This test, which sets some parameters and then tries to retrieve them 10 times in a row:

ros2/system_tests@0c7f551

fails for FastRTPS, producing output like this:

[ RUN      ] test_local_parameters__rmw_fastrtps_cpp.local_synchronous_repeated
iteration: 0
iteration: 1
unknown file: Failure
C++ exception with description "failed to send request: cannot publish data, at /home/gerkey/ros2_ws_debug/src/eProsima/ROS-RMW-Fast-RTPS-cpp/rmw_fastrtps_cpp/src/functions.cpp:1619, at /home/gerkey/ros2_ws_debug/src/ros2/rcl/rcl/src/rcl/client.c:170" thrown in the test body.

That is, the first iteration works, but subsequent ones fail. The same test passes for OpenSplice (and, I'm pretty sure, Connext).

rmw_fastrtps does not detect the version correctly when checking RMW_IMPLEMENTATION on Windows

Testing https://github.com/ros2/ros2/wiki/Working-with-multiple-RMW-implementations#ros-2-ardent-and-later

Against the binary build on windows: http://ci.ros2.org/view/packaging/job/packaging_windows/825/

It fails to report the error correctly when evaluating an invalid RMW_IMPLEMENTATION; however, the opensplice build works as expected.

>ros2 run demo_nodes_cpp talker
[ERROR] [rcl]: Error getting RMW implementation identifier.

Fuller console outputs https://gist.github.com/tfoote/1533d84fa2668127601313ab91883735

However it appears to pass in CI with only fastrtps: http://ci.ros2.org/job/ci_windows/3845/testReport/junit/rcl.build.rcl.test.test/test_rmw_impl_id_check__rmw_fastrtps_cpp_Release/test_rmw_implementation_env/

rmw_count_subscribers()/publishers() doesn't take rt topic prefix into account

When trying to get an accurate count of publishers or subscribers to a topic by calling node->count_publishers("topic_name"), the underlying RMW implementation does not take the topic prefix name into account. For instance, if you start publishing a topic called "map" in a node by calling:

    occupancy_grid_publisher_ =
      node_handle_->create_publisher<::nav_msgs::msg::OccupancyGrid>("map", custom_qos_profile);

And then you subscribe to the topic with:

$ rostopic_echo_py nav_msgs/OccupancyGrid map

And then you try to get a count of subscribers in the original node by calling:

int subs = node_handle_->count_subscribers("map");

It will always return 0. Looking at the code a bit, the problem seems to be that down in rmw_fastrtps_cpp/src/functions.cpp:rmw_count_subscribers(), the topic name that comes from the user does not have the rt prefix, but the list of topics that is built up expects the rt prefix. A simple solution is to add the rt prefix to the passed-in string before searching the list. rmw_count_publishers() has the same problem.
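The proposed fix could look roughly like the sketch below, which prepends the prefix to the user-supplied name before it is used as a lookup key. The helper name `add_ros_topic_prefix` is hypothetical, and the exact stored form ("rt" followed by the fully qualified name, e.g. "rt/map") is an assumption for illustration.

```cpp
#include <string>

// Assumption: ROS topics are stored with an "rt" prefix in front of the
// fully qualified name, e.g. "rt/map". The exact form is illustrative.
inline std::string add_ros_topic_prefix(const std::string & topic_name) {
  static const std::string prefix = "rt";
  // Make sure the name starts with '/', so "map" and "/map" give the same key.
  const std::string fq_name =
    (!topic_name.empty() && topic_name.front() == '/') ?
    topic_name : "/" + topic_name;
  return prefix + fq_name;
}
```

The count functions would then search the discovered-topic list for `add_ros_topic_prefix(topic_name)` instead of the raw `topic_name`.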

memory leak in topic_name of subscription and publisher (at least)

See:

Really the whole file should be audited. Also the new can throw a C++ exception that would raise through the rmw C interface which is bad. It should either use malloc or rmw_allocate or the new rcutils_allocator_t or it should try-catch the new.
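One way to keep exceptions from crossing the C interface is to allocate with the non-throwing form of new and report failure through the return value. The sketch below shows this for copying a topic name; the function names `copy_topic_name` and `free_topic_name` are hypothetical and do not exist in rmw_fastrtps.

```cpp
#include <cstddef>
#include <cstring>
#include <new>

// Hypothetical helper with C linkage: copies a topic name without letting
// a C++ exception escape. Returns nullptr on invalid input or when the
// allocation fails.
extern "C" char * copy_topic_name(const char * topic_name) noexcept {
  if (topic_name == nullptr) {
    return nullptr;
  }
  const std::size_t len = std::strlen(topic_name) + 1;  // include terminator
  char * copy = new (std::nothrow) char[len];  // returns nullptr instead of throwing
  if (copy == nullptr) {
    return nullptr;
  }
  std::memcpy(copy, topic_name, len);
  return copy;
}

// Releases a name returned by copy_topic_name; passing nullptr is a no-op.
extern "C" void free_topic_name(char * topic_name) noexcept {
  delete[] topic_name;
}
```

Pairing every allocation with a matching release function called from the destroy path is also what would address the leak itself.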

missing functionality in graph API

While working on local graph changes for Connext I noticed two things about the FastRTPS implementation:

  • publishers and subscribers do not get removed from topic_names_and_types ever (even when destroyed)
  • count publisher and subscribers does not appear to count the publishers and subscribers but rather the number of different types on a topic

This can be shown using the graph tests in rcl, which are currently disabled for Fast RTPS since they don't pass:

ros2/rcl@596c0ec#diff-e4167ceebe2946426e927f55e4392af4R66

I didn't want to hold up the other PRs to fix this. I spent a few minutes looking at it, and I think the first issue could be addressed by storing the publisher/subscriber GUIDs. The second issue will require being able to tell the difference between a publisher or subscriber coming up and one going down. I wasn't able to figure out how to do that, and I wasn't even sure there was any notification that they went down at all.

You can see here that the topicNtypes map is inserted into:

https://github.com/eProsima/ROS-RMW-Fast-RTPS-cpp/blob/master/rmw_fastrtps_cpp/src/functions.cpp#L353

But as far as I can see nothing is ever removed. Maybe someone can give us a lead on how to accomplish this using the data coming into that function. You can see similar, but slightly different code for how we accomplish this in Connext:

https://github.com/ros2/rmw_connext/blob/master/rmw_connext_shared_cpp/src/shared_functions.cpp#L78-L82
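To make the GUID idea concrete, here is a minimal sketch of a cache that stores the set of endpoint GUIDs per topic and type, so entries can be removed when an endpoint goes away and counts reflect endpoints rather than types. The `Guid` alias and the `TopicCache` class are hypothetical; a real GUID is a fixed-size byte array, and whether Fast RTPS delivers a "went down" notification is exactly the open question above.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>

// Hypothetical stand-in for a DDS endpoint GUID (really a fixed-size byte array).
using Guid = std::string;

class TopicCache {
public:
  // Record an endpoint that was discovered on `topic` with type `type`.
  void add(const std::string & topic, const std::string & type, const Guid & guid) {
    topics_[topic][type].insert(guid);
  }

  // Forget an endpoint; empty type and topic entries are pruned.
  void remove(const std::string & topic, const std::string & type, const Guid & guid) {
    auto topic_it = topics_.find(topic);
    if (topic_it == topics_.end()) {
      return;
    }
    auto type_it = topic_it->second.find(type);
    if (type_it == topic_it->second.end()) {
      return;
    }
    type_it->second.erase(guid);
    if (type_it->second.empty()) {
      topic_it->second.erase(type_it);
    }
    if (topic_it->second.empty()) {
      topics_.erase(topic_it);
    }
  }

  // Number of endpoints on `topic`, summed over all of its types.
  std::size_t count(const std::string & topic) const {
    auto topic_it = topics_.find(topic);
    if (topic_it == topics_.end()) {
      return 0;
    }
    std::size_t total = 0;
    for (const auto & entry : topic_it->second) {
      total += entry.second.size();
    }
    return total;
  }

  // True while at least one endpoint is still known for `topic`.
  bool has_topic(const std::string & topic) const {
    return topics_.count(topic) != 0;
  }

private:
  std::map<std::string, std::map<std::string, std::set<Guid>>> topics_;
};
```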

Validate the CustomServiceRequest object

Bug report

Required Info:

  • Operating System:
    Ubuntu 16.04
  • Installation type:
    From source
  • Version or commit hash:
    8ee72cd

Expected behavior

We must check the request.buffer_ before constructing the class of eprosima::fastcdr::Cdr object

CustomServiceRequest request = info->listener_->getRequest();

if (request.buffer_ != nullptr) {
  eprosima::fastcdr::Cdr deser(*request.buffer_, eprosima::fastcdr::Cdr::DEFAULT_ENDIAN,
                                                  eprosima::fastcdr::Cdr::DDS_CDR);
...

Actual behavior

Not checked, suspected commit 636721d

I am going to submit a PR to fix this.

RTPS_WRITER error when publishing large amount of data

Hi! I'm running into trouble when publishing large amounts of data. Without a subscriber, the node runs fine without any errors. But if I have a second node subscribing to this topic, I run into a bunch of errors on the publisher end:

[RTPS_WRITER Error] Problem adding DATA_FRAG submsg to the CDRMessage, buffer too small -> Function prepareDataFragSubM
[RTPS_WRITER Error] A problem occurred when adding a message -> Function send_Changes_AsData

And my subscriber segfaults with this error:

terminate called after throwing an instance of 'std::out_of_range'
what():  vector::_M_range_check: __n (which is 1) >= this->size() (which is 1)

The behaviour remains the same irrespective of having a c++ or python subscriber.

FastRtps Topic & ROS2 Topic List

Hi,

When using the HelloWorld example of the C++ example in ROS2, it appears in the node list, but not in the topic list (using the "ros2 topic list" command).
Of course, when "node_demos_cpp" is used, it appears in the Topic list.

Is there anything I have to do separately?

Doesn't the ROS2 topic list fetch the topics registered in DDS?

Best effort publisher stops sending to local subscribers when remotes connect

Issue:

With the 1.5.0 fastrtps release the following issue has appeared:

  1. best effort publisher talking to best effort subscriber(s) on the same machine.
  2. best effort subscriber connects on the same domain ID but a remote machine.
  3. best effort subscriber connected remotely receives message, but the local subscriber(s) does not.
  4. when the remote subscriber is killed, the local subscribers receive messages again.
  5. repeat. whenever the remote subscriber is connected, the local ones do not receive messages.

Steps to reproduce:
Machine A:

talker_qos_py -n 100
listener_qos_py -n 100

These use /chatter_qos topic and best effort publisher and subscriber. They talk fine.

Machine B:

ROS_DOMAIN_ID=<machine_A_domain_ID> listener_qos_py -n 100

Receives messages and causes listeners on machine A to stop receiving messages.


Workaround:

The issue does not happen if the publisher offers reliable (even if the subscriber and therefore connection is best effort).

Example:

Machine A:

talker -t chatter_qos
listener_qos_py -n 100

This starts a publisher that offers reliable reliability on the chatter_qos topic (the two should still communicate with best effort).

Machine B:

ROS_DOMAIN_ID=<machine_A_domain_ID> listener_qos_py -n 100

Ideas:

@wjwwood and I were brainstorming and wonder if there have been changes to the unicast/multicast settings of publishers. Maybe the best effort publisher switches between unicast and multicast when subscribers are present both locally and remotely, but there's a bug that cuts off the loopback in the process.

I have not yet tried to reproduce this with ros-agnostic fastrtps demos. As there's a workaround, this isn't necessarily a blocker for beta2 demos provided we switch publishers to reliable.

rmw_count_publisher|subscriber - excessive string allocation

Bug report

Required Info:

  • Operating System:
    • Ubuntu 16.04
  • Installation type:
    • source
  • Version or commit hash:
    • HEAD
  • DDS implementation:
    • Fast-RTPS
  • Client library (if applicable):
    • N/A

Steps to reproduce issue

You can reproduce this in multiple ways but the easiest is to create a large number of ROS2 publishers and subscribers and run the dynamic bridge, which counts the ROS2 pub/subs every second.

Behavior

CPU utilization is higher than expected. Profiling shows a hotspot in operator new and cfree, which traces back to rmw_count_publishers and rmw_count_subscribers being called in the dynamic bridge, specifically the push_back into the unfiltered_topics vector, e.g. in rmw_count_subscribers:

  std::map<std::string, std::vector<std::string>> unfiltered_topics;
  ReaderInfo * slave_target = impl->secondarySubListener;
  slave_target->mapmutex.lock();
  for (auto it : slave_target->topicNtypes) {
    for (auto & itt : it.second) {
      // truncate the ROS specific prefix
      auto topic_fqdn = _demangle_if_ros_topic(it.first);
      unfiltered_topics[topic_fqdn].push_back(itt);
    }
  }
  slave_target->mapmutex.unlock();

Similar code exists for the rmw_count_publishers function.

Code inspection shows that the vector values held by the unfiltered_topics map aren't used at all and that it's just the number we care about. Further, all of the topic_fqdn keys aren't really necessary since we're only really interested in one. The function only uses it for the purposes of a debug message.

Proposed fix

To reduce string allocation, drop the non-matching topic_fqdn keys and don't bother pushing the topic types onto the vector. Just count the number of topic types for matching topics and return.
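The proposed fix could be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual rmw_fastrtps code: `TopicTypes`, `demangle_if_ros_topic` (including its assumed "rt" prefix handling) and `count_matching` are hypothetical stand-ins for the real map, the real `_demangle_if_ros_topic` helper and the body of `rmw_count_subscribers`. Locking is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for the listener's topic name -> type names map.
using TopicTypes = std::map<std::string, std::vector<std::string>>;

// Hypothetical stand-in for _demangle_if_ros_topic; assumes ROS topics are
// stored as "rt" followed by the topic name, e.g. "rt/chatter".
std::string demangle_if_ros_topic(const std::string & name)
{
  if (name.compare(0, 3, "rt/") == 0) {
    return name.substr(2);  // drop "rt", keep the leading '/'
  }
  return name;
}

// Counts the types registered for one topic without building an intermediate
// map of vectors: no per-type string copies, no unused keys.
std::size_t count_matching(const TopicTypes & topics, const std::string & topic_name)
{
  assert(!topic_name.empty());
  std::size_t count = 0;
  for (const auto & entry : topics) {  // by reference, so no copy of key or vector
    // Demangle once per topic rather than once per type.
    if (demangle_if_ros_topic(entry.first) == topic_name) {
      count += entry.second.size();
    }
  }
  return count;
}
```

The demangling step still allocates one string per map key; comparing the prefix and the remainder in place would remove that as well.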

Bad rmw_count_subscribers value

I am using the master branch with Fast-RTPS.

I've been testing the ros1_bridge with one ros1 talker and two ros2 listeners.

pre_run

While it was running, introspection showed only 1 subscriber on /chatter, instead of the two that were active.

run

When I interrupted one of the ros2 listeners, the other stopped receiving, and introspection showed that there were no subscribers for that topic.

final

I have debugged a bit and reached rmw_fastrtps/rmw_functions.cpp, where it seems that the rmw_count_subscribers function does not return the correct value.

rmw_fastrtps CustomParticipantInfo structure definition.

typedef struct CustomParticipantInfo
{
  eprosima::fastrtps::Participant * participant;
  ReaderInfo * secondarySubListener;
  WriterInfo * secondaryPubListener;
  rmw_guard_condition_t * graph_guard_condition;
} CustomParticipantInfo;

What is the purpose of the secondarySubListener in the above structure, which is implemented in custom_participant_info.hpp?

Node creation stalls when using asio library within node

Bug report

When using asio as part of a driver (e.g. serial driver) within a node, the construction of rclcpp::Node blocks when using the FastRTPS rmw implementation.

Required Info:

  • Operating System: Ubuntu 16.04
  • Installation type: binary
  • Version or commit hash: beta3
  • DDS implementation: rmw_fastrtps
  • Client library (if applicable): rclcpp

Steps to reproduce issue

Expected behavior

When creating the node, the constructor of SerialNode should be called and print constructing... and done....

Actual behavior

The construction of SerialNode stalls and neither of the outputs in the constructor is printed.

Additional information

gdb backtrace:

#0  __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1  0x00007ffff7bc3dbd in __GI___pthread_mutex_lock (mutex=0x72d2f8) at ../nptl/pthread_mutex_lock.c:80
#2  0x00000000004060ba in asio::detail::posix_mutex::lock() ()
#3  0x00000000004072d8 in asio::detail::scoped_lock<asio::detail::posix_mutex>::scoped_lock(asio::detail::posix_mutex&) ()
#4  0x00000000004068b6 in asio::detail::epoll_reactor::deregister_descriptor(int, asio::detail::epoll_reactor::descriptor_state*&, bool) ()
#5  0x00007ffff4af3ff8 in ?? () from /opt/ros/r2b3/lib/libfastrtps.so.1
#6  0x00007ffff4af7ee7 in eprosima::fastrtps::rtps::UDPv4Transport::init() () from /opt/ros/r2b3/lib/libfastrtps.so.1
#7  0x00007ffff4ac1899 in eprosima::fastrtps::rtps::NetworkFactory::RegisterTransport(eprosima::fastrtps::rtps::TransportDescriptorInterface const*) () from /opt/ros/r2b3/lib/libfastrtps.so.1
#8  0x00007ffff4acc42f in eprosima::fastrtps::rtps::RTPSParticipantImpl::RTPSParticipantImpl(eprosima::fastrtps::rtps::RTPSParticipantAttributes const&, eprosima::fastrtps::rtps::GuidPrefix_t const&, eprosima::fastrtps::rtps::RTPSParticipant*, eprosima::fastrtps::rtps::RTPSParticipantListener*) () from /opt/ros/r2b3/lib/libfastrtps.so.1
#9  0x00007ffff4ace76a in eprosima::fastrtps::rtps::RTPSDomain::createParticipant(eprosima::fastrtps::rtps::RTPSParticipantAttributes&, eprosima::fastrtps::rtps::RTPSParticipantListener*) ()
   from /opt/ros/r2b3/lib/libfastrtps.so.1
#10 0x00007ffff4ad1a0e in eprosima::fastrtps::Domain::createParticipant(eprosima::fastrtps::ParticipantAttributes&, eprosima::fastrtps::ParticipantListener*) () from /opt/ros/r2b3/lib/libfastrtps.so.1
#11 0x00007ffff5021b3a in ?? () from /opt/ros/r2b3/lib/librmw_fastrtps_cpp.so
#12 0x00007ffff50233e8 in rmw_create_node () from /opt/ros/r2b3/lib/librmw_fastrtps_cpp.so
#13 0x00007ffff63500aa in rcl_node_init () from /opt/ros/r2b3/lib/librcl.so
#14 0x00007ffff7962dcc in rclcpp::node_interfaces::NodeBase::NodeBase(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::shared_ptr<rclcpp::context::Context>) () from /opt/ros/r2b3/lib/librclcpp.so
#15 0x00007ffff796197c in rclcpp::node::Node::Node(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::shared_ptr<rclcpp::context::Context>, bool) () from /opt/ros/r2b3/lib/librclcpp.so
#16 0x00007ffff7961e7a in rclcpp::node::Node::Node(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool) () from /opt/ros/r2b3/lib/librclcpp.so
#17 0x00000000004070a4 in SerialNode::SerialNode() ()
#18 0x0000000000408610 in void __gnu_cxx::new_allocator<SerialNode>::construct<SerialNode>(SerialNode*) ()
#19 0x00000000004084ed in void std::allocator_traits<std::allocator<SerialNode> >::construct<SerialNode>(std::allocator<SerialNode>&, SerialNode*) ()
#20 0x0000000000408302 in std::_Sp_counted_ptr_inplace<SerialNode, std::allocator<SerialNode>, (__gnu_cxx::_Lock_policy)2>::_Sp_counted_ptr_inplace<>(std::allocator<SerialNode>) ()
#21 0x0000000000408013 in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::__shared_count<SerialNode, std::allocator<SerialNode>>(std::_Sp_make_shared_tag, SerialNode*, std::allocator<SerialNode> const&)
    ()
#22 0x0000000000407ea8 in std::__shared_ptr<SerialNode, (__gnu_cxx::_Lock_policy)2>::__shared_ptr<std::allocator<SerialNode>>(std::_Sp_make_shared_tag, std::allocator<SerialNode> const&) ()
#23 0x0000000000407de0 in std::shared_ptr<SerialNode>::shared_ptr<std::allocator<SerialNode>>(std::_Sp_make_shared_tag, std::allocator<SerialNode> const&) ()
#24 0x0000000000407cfc in std::shared_ptr<SerialNode> std::allocate_shared<SerialNode, std::allocator<SerialNode>>(std::allocator<SerialNode> const&) ()
#25 0x00000000004079c1 in std::shared_ptr<SerialNode> std::make_shared<SerialNode>() ()
#26 0x0000000000405ae6 in main ()

There seems to be a deadlock in the asio event loop that is caused by having multiple asio::io_service instances (one from fastrtps, one from the driver library) in the same process.

This issue does not occur when using rmw_opensplice_cpp, e.g.:

$ RMW_IMPLEMENTATION=rmw_opensplice_cpp ./install/lib/serial_driver_node/serial_driver_node 
creating node...
[...]
constructing...
done...

The issue is probably related to asio issue chriskohlhoff/asio#180.
I am using asio version 1.10.6 (Ubuntu 16.04 repo) and have not yet tried to build ros2 from source using a newer asio version.

repeated service response not received

One of our tests (test_client_scope_cpp in test_rclcpp) creates a service client, calls the service and checks the response. After that it does the same cycle a second time. While it works fine on all other rmw implementations, with FastRTPS the second response is never received. The server receives both requests and replies both times, but the client seems to never receive the second reply. Any help with narrowing this down would be welcome.

Btw. the issue happens on all platforms and build types (http://ci.ros2.org/view/nightly/job/nightly_linux_debug/lastCompletedBuild/testReport/).

audit rmw_fastrtps_cpp

Meta-ticket, follow-up of #107. Includes:

  • audit calls to malloc and new and replace them if appropriate
  • audit any function call that can throw and either catch them or replace them
  • remove magic numbers: partially captured in #72
  • remove irrelevant todos
  • fix cleanup on all failure scenarios: captured in #33
  • split functions.cpp in digestible files: done in #135
  • replace assert with return with proper value and error messages
  • audit sanity checks to make sure they make sense (e.g. #138 (comment))

(feel free to complete this list; I likely missed a few important items)

Payload size exceeding pre-allocated limit.

Bug report

Required Info:

  • Operating System:

    • Windows 10, MacOS 10.12, Ubuntu Xenial
  • Installation type:

    • Build from source on ci.ros2.org
  • Version or commit hash:

  • DDS implementation:

    • Fast-RTPS

Steps to reproduce issue

Reproduced in the following nightly tests on ci.ros2.org

From @mikaelarguedas: this is reproducible on all platforms with the branch from ros2/rcl_interfaces#32

Error log excerpt

18:25:55 8: [RTPS_HISTORY Error] Change payload size of '6736' bytes is larger than the history payload size of '5000' bytes and cannot be resized. -> Function add_change
18:25:55 8: [RTPS_HISTORY Error] Change payload size of '6736' bytes is larger than the history payload size of '5000' bytes and cannot be resized. -> Function add_change
18:25:55 8: [RTPS_HISTORY Error] Change payload size of '6736' bytes is larger than the history payload size of '5000' bytes and cannot be resized. -> Function add_change

Additional information

Although I was actually testing something else, a diagnostic job exhibited the error during communication between fastrtps and opensplice. That job is pending deletion so the link will not be good for long, but I've downloaded the full 39MB log if that is of specific interest. It's too big to go into a gist.

From @mikaelarguedas:

From what I know it comes from this default value https://github.com/eProsima/Fast-RTPS/blob/603580b5186386e47fdc4da2970cd389937840bb/include/fastrtps/rtps/attributes/HistoryAttributes.h#L42
We are supposed to use a different memory allocation strategy everywhere, but for some reason this default is being used for the RTPS endpoints.

Reset fixed guard conditions

In ros2/rmw_connext#125, we found that the fixed guard conditions were never being reset to false, so in many situations rmw_wait() woke up immediately on every call. We fixed it in rmw_connext and rmw_opensplice by resetting the fixed guard conditions to false before exiting rmw_wait().

For FastRTPS, it looks like you'll want to repeat this block for the fixed_guard_conditions. I would submit a PR with that fix, but I'm not familiar enough with the code to know that that's the right answer.
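The idea of resetting guard conditions before leaving the wait call can be illustrated with a small sketch. Everything here is hypothetical and not taken from rmw_fastrtps: `ToyGuardCondition` and `collect_and_reset` only model a flag that is read and cleared in one step, so that a condition triggered once wakes the waiter once.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical guard condition: a flag that stays set until it is cleared.
class ToyGuardCondition
{
public:
  void trigger() { triggered_.store(true); }

  // Reports whether the condition was triggered and clears it in one step.
  bool get_and_reset() { return triggered_.exchange(false); }

  bool is_triggered() const { return triggered_.load(); }

private:
  std::atomic<bool> triggered_{false};
};

// Hypothetical tail end of a wait call: counts the triggered conditions and
// resets each one, so the next call does not wake up immediately again.
std::size_t collect_and_reset(const std::vector<ToyGuardCondition *> & conditions)
{
  std::size_t ready = 0;
  for (ToyGuardCondition * condition : conditions) {
    assert(condition != nullptr);
    if (condition->get_and_reset()) {
      ++ready;
    }
  }
  return ready;
}
```

Without the reset, a second call over the same conditions would report them as ready again, which matches the busy wake-ups described above.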

Error when publishing more than 5000 messages and using "keep all" history

Context:
I am leaving a publisher with default QoS settings to run, without any subscriptions. After publishing 5000 messages, I am getting a "Maximum number of allowed reserved caches reached" warning from Fast RTPS, which leads to a runtime error in rmw_fastrtps_cpp.

The default QoS profile at the time of opening this issue is:

static const rmw_qos_profile_t rmw_qos_profile_default =
{
  RMW_QOS_POLICY_KEEP_ALL_HISTORY,
  10,
  RMW_QOS_POLICY_RELIABLE,
  RMW_QOS_POLICY_DURABILITY_SYSTEM_DEFAULT
};

To replicate:
Run a publisher like this one (using rclcpp or rclpy) alone and let it run until it gets to 5000 messages. Increasing the frequency helps speed things up but this happens even at 1Hz.

Outcome:
This is the output with logging enabled on Fast RTPS:

Publishing: [Hello, world! 4999]
[PUBLISHER Info] Writing new data -> Function write
[RTPS_WRITER Info] Creating new change -> Function new_change
[RTPS_HISTORY Info] Change 5000 added with 23 bytes -> Function add_change
[RTPS_WRITER Info] No reader proxy to add change. -> Function unsent_change_added_to_history
Publishing: [Hello, world! 5000]
[PUBLISHER Info] Writing new data -> Function write
[RTPS_WRITER Info] Creating new change -> Function new_change
[RTPS_UTILS Info] Allocating group of cache changes of size: 510 -> Function allocateGroup
[RTPS_HISTORY Warning] Maximum number of allowed reserved caches reached -> Function allocateGroup
[RTPS_WRITER Warning] Problem reserving Cache from the History -> Function new_change
terminate called after throwing an instance of 'std::runtime_error'
  what():  failed to publish message: cannot publish data, at /home/dhood/ros2_ws1/src/ros2/rmw_fastrtps/rmw_fastrtps_cpp/src/functions.cpp:928, at /home/dhood/ros2_ws1/src/ros2/rcl/rcl/src/rcl/publisher.c:131
Aborted (core dumped)

Further details:

  1. When a subscription is subscribed to the topic, the error does not occur. When the subscription terminates, the error from the publisher will then happen 5000 messages later.
  2. Changing the QoS history setting to "keep last" will prevent this error from happening.
  3. Changing the QoS durability to volatile does not have any effect.
  4. Changing the QoS depth to a smaller (1) or larger (6000) number does not have any effect.
  5. Changing the QoS reliability to best effort does not have any effect.
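One way to picture observations 2 and 4 is a toy model of a history backed by a fixed pool of slots. This is only a sketch under assumptions, not Fast RTPS code: `ToyHistory`, `HistoryKind` and `publish_many` are hypothetical names, the 5000-slot limit is taken from the log above, and the model ignores readers entirely (so it does not cover observation 1, where an active subscription lets samples be released).

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>

enum class HistoryKind { KeepAll, KeepLast };

// Toy writer history: samples occupy slots from a pool of fixed size.
class ToyHistory
{
public:
  ToyHistory(HistoryKind kind, std::size_t depth, std::size_t max_slots)
  : kind_(kind), depth_(depth), max_slots_(max_slots)
  {
    assert(max_slots > 0);
  }

  // Returns false when no slot can be reserved for the new sample.
  bool add(const std::string & sample)
  {
    // Keep-last recycles the oldest slot once `depth` samples are stored.
    if (kind_ == HistoryKind::KeepLast && depth_ > 0 && samples_.size() >= depth_) {
      samples_.pop_front();
    }
    // Keep-all never releases anything here, so the pool eventually runs out.
    if (samples_.size() >= max_slots_) {
      return false;
    }
    samples_.push_back(sample);
    return true;
  }

  std::size_t size() const { return samples_.size(); }

private:
  HistoryKind kind_;
  std::size_t depth_;  // only consulted for keep-last
  std::size_t max_slots_;
  std::deque<std::string> samples_;
};

// Publishes up to `count` samples and returns how many were accepted.
std::size_t publish_many(ToyHistory & history, std::size_t count)
{
  std::size_t accepted = 0;
  for (std::size_t i = 0; i < count; ++i) {
    if (!history.add("Hello, world! " + std::to_string(i))) {
      break;
    }
    ++accepted;
  }
  return accepted;
}
```

In this model the depth has no influence under keep-all, which is consistent with observation 4, while keep-last never reaches the slot limit as long as the depth is below it.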

refactor functions.cpp

This is a top level issue for the refactoring task. The idea is to refactor the _functions.cpp into multiple files, making the package easier to maintain.
