real-logic / aeron
Efficient reliable UDP unicast, UDP multicast, and IPC message transport
License: Apache License 2.0
Steps to reproduce:
I believe the driver is doing everything that it should. In fact, it seems to send the second ON_OPERATION_SUCCESS response to the client. But the client does not unblock.
As pointed out by @mikeb01, a gather send would be nice. Multiple DirectBuffers on Publication.offer would be useful. This would need to go all the way down to the LogAppender.
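To make the request concrete, here is a minimal sketch of what a gather-style append could look like. It uses plain java.nio.ByteBuffer rather than Aeron's DirectBuffer or LogAppender, and the class name GatherAppendSketch and the method gatherAppend are hypothetical names chosen for illustration only. The parts are copied back to back so that they land in the destination as one contiguous message.

```java
import java.nio.ByteBuffer;

public final class GatherAppendSketch
{
    /**
     * Copies every part, in order, into dst as one contiguous message.
     * Returns the total length written, or -1 if dst lacks the space,
     * in which case dst is left untouched.
     */
    public static int gatherAppend(final ByteBuffer dst, final ByteBuffer... parts)
    {
        int totalLength = 0;
        for (final ByteBuffer part : parts)
        {
            totalLength += part.remaining();
        }

        // Check capacity up front so a message is never half written.
        if (totalLength > dst.remaining())
        {
            return -1;
        }

        for (final ByteBuffer part : parts)
        {
            // duplicate() so the caller's position and limit are not disturbed.
            dst.put(part.duplicate());
        }

        return totalLength;
    }

    public static void main(final String[] args)
    {
        final ByteBuffer header = ByteBuffer.wrap(new byte[]{ 1, 2 });
        final ByteBuffer body = ByteBuffer.wrap(new byte[]{ 3, 4, 5 });
        final ByteBuffer log = ByteBuffer.allocate(16);

        System.out.println("bytes appended: " + gatherAppend(log, header, body));
    }
}
```

Checking the combined length before copying is the part that matters for a real implementation: the caller should see either the whole message appended or nothing at all.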
public void onMessage(final long messages, final long bytes)
{
    totalBytes += bytes;
    totalMessages += messages;
}
to:
public void onMessage(final long messages, final long bytes)
{
    if ((ThreadLocalRandom.current().nextInt() % 100) < 10)
    {
        System.out.println("Good time for a nap...");
        try
        {
            Thread.sleep(1000);
        }
        catch (InterruptedException e)
        {
            e.printStackTrace();
        }
    }
    totalBytes += bytes;
    totalMessages += messages;
}
and rebuild.
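A side note on the sampling condition in the modified method: nextInt() with no bound can return negative values, and in Java the remainder of a negative number is zero or negative, so `nextInt() % 100 < 10` is true for roughly half of all calls plus a tenth of the rest, not for one call in ten. If a 10% nap rate was the intent, `nextInt(100) < 10` gives that. The sketch below measures both conditions; the class and method names are hypothetical and it is not part of the Aeron samples.

```java
import java.util.Random;

public final class NapProbabilitySketch
{
    /** Fraction of samples for which the condition from the modified method is true. */
    public static double moduloRate(final Random random, final int samples)
    {
        int hits = 0;
        for (int i = 0; i < samples; i++)
        {
            // Negative values of nextInt() give a remainder <= 0, which is always < 10.
            if ((random.nextInt() % 100) < 10)
            {
                hits++;
            }
        }
        return (double)hits / samples;
    }

    /** Fraction of samples for which the bounded variant is true. */
    public static double boundedRate(final Random random, final int samples)
    {
        int hits = 0;
        for (int i = 0; i < samples; i++)
        {
            // nextInt(100) is uniform over 0..99, so this is true for 10 values out of 100.
            if (random.nextInt(100) < 10)
            {
                hits++;
            }
        }
        return (double)hits / samples;
    }

    public static void main(final String[] args)
    {
        System.out.println("modulo rate:  " + moduloRate(new Random(), 1_000_000));
        System.out.println("bounded rate: " + boundedRate(new Random(), 1_000_000));
    }
}
```

Either condition reproduces the slow-subscriber behaviour; the difference only affects how often the subscriber naps.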
$ java -cp aeron-samples/build/libs/samples.jar uk.co.real_logic.aeron.driver.MediaDriver
$ java -cp aeron-samples/build/libs/samples.jar uk.co.real_logic.aeron.samples.RateSubscriber
$ java -cp aeron-samples/build/libs/samples.jar uk.co.real_logic.aeron.samples.StreamingPublisher
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x000000010414a229, pid=5354, tid=22019
#
# JRE version: Java(TM) SE Runtime Environment (8.0_31-b13) (build 1.8.0_31-b13)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.31-b07 mixed mode bsd-amd64 compressed oops)
# Problematic frame:
# V [libjvm.dylib+0x54a229] Unsafe_GetIntVolatile+0x3e
#
# Core dump written. Default location: /cores/core or core.5354
#
# An error report file with more information is saved as:
# /Users/ericb/GitHub/Aeron/hs_err_pid5354.log
#
# If you would like to submit a bug report, please visit:
# http://bugreport.java.com/bugreport/crash.jsp
#
Abort trap: 6 (core dumped)
jstack output for the core referred to above:
Attaching to core /cores/core.5354 from executable /Library/Java/JavaVirtualMachines/jdk1.8.0_31.jdk/Contents/Home/bin/java, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 25.31-b07
Deadlock Detection:
No deadlocks found.
Thread 3847: (state = BLOCKED)
Thread 22019: (state = IN_VM)
- sun.misc.Unsafe.getIntVolatile(java.lang.Object, long) @bci=0 (Interpreted frame)
- uk.co.real_logic.agrona.concurrent.UnsafeBuffer.getIntVolatile(int) @bci=20, line=425 (Interpreted frame)
- uk.co.real_logic.aeron.common.concurrent.logbuffer.FrameDescriptor.frameLengthVolatile(uk.co.real_logic.agrona.concurrent.UnsafeBuffer, int) @bci=5, line=262 (Interpreted frame)
- uk.co.real_logic.aeron.common.concurrent.logbuffer.LogReader.read(uk.co.real_logic.aeron.common.concurrent.logbuffer.DataHandler, int) @bci=42, line=90 (Interpreted frame)
- uk.co.real_logic.aeron.Connection.poll(int) @bci=58, line=92 (Interpreted frame)
- uk.co.real_logic.aeron.Subscription$$Lambda$14.apply(java.lang.Object, int) @bci=5 (Interpreted frame)
- uk.co.real_logic.agrona.concurrent.AtomicArray.doLimitedAction(int, int, uk.co.real_logic.agrona.concurrent.AtomicArray$ToIntLimitedFunction) @bci=58, line=185 (Compiled frame)
- uk.co.real_logic.aeron.Subscription.poll(int) @bci=40, line=89 (Compiled frame)
- uk.co.real_logic.aeron.samples.SamplesUtil.lambda$subscriberLoop$4(java.util.concurrent.atomic.AtomicBoolean, int, uk.co.real_logic.aeron.common.IdleStrategy, uk.co.real_logic.aeron.Subscription) @bci=9, line=91 (Interpreted frame)
- uk.co.real_logic.aeron.samples.SamplesUtil$$Lambda$13.accept(java.lang.Object) @bci=16 (Interpreted frame)
- uk.co.real_logic.aeron.samples.RateSubscriber.lambda$main$12(java.util.concurrent.atomic.AtomicBoolean, uk.co.real_logic.aeron.Subscription) @bci=8, line=70 (Interpreted frame)
- uk.co.real_logic.aeron.samples.RateSubscriber$$Lambda$12.run() @bci=8 (Interpreted frame)
- java.util.concurrent.ThreadPoolExecutor.runWorker(java.util.concurrent.ThreadPoolExecutor$Worker) @bci=95, line=1142 (Interpreted frame)
- java.util.concurrent.ThreadPoolExecutor$Worker.run() @bci=5, line=617 (Interpreted frame)
- java.lang.Thread.run() @bci=11, line=745 (Interpreted frame)
Thread 14351: (state = BLOCKED)
Thread 13571: (state = BLOCKED)
- java.lang.Object.wait(long) @bci=0 (Interpreted frame)
- java.lang.ref.ReferenceQueue.remove(long) @bci=44, line=142 (Interpreted frame)
- java.lang.ref.ReferenceQueue.remove() @bci=2, line=158 (Interpreted frame)
- java.lang.ref.Finalizer$FinalizerThread.run() @bci=36, line=209 (Interpreted frame)
Thread 13059: (state = BLOCKED)
- java.lang.Object.wait(long) @bci=0 (Interpreted frame)
- java.lang.Object.wait() @bci=2, line=502 (Interpreted frame)
- java.lang.ref.Reference$ReferenceHandler.run() @bci=36, line=157 (Interpreted frame)
When the Publication is well ahead of the DriverPublication, the client can change the status of the active log buffer that the driver is currently sending from to NEEDS_CLEANING.
This allows the driver conductor thread to clean the buffer out from underneath the sender thread, which can livelock the sender thread: it does a LogScanner.scanNext and gets stuck inside waitForFrameLength because the frame length has been set back to 0.
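The failure mode can be pictured with a small stand-alone sketch. It does not use Aeron's classes and is not the project's fix: an AtomicInteger stands in for the frame length field, and the class name BoundedFrameLengthWait and the maxSpins parameter are assumptions made for illustration. An unbounded version of this loop never returns once the length has been zeroed behind the reader, whereas a bounded one lets the caller notice and back off.

```java
import java.util.concurrent.atomic.AtomicInteger;

public final class BoundedFrameLengthWait
{
    /**
     * Spins until the frame length becomes non-zero or maxSpins is exhausted.
     * Returns the observed length, or 0 if the wait gave up.
     */
    public static int waitForFrameLength(final AtomicInteger frameLength, final int maxSpins)
    {
        for (int i = 0; i < maxSpins; i++)
        {
            // get() is a volatile read, so a write from another thread becomes visible here.
            final int length = frameLength.get();
            if (length != 0)
            {
                return length;
            }
        }

        // Length stayed at 0: the buffer may have been cleaned underneath the reader.
        return 0;
    }

    public static void main(final String[] args)
    {
        System.out.println(waitForFrameLength(new AtomicInteger(64), 1_000));
        System.out.println(waitForFrameLength(new AtomicInteger(0), 1_000));
    }
}
```

A bound only turns the hang into a detectable condition; preventing the buffer from being cleaned while the sender still reads from it is a separate matter.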
I got the following exception while running the StreamingPublisher and RateSubscriber, with the publisher trying to send 100,000 messages.
:aeron-driver:run
Exception in thread "driver-conductor" java.lang.NullPointerException
at uk.co.real_logic.aeron.common.concurrent.AtomicBuffer.putString(AtomicBuffer.java:1021)
at uk.co.real_logic.aeron.common.concurrent.AtomicBuffer.putString(AtomicBuffer.java:1016)
at uk.co.real_logic.aeron.common.event.EventCodec.encode(EventCodec.java:122)
at uk.co.real_logic.aeron.common.event.EventLogger.logException(EventLogger.java:142)
at uk.co.real_logic.aeron.driver.DriverConductor$$Lambda$31/87285178.accept(Unknown Source)
at uk.co.real_logic.aeron.common.Agent.run(Agent.java:59)
at java.lang.Thread.run(Thread.java:745)
Building 87% > :aeron-driver:run
org.junit.experimental.theories.internal.ParameterizedAssertionError: shouldContinueAfterBufferRolloverWithPadding(UNICAST_URI)
at org.junit.experimental.theories.Theories$TheoryAnchor.reportParameterizedError(Theories.java:192)
at org.junit.experimental.theories.Theories$TheoryAnchor$1$1.evaluate(Theories.java:146)
at org.junit.experimental.theories.Theories$TheoryAnchor.runWithCompleteAssignment(Theories.java:127)
at org.junit.experimental.theories.Theories$TheoryAnchor.runWithAssignment(Theories.java:111)
at org.junit.experimental.theories.Theories$TheoryAnchor.runWithIncompleteAssignment(Theories.java:120)
at org.junit.experimental.theories.Theories$TheoryAnchor.runWithAssignment(Theories.java:109)
at org.junit.experimental.theories.Theories$TheoryAnchor.evaluate(Theories.java:96)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecuter.runTestClass(JUnitTestClassExecuter.java:86)
at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecuter.execute(JUnitTestClassExecuter.java:49)
at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassProcessor.processTestClass(JUnitTestClassProcessor.java:69)
at org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:50)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
at org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
at org.gradle.messaging.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:32)
at org.gradle.messaging.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
at com.sun.proxy.$Proxy2.processTestClass(Unknown Source)
at org.gradle.api.internal.tasks.testing.worker.TestWorker.processTestClass(TestWorker.java:105)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
at org.gradle.messaging.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
at org.gradle.messaging.remote.internal.hub.MessageHub$Handler.run(MessageHub.java:355)
at org.gradle.internal.concurrent.DefaultExecutorFactory$StoppableExecutorImpl$1.run(DefaultExecutorFactory.java:64)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: uk.co.real_logic.aeron.exceptions.RegistrationException: Values aren't equal: 16777216 and 65536
at uk.co.real_logic.aeron.conductor.ClientConductor.onError(ClientConductor.java:242)
at uk.co.real_logic.aeron.conductor.DriverBroadcastReceiver.onError(DriverBroadcastReceiver.java:119)
at uk.co.real_logic.aeron.conductor.DriverBroadcastReceiver.lambda$receive$1(DriverBroadcastReceiver.java:88)
at uk.co.real_logic.aeron.conductor.DriverBroadcastReceiver$$Lambda$44/1098583073.onMessage(Unknown Source)
at uk.co.real_logic.aeron.common.concurrent.broadcast.CopyBroadcastReceiver.receive(CopyBroadcastReceiver.java:60)
at uk.co.real_logic.aeron.conductor.DriverBroadcastReceiver.receive(DriverBroadcastReceiver.java:51)
at uk.co.real_logic.aeron.conductor.ClientConductor.doWork(ClientConductor.java:102)
at uk.co.real_logic.aeron.common.Agent.run(Agent.java:55)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
... 3 more
Afterwards the StreamingPublisher died with the following exception:
c:\deve\Aeron>java -cp aeron-examples/build/libs/examples.jar uk.co.real_logic.aeron.examples.StreamingPublisher
Streaming 100000 messages of size 256 bytes to udp://localhost:40123 on stream Id 10
Exception in thread "main" uk.co.real_logic.aeron.exceptions.MediaDriverTimeoutException: No response from media driver within 10000 ms
at uk.co.real_logic.aeron.conductor.ClientConductor.checkMediaDriverTimeout(ClientConductor.java:286)
at uk.co.real_logic.aeron.conductor.ClientConductor.await(ClientConductor.java:255)
at uk.co.real_logic.aeron.conductor.ClientConductor.addPublication(ClientConductor.java:121)
at uk.co.real_logic.aeron.Aeron.addPublication(Aeron.java:165)
at uk.co.real_logic.aeron.examples.StreamingPublisher.main(StreamingPublisher.java:55)
Running the following commands on RHEL 6.5
/usr/java/jdk1.8.0_05-x86_64/bin/java -cp examples.jar -Daeron.agent.idle.strategy=uk.co.real_logic.aeron.common.BusySpinIdleStrategy uk.co.real_logic.aeron.driver.MediaDriver
/usr/java/jdk1.8.0_05-x86_64/bin/java -cp examples.jar uk.co.real_logic.aeron.examples.Pong
/usr/java/jdk1.8.0_05-x86_64/bin/java -cp examples.jar -Daeron.example.numberOfMessages=100000 -Daeron.example.messageLength=40 uk.co.real_logic.aeron.examples.Ping
#[Mean = 21.507, StdDeviation = 365.060]
#[Max = 91553.792, Total count = 100000]
#[Buckets = 19, SubBuckets = 2048]
Then re-running only the same Ping command, without shutting down Pong, causes the MediaDriver to report the following errors, which keep streaming for the duration of the run:
[2675861.232913] FLOW_CONTROL_OVERRUN [47/47]: overrun 6d6c00 > 0 + 131072
[2675861.432921] FLOW_CONTROL_OVERRUN [47/47]: overrun 6d6c00 > 0 + 131072
[2675861.632927] FLOW_CONTROL_OVERRUN [47/47]: overrun 6d6c00 > 0 + 131072
[2675861.832935] FLOW_CONTROL_OVERRUN [47/47]: overrun 6d6c00 > 0 + 131072
[2675862.032944] FLOW_CONTROL_OVERRUN [47/47]: overrun 6d6c00 > 0 + 131072
[2675862.232951] FLOW_CONTROL_OVERRUN [47/47]: overrun 6d6c00 > 0 + 131072
[2675862.432960] FLOW_CONTROL_OVERRUN [47/47]: overrun 6d6c00 > 0 + 131072
[2675862.632966] FLOW_CONTROL_OVERRUN [47/47]: overrun 6d6c00 > 0 + 131072
At the end of injection MediaDriver reports:
[2675995.773666] ERROR_DELETING_FILE [183/183]: Unable to delete /tmp/aeron/data/subscriptions/UDP-00000000-0-7f000001-40124/0/10/0-log or /tmp/aeron/data/subscriptions/UDP-00000000-0-7f000001-40124/0/10/0-state
[2675995.776088] ERROR_DELETING_FILE [183/183]: Unable to delete /tmp/aeron/data/subscriptions/UDP-00000000-0-7f000001-40124/0/10/1-log or /tmp/aeron/data/subscriptions/UDP-00000000-0-7f000001-40124/0/10/1-state
[2675995.778612] ERROR_DELETING_FILE [183/183]: Unable to delete /tmp/aeron/data/subscriptions/UDP-00000000-0-7f000001-40124/0/10/2-log or /tmp/aeron/data/subscriptions/UDP-00000000-0-7f000001-40124/0/10/2-state
and Ping:
#[Mean = NaN, StdDeviation = NaN]
#[Max = 0.000, Total count = 0]
#[Buckets = 19, SubBuckets = 2048]
A C version of the driver could use platform-dependent optimizations for networking and access to specific CPU instructions. Specifically for Linux on x86:
epoll
sendmmsg & recvmmsg
PAUSE instruction on x86
Many of the data structures / log used internally can be reused elsewhere. Perhaps you can bring them out as a separate project.
An interface IP address alone is not enough for systems use. The interface setting needs to be able to take a netmask as well and search for a matching interface.
NOTE: this will most likely need to expand the URI scheme for UDP.
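As a rough illustration of the search being asked for, the sketch below matches local interface addresses against a subnet and prefix length. It relies only on standard JDK calls (NetworkInterface.getNetworkInterfaces, getInterfaceAddresses, InterfaceAddress.getAddress); the class name SubnetMatchSketch and the methods matches and findInterface are hypothetical, and nothing here reflects how Aeron's URI scheme would carry the netmask.

```java
import java.net.InterfaceAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.Enumeration;

public final class SubnetMatchSketch
{
    /** Returns true when the first prefixLength bits of address and subnet are equal. */
    public static boolean matches(final byte[] address, final byte[] subnet, final int prefixLength)
    {
        if (address.length != subnet.length || prefixLength < 0 || prefixLength > address.length * 8)
        {
            return false;
        }

        final int fullBytes = prefixLength / 8;
        for (int i = 0; i < fullBytes; i++)
        {
            if (address[i] != subnet[i])
            {
                return false;
            }
        }

        final int remainingBits = prefixLength % 8;
        if (remainingBits == 0)
        {
            return true;
        }

        // Mask that keeps only the leading remainingBits of the next byte.
        final int mask = (0xFF << (8 - remainingBits)) & 0xFF;
        return (address[fullBytes] & mask) == (subnet[fullBytes] & mask);
    }

    /** Returns the first local interface with an address inside the subnet, or null if none. */
    public static NetworkInterface findInterface(final byte[] subnet, final int prefixLength)
        throws SocketException
    {
        final Enumeration<NetworkInterface> interfaces = NetworkInterface.getNetworkInterfaces();
        while (interfaces != null && interfaces.hasMoreElements())
        {
            final NetworkInterface candidate = interfaces.nextElement();
            for (final InterfaceAddress interfaceAddress : candidate.getInterfaceAddresses())
            {
                if (matches(interfaceAddress.getAddress().getAddress(), subnet, prefixLength))
                {
                    return candidate;
                }
            }
        }

        return null;
    }

    public static void main(final String[] args) throws SocketException
    {
        // 127.0.0.0/8 should match the loopback interface on most machines.
        System.out.println(findInterface(new byte[]{ 127, 0, 0, 0 }, 8));
    }
}
```

Comparing raw address bytes keeps the same code path for IPv4 and IPv6; addresses of a different family simply fail the length check.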
Start the driver:
$ java -cp aeron-samples/build/libs/samples.jar -Dagrona.disable.bounds.checks=true uk.co.real_logic.aeron.samples.LowLatencyMediaDriver
Start the first receiver, receiver A:
$ java -cp aeron-samples/build/libs/samples.jar -Dagrona.disable.bounds.checks=true -Daeron.sample.frameCountLimit=256 uk.co.real_logic.aeron.samples.RateSubscriber
Start a second receiver, receiver B:
$ java -cp aeron-samples/build/libs/samples.jar -Dagrona.disable.bounds.checks=true -Daeron.sample.frameCountLimit=256 uk.co.real_logic.aeron.samples.RateSubscriber
Start a publisher:
$ java -cp aeron-samples/build/libs/samples.jar -Daeron.sample.messageLength=40 -Daeron.sample.messages=500000000 -Dagrona.disable.bounds.checks=true uk.co.real_logic.aeron.samples.StreamingPublisher
Now, suspend receiver B with control-z. After a second or two, the sending and receiving rate will drop to 0 on the source and on receiver A. Wait a few more seconds, and the send rate will return and messages will flow again. Now unsuspend receiver B by an ‘fg’, and it should start receiving messages again. Now kill receiver B with control-c.
At this point, sending and receiving appears to stop altogether. Source reports no messages sent, receiver A (up the whole time) does not receive any further messages. That’s not good…
Now kill receiver A with a control-c as well. Uh oh - it now reports this:
Shutting down...
Exception in thread "main" uk.co.real_logic.aeron.exceptions.RegistrationException: Could not find stream Id to decrement: 10
at uk.co.real_logic.aeron.ClientConductor.onError(ClientConductor.java:257)
at uk.co.real_logic.aeron.DriverListenerAdapter.onError(DriverListenerAdapter.java:135)
at uk.co.real_logic.aeron.DriverListenerAdapter.onMessage(DriverListenerAdapter.java:120)
at uk.co.real_logic.agrona.concurrent.broadcast.CopyBroadcastReceiver.receive(CopyBroadcastReceiver.java:84)
at uk.co.real_logic.aeron.DriverListenerAdapter.receiveMessages(DriverListenerAdapter.java:56)
at uk.co.real_logic.aeron.ClientConductor.doWork(ClientConductor.java:104)
at uk.co.real_logic.aeron.common.AgentRunner.run(AgentRunner.java:85)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
and hangs forever with this stack:
2015-02-24 14:03:02
Full thread dump Java HotSpot(TM) 64-Bit Server VM (25.31-b07 mixed mode):
"Attach Listener" #14 daemon prio=9 os_prio=31 tid=0x00007fa1f505c800 nid=0x380f waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"DestroyJavaVM" #13 prio=5 os_prio=31 tid=0x00007fa1f4002000 nid=0xe07 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"pool-1-thread-2" #11 prio=5 os_prio=31 tid=0x00007fa1f3820000 nid=0x5703 waiting on condition [0x0000000136305000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x000000076abb9ff8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
"pool-1-thread-1" #10 prio=5 os_prio=31 tid=0x00007fa1f381f800 nid=0x5503 waiting on condition [0x0000000136202000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x000000076abb9ff8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
"Service Thread" #9 daemon prio=9 os_prio=31 tid=0x00007fa1f5024800 nid=0x5103 runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C1 CompilerThread3" #8 daemon prio=9 os_prio=31 tid=0x00007fa1f3023000 nid=0x4f03 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C2 CompilerThread2" #7 daemon prio=9 os_prio=31 tid=0x00007fa1f3022800 nid=0x4d03 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C2 CompilerThread1" #6 daemon prio=9 os_prio=31 tid=0x00007fa1f3021800 nid=0x4b03 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C2 CompilerThread0" #5 daemon prio=9 os_prio=31 tid=0x00007fa1f4001000 nid=0x4903 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Signal Dispatcher" #4 daemon prio=9 os_prio=31 tid=0x00007fa1f283e000 nid=0x3c17 runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Finalizer" #3 daemon prio=8 os_prio=31 tid=0x00007fa1f5809000 nid=0x3503 in Object.wait() [0x000000012bf2d000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x000000076ab062f8> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:142)
- locked <0x000000076ab062f8> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:158)
at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:209)
"Reference Handler" #2 daemon prio=10 os_prio=31 tid=0x00007fa1f5808000 nid=0x3303 in Object.wait() [0x000000012be2a000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x000000076ab05d68> (a java.lang.ref.Reference$Lock)
at java.lang.Object.wait(Object.java:502)
at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:157)
- locked <0x000000076ab05d68> (a java.lang.ref.Reference$Lock)
"VM Thread" os_prio=31 tid=0x00007fa1f5805800 nid=0x3103 runnable
"GC task thread#0 (ParallelGC)" os_prio=31 tid=0x00007fa1f280e000 nid=0x140b runnable
"GC task thread#1 (ParallelGC)" os_prio=31 tid=0x00007fa1f281d000 nid=0x2313 runnable
"GC task thread#2 (ParallelGC)" os_prio=31 tid=0x00007fa1f281d800 nid=0x2503 runnable
"GC task thread#3 (ParallelGC)" os_prio=31 tid=0x00007fa1f281e800 nid=0x2703 runnable
"GC task thread#4 (ParallelGC)" os_prio=31 tid=0x00007fa1f281f000 nid=0x2903 runnable
"GC task thread#5 (ParallelGC)" os_prio=31 tid=0x00007fa1f281f800 nid=0x2b03 runnable
"GC task thread#6 (ParallelGC)" os_prio=31 tid=0x00007fa1f2820000 nid=0x2d03 runnable
"GC task thread#7 (ParallelGC)" os_prio=31 tid=0x00007fa1f2821000 nid=0x2f03 runnable
"VM Periodic Task Thread" os_prio=31 tid=0x00007fa1f5025800 nid=0x5303 waiting on condition
JNI global references: 326
Meanwhile, at the same time that receiver A was control-c’d, the driver reports this:
[44947.428765] EXCEPTION [591/591]: java.lang.IllegalStateException(Could not find stream Id to decrement: 10)
uk.co.real_logic.aeron.driver.ReceiveChannelEndpoint.decRefToStream ReceiveChannelEndpoint.java:114
uk.co.real_logic.aeron.driver.DriverConductor.onRemoveSubscription DriverConductor.java:513
uk.co.real_logic.aeron.driver.DriverConductor.onClientCommand DriverConductor.java:306
uk.co.real_logic.aeron.driver.DriverConductor$$Lambda$38/1327763628.onMessage null:-1
uk.co.real_logic.agrona.concurrent.ringbuffer.ManyToOneRingBuffer.read ManyToOneRingBuffer.java:144
and is using up an awful lot of CPU; here’s a stack sample of the driver after this exception while it’s burning CPU:
015-02-24 14:04:41
Full thread dump Java HotSpot(TM) 64-Bit Server VM (25.31-b07 mixed mode):
"Attach Listener" #13 daemon prio=9 os_prio=31 tid=0x00007fea39a8c000 nid=0x3a0b waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"driver-conductor" #12 prio=5 os_prio=31 tid=0x00007fea39adb800 nid=0x5903 runnable [0x00000001290a6000]
java.lang.Thread.State: RUNNABLE
at sun.nio.ch.DatagramChannelImpl.receive0(Native Method)
at sun.nio.ch.DatagramChannelImpl.receiveIntoNativeBuffer(DatagramChannelImpl.java:429)
at sun.nio.ch.DatagramChannelImpl.receive(DatagramChannelImpl.java:407)
at sun.nio.ch.DatagramChannelImpl.receive(DatagramChannelImpl.java:360)
- locked <0x000000076b5ed2e8> (a java.lang.Object)
at uk.co.real_logic.aeron.driver.UdpChannelTransport.receive(UdpChannelTransport.java:294)
at uk.co.real_logic.aeron.driver.UdpChannelTransport.pollFrames(UdpChannelTransport.java:241)
at uk.co.real_logic.aeron.driver.TransportPoller.pollTransports(TransportPoller.java:146)
at uk.co.real_logic.aeron.driver.DriverConductor.doWork(DriverConductor.java:201)
at uk.co.real_logic.aeron.common.AgentRunner.run(AgentRunner.java:85)
at java.lang.Thread.run(Thread.java:745)
"receiver" #11 prio=5 os_prio=31 tid=0x00007fea39a61000 nid=0x5703 runnable [0x0000000128fa3000]
java.lang.Thread.State: RUNNABLE
at uk.co.real_logic.aeron.common.AgentRunner.run(AgentRunner.java:85)
at java.lang.Thread.run(Thread.java:745)
"sender" #10 prio=5 os_prio=31 tid=0x00007fea39a60800 nid=0x5503 runnable [0x0000000128ea0000]
java.lang.Thread.State: RUNNABLE
at uk.co.real_logic.aeron.driver.Sender.doSend(Sender.java:123)
at uk.co.real_logic.aeron.driver.Sender.doWork(Sender.java:50)
at uk.co.real_logic.aeron.common.AgentRunner.run(AgentRunner.java:85)
at java.lang.Thread.run(Thread.java:745)
"Service Thread" #9 daemon prio=9 os_prio=31 tid=0x00007fea3a805000 nid=0x5103 runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C1 CompilerThread3" #8 daemon prio=9 os_prio=31 tid=0x00007fea3b80b000 nid=0x4f03 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C2 CompilerThread2" #7 daemon prio=9 os_prio=31 tid=0x00007fea3b80a800 nid=0x4d03 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C2 CompilerThread1" #6 daemon prio=9 os_prio=31 tid=0x00007fea3b809800 nid=0x4b03 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C2 CompilerThread0" #5 daemon prio=9 os_prio=31 tid=0x00007fea3988e800 nid=0x4903 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Signal Dispatcher" #4 daemon prio=9 os_prio=31 tid=0x00007fea3988e000 nid=0x3d0b runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Finalizer" #3 daemon prio=8 os_prio=31 tid=0x00007fea3986f800 nid=0x3503 in Object.wait() [0x000000011e8f0000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x000000076ab062f8> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:142)
- locked <0x000000076ab062f8> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:158)
at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:209)
"Reference Handler" #2 daemon prio=10 os_prio=31 tid=0x00007fea3986f000 nid=0x3303 in Object.wait() [0x000000011e7ed000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x000000076ab05d68> (a java.lang.ref.Reference$Lock)
at java.lang.Object.wait(Object.java:502)
at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:157)
- locked <0x000000076ab05d68> (a java.lang.ref.Reference$Lock)
"main" #1 prio=5 os_prio=31 tid=0x00007fea39812800 nid=0xe07 waiting on condition [0x0000000100da0000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
at uk.co.real_logic.aeron.common.concurrent.SigIntBarrier.await(SigIntBarrier.java:54)
at uk.co.real_logic.aeron.samples.LowLatencyMediaDriver.main(LowLatencyMediaDriver.java:39)
"VM Thread" os_prio=31 tid=0x00007fea3986c000 nid=0x3103 runnable
"GC task thread#0 (ParallelGC)" os_prio=31 tid=0x00007fea3981e000 nid=0x150b runnable
"GC task thread#1 (ParallelGC)" os_prio=31 tid=0x00007fea3a800000 nid=0x1407 runnable
"GC task thread#2 (ParallelGC)" os_prio=31 tid=0x00007fea3a003000 nid=0x2503 runnable
"GC task thread#3 (ParallelGC)" os_prio=31 tid=0x00007fea3a801000 nid=0x2703 runnable
"GC task thread#4 (ParallelGC)" os_prio=31 tid=0x00007fea3a004000 nid=0x2903 runnable
"GC task thread#5 (ParallelGC)" os_prio=31 tid=0x00007fea3a801800 nid=0x2b03 runnable
"GC task thread#6 (ParallelGC)" os_prio=31 tid=0x00007fea3a802000 nid=0x2d03 runnable
"GC task thread#7 (ParallelGC)" os_prio=31 tid=0x00007fea3981e800 nid=0x2f03 runnable
"VM Periodic Task Thread" os_prio=31 tid=0x00007fea3a83e000 nid=0x5303 waiting on condition
JNI global references: 473
At this point, starting up a new receiver does allow messages to be sent (still from the original source, which has been up the whole time) and received again by the brand new receiver.
When I run on RH 7 (with IPv6 as the default), running BasicPublisher on a multicast address results in:
[1205304.340498] EXCEPTION [560/560]: java.lang.RuntimeException(java.net.SocketException: Network is unreachable)
uk.co.real_logic.aeron.driver.UdpChannelTransport.sendTo UdpChannelTransport.java:159
uk.co.real_logic.aeron.driver.SendChannelEndpoint.sendTo SendChannelEndpoint.java:61
uk.co.real_logic.aeron.driver.DriverPublication.sendSetupFrame DriverPublication.java:352
uk.co.real_logic.aeron.driver.DriverPublication.setupFrameCheck DriverPublication.java:365
uk.co.real_logic.aeron.driver.DriverPublication.send DriverPublication.java:170
[1205304.340578] EXCEPTION [560/560]: java.lang.RuntimeException(java.net.SocketException: Network is unreachable)
uk.co.real_logic.aeron.driver.UdpChannelTransport.sendTo UdpChannelTransport.java:159
uk.co.real_logic.aeron.driver.SendChannelEndpoint.sendTo SendChannelEndpoint.java:61
uk.co.real_logic.aeron.driver.DriverPublication.sendSetupFrame DriverPublication.java:352
uk.co.real_logic.aeron.driver.DriverPublication.setupFrameCheck DriverPublication.java:365
uk.co.real_logic.aeron.driver.DriverPublication.send DriverPublication.java:170
It's required to set
System.setProperty("java.net.preferIPv4Stack", "true");
in order to make things work. You might consider adding this by default.
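For anyone applying the workaround, a minimal sketch follows. The property java.net.preferIPv4Stack is a standard JDK networking property; the class name PreferIpv4Sketch and the method preferIpv4 are hypothetical. The assumption worth stating is ordering: the JDK reads this property when its networking classes initialise, so it has to be set before the first socket or channel is created.

```java
public final class PreferIpv4Sketch
{
    /** Call this first thing in main(), before any java.net or java.nio.channels use. */
    public static void preferIpv4()
    {
        System.setProperty("java.net.preferIPv4Stack", "true");
    }

    public static void main(final String[] args)
    {
        preferIpv4();

        // Networking code would start here, after the property is in place.
        System.out.println("preferIPv4Stack=" + System.getProperty("java.net.preferIPv4Stack"));
    }
}
```

Passing -Djava.net.preferIPv4Stack=true on the java command line has the same effect and avoids the ordering concern, since the property is then set before any application code runs.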
Re: #28
Many thanks -- that does fix the C++ build.
Here's a snippet from the Java build that is failing:
uk.co.real_logic.aeron.driver.DriverConductorTest > shouldOnlyRemoveSubscriptionMediaEndpointUponRemovalOfAllSubscribers FAILED
java.lang.AssertionError
at org.junit.Assert.fail(Assert.java:86)
at org.junit.Assert.assertTrue(Assert.java:41)
at org.junit.Assert.assertNotNull(Assert.java:621)
at org.junit.Assert.assertNotNull(Assert.java:631)
at uk.co.real_logic.aeron.driver.DriverConductorTest.shouldOnlyRemoveSubscriptionMediaEndpointUponRemovalOfAllSubscribers(DriverConductorTest.java:287)
uk.co.real_logic.aeron.driver.DriverConductorTest > shouldNotTimeoutSubscriptionOnKeepAlive FAILED
java.lang.AssertionError
at org.junit.Assert.fail(Assert.java:86)
at org.junit.Assert.assertTrue(Assert.java:41)
at org.junit.Assert.assertNotNull(Assert.java:621)
at org.junit.Assert.assertNotNull(Assert.java:631)
at uk.co.real_logic.aeron.driver.DriverConductorTest.shouldNotTimeoutSubscriptionOnKeepAlive(DriverConductorTest.java:409)
uk.co.real_logic.aeron.driver.DriverConductorTest > shouldKeepSubscriptionMediaEndpointUponRemovalOfAllButOneSubscriber FAILED
java.lang.AssertionError
at org.junit.Assert.fail(Assert.java:86)
at org.junit.Assert.assertTrue(Assert.java:41)
at org.junit.Assert.assertNotNull(Assert.java:621)
at org.junit.Assert.assertNotNull(Assert.java:631)
at uk.co.real_logic.aeron.driver.DriverConductorTest.shouldKeepSubscriptionMediaEndpointUponRemovalOfAllButOneSubscriber(DriverConductorTest.java:262)
uk.co.real_logic.aeron.driver.DriverConductorTest > shouldErrorOnRemoveChannelOnUnknownSessionId FAILED
Wanted but not invoked:
senderProxy.newPublication(
);
-> at uk.co.real_logic.aeron.driver.DriverConductorTest.verifySenderNotifiedOfNewPublication(DriverConductorTest.java:471)
Actually, there were zero interactions with this mock.
at uk.co.real_logic.aeron.driver.DriverConductorTest.verifySenderNotifiedOfNewPublication(DriverConductorTest.java:471)
at uk.co.real_logic.aeron.driver.DriverConductorTest.shouldErrorOnRemoveChannelOnUnknownSessionId(DriverConductorTest.java:313)
uk.co.real_logic.aeron.driver.DriverConductorTest > shouldBeAbleToRemoveMultipleStreams FAILED
Wanted but not invoked:
senderProxy.closePublication();
-> at uk.co.real_logic.aeron.driver.DriverConductorTest.shouldBeAbleToRemoveMultipleStreams(DriverConductorTest.java:237)
Actually, there were zero interactions with this mock.
at uk.co.real_logic.aeron.driver.DriverConductorTest.shouldBeAbleToRemoveMultipleStreams(DriverConductorTest.java:237)
uk.co.real_logic.aeron.driver.DriverConductorTest > shouldErrorOnRemoveChannelOnUnknownStreamId FAILED
org.mockito.exceptions.verification.TooManyActualInvocations:
eventLogger.logException();
Wanted 1 time:
-> at uk.co.real_logic.aeron.driver.DriverConductorTest.verifyExceptionLogged(DriverConductorTest.java:443)
But was 2 times. Undesired invocation:
-> at uk.co.real_logic.aeron.driver.DriverConductor.onClientCommand(DriverConductor.java:324)
at uk.co.real_logic.aeron.driver.DriverConductorTest.verifyExceptionLogged(DriverConductorTest.java:443)
at uk.co.real_logic.aeron.driver.DriverConductorTest.shouldErrorOnRemoveChannelOnUnknownStreamId(DriverConductorTest.java:331)
uk.co.real_logic.aeron.driver.DriverConductorTest > shouldTimeoutSubscription FAILED
Wanted but not invoked:
receiverProxy.addSubscription(, 10);
-> at uk.co.real_logic.aeron.driver.DriverConductorTest.verifyReceiverSubscribes(DriverConductorTest.java:433)
Actually, there were zero interactions with this mock.
at uk.co.real_logic.aeron.driver.DriverConductorTest.verifyReceiverSubscribes(DriverConductorTest.java:433)
at uk.co.real_logic.aeron.driver.DriverConductorTest.shouldTimeoutSubscription(DriverConductorTest.java:393)
uk.co.real_logic.aeron.driver.DriverConductorTest > shouldNotTimeoutPublicationOnKeepAlive FAILED
Wanted but not invoked:
senderProxy.newPublication(
);
-> at uk.co.real_logic.aeron.driver.DriverConductorTest.verifySenderNotifiedOfNewPublication(DriverConductorTest.java:471)
Actually, there were zero interactions with this mock.
at uk.co.real_logic.aeron.driver.DriverConductorTest.verifySenderNotifiedOfNewPublication(DriverConductorTest.java:471)
at uk.co.real_logic.aeron.driver.DriverConductorTest.shouldNotTimeoutPublicationOnKeepAlive(DriverConductorTest.java:371)
uk.co.real_logic.aeron.driver.DriverConductorTest > shouldBeAbleToAddMultipleStreams FAILED
Wanted but not invoked:
senderProxy.newPublication();
-> at uk.co.real_logic.aeron.driver.DriverConductorTest.shouldBeAbleToAddMultipleStreams(DriverConductorTest.java:203)
Actually, there were zero interactions with this mock.
at uk.co.real_logic.aeron.driver.DriverConductorTest.shouldBeAbleToAddMultipleStreams(DriverConductorTest.java:203)
uk.co.real_logic.aeron.driver.DriverConductorTest > shouldBeAbleToAddSingleSubscription FAILED
Wanted but not invoked:
clientProxy.operationSucceeded(1429);
-> at uk.co.real_logic.aeron.driver.DriverConductorTest.shouldBeAbleToAddSingleSubscription(DriverConductorTest.java:177)
However, there were other interactions with this mock:
-> at uk.co.real_logic.aeron.driver.DriverConductor.onClientCommand(DriverConductor.java:333)
at uk.co.real_logic.aeron.driver.DriverConductorTest.shouldBeAbleToAddSingleSubscription(DriverConductorTest.java:177)
uk.co.real_logic.aeron.driver.DriverConductorTest > shouldTimeoutPublication FAILED
Wanted but not invoked:
senderProxy.newPublication(
);
-> at uk.co.real_logic.aeron.driver.DriverConductorTest.verifySenderNotifiedOfNewPublication(DriverConductorTest.java:471)
Actually, there were zero interactions with this mock.
at uk.co.real_logic.aeron.driver.DriverConductorTest.verifySenderNotifiedOfNewPublication(DriverConductorTest.java:471)
at uk.co.real_logic.aeron.driver.DriverConductorTest.shouldTimeoutPublication(DriverConductorTest.java:356)
uk.co.real_logic.aeron.driver.DriverConductorTest > shouldBeAbleToRemoveSingleStream FAILED
Wanted but not invoked:
senderProxy.closePublication();
-> at uk.co.real_logic.aeron.driver.DriverConductorTest.shouldBeAbleToRemoveSingleStream(DriverConductorTest.java:216)
Actually, there were zero interactions with this mock.
at uk.co.real_logic.aeron.driver.DriverConductorTest.shouldBeAbleToRemoveSingleStream(DriverConductorTest.java:216)
uk.co.real_logic.aeron.driver.DriverConductorTest > shouldBeAbleToAddSinglePublication FAILED
Wanted but not invoked:
senderProxy.newPublication(
);
-> at uk.co.real_logic.aeron.driver.DriverConductorTest.verifySenderNotifiedOfNewPublication(DriverConductorTest.java:471)
Actually, there were zero interactions with this mock.
at uk.co.real_logic.aeron.driver.DriverConductorTest.verifySenderNotifiedOfNewPublication(DriverConductorTest.java:471)
at uk.co.real_logic.aeron.driver.DriverConductorTest.shouldBeAbleToAddSinglePublication(DriverConductorTest.java:163)
uk.co.real_logic.aeron.driver.UdpChannelTest > shouldHandleCanonicalFormForMulticastCorrectly SKIPPED
uk.co.real_logic.aeron.driver.UdpChannelTest > shouldParseValidMulticastAddress SKIPPED
uk.co.real_logic.aeron.driver.SelectorAndTransportTest > shouldSendMultipleDataFramesPerDatagramUnicastFromSourceToReceiver FAILED
java.lang.IllegalStateException: Failed to set SO_RCVBUF: attempted=131072, actual=124928
at uk.co.real_logic.aeron.driver.UdpChannelTransport.<init>(UdpChannelTransport.java:102)
at uk.co.real_logic.aeron.driver.ReceiverUdpChannelTransport.<init>(ReceiverUdpChannelTransport.java:58)
at uk.co.real_logic.aeron.driver.SelectorAndTransportTest.shouldSendMultipleDataFramesPerDatagramUnicastFromSourceToReceiver(SelectorAndTransportTest.java:185)
uk.co.real_logic.aeron.driver.SelectorAndTransportTest > shouldSendEmptyDataFrameUnicastFromSourceToReceiver FAILED
java.lang.RuntimeException: channel "udp://localhost:40123" : java.net.BindException: Address already in use
at uk.co.real_logic.aeron.driver.UdpChannelTransport.<init>(UdpChannelTransport.java:112)
at uk.co.real_logic.aeron.driver.ReceiverUdpChannelTransport.<init>(ReceiverUdpChannelTransport.java:58)
at uk.co.real_logic.aeron.driver.SelectorAndTransportTest.shouldSendEmptyDataFrameUnicastFromSourceToReceiver(SelectorAndTransportTest.java:135)
Caused by:
java.net.BindException: Address already in use
at sun.nio.ch.Net.bind0(Native Method)
at sun.nio.ch.Net.bind(Net.java:436)
at sun.nio.ch.DatagramChannelImpl.bind(DatagramChannelImpl.java:706)
at uk.co.real_logic.aeron.driver.UdpChannelTransport.<init>(UdpChannelTransport.java:77)
... 2 more
uk.co.real_logic.aeron.driver.SelectorAndTransportTest > shouldHandleSmFrameFromReceiverToSender FAILED
java.lang.RuntimeException: channel "udp://localhost:40123" : java.net.BindException: Address already in use
at uk.co.real_logic.aeron.driver.UdpChannelTransport.<init>(UdpChannelTransport.java:112)
at uk.co.real_logic.aeron.driver.ReceiverUdpChannelTransport.<init>(ReceiverUdpChannelTransport.java:58)
at uk.co.real_logic.aeron.driver.SelectorAndTransportTest.shouldHandleSmFrameFromReceiverToSender(SelectorAndTransportTest.java:236)
Caused by:
java.net.BindException: Address already in use
at sun.nio.ch.Net.bind0(Native Method)
at sun.nio.ch.Net.bind(Net.java:436)
at sun.nio.ch.DatagramChannelImpl.bind(DatagramChannelImpl.java:706)
at uk.co.real_logic.aeron.driver.UdpChannelTransport.<init>(UdpChannelTransport.java:77)
... 2 more
uk.co.real_logic.aeron.driver.SelectorAndTransportTest > shouldHandleBasicSetupAndTeardown FAILED
java.lang.RuntimeException: channel "udp://localhost:40123" : java.net.BindException: Address already in use
at uk.co.real_logic.aeron.driver.UdpChannelTransport.<init>(UdpChannelTransport.java:112)
at uk.co.real_logic.aeron.driver.ReceiverUdpChannelTransport.<init>(ReceiverUdpChannelTransport.java:58)
at uk.co.real_logic.aeron.driver.SelectorAndTransportTest.shouldHandleBasicSetupAndTeardown(SelectorAndTransportTest.java:103)
Caused by:
java.net.BindException: Address already in use
at sun.nio.ch.Net.bind0(Native Method)
at sun.nio.ch.Net.bind(Net.java:436)
at sun.nio.ch.DatagramChannelImpl.bind(DatagramChannelImpl.java:706)
at uk.co.real_logic.aeron.driver.UdpChannelTransport.<init>(UdpChannelTransport.java:77)
... 2 more
uk.co.real_logic.aeron.driver.ReceiverTest > shouldNotOverwriteDataFrameWithHeartbeat FAILED
java.net.BindException: Address already in use
at sun.nio.ch.Net.bind0(Native Method)
at sun.nio.ch.Net.bind(Net.java:436)
at sun.nio.ch.DatagramChannelImpl.bind(DatagramChannelImpl.java:706)
at uk.co.real_logic.aeron.driver.ReceiverTest.setUp(ReceiverTest.java:145)
java.lang.NullPointerException
at uk.co.real_logic.aeron.driver.ReceiverTest.tearDown(ReceiverTest.java:160)
uk.co.real_logic.aeron.driver.ReceiverTest > shouldOverwriteHeartbeatWithDataFrame FAILED
java.net.BindException: Address already in use
at sun.nio.ch.Net.bind0(Native Method)
at sun.nio.ch.Net.bind(Net.java:436)
at sun.nio.ch.DatagramChannelImpl.bind(DatagramChannelImpl.java:706)
at uk.co.real_logic.aeron.driver.ReceiverTest.setUp(ReceiverTest.java:145)
java.lang.NullPointerException
at uk.co.real_logic.aeron.driver.ReceiverTest.tearDown(ReceiverTest.java:160)
uk.co.real_logic.aeron.driver.ReceiverTest > shouldHandleNonZeroTermOffsetCorrectly FAILED
java.net.BindException: Address already in use
at sun.nio.ch.Net.bind0(Native Method)
at sun.nio.ch.Net.bind(Net.java:436)
at sun.nio.ch.DatagramChannelImpl.bind(DatagramChannelImpl.java:706)
at uk.co.real_logic.aeron.driver.ReceiverTest.setUp(ReceiverTest.java:145)
java.lang.NullPointerException
at uk.co.real_logic.aeron.driver.ReceiverTest.tearDown(ReceiverTest.java:160)
uk.co.real_logic.aeron.driver.ReceiverTest > shouldInsertDataIntoLogAfterInitialExchange FAILED
java.net.BindException: Address already in use
at sun.nio.ch.Net.bind0(Native Method)
at sun.nio.ch.Net.bind(Net.java:436)
at sun.nio.ch.DatagramChannelImpl.bind(DatagramChannelImpl.java:706)
at uk.co.real_logic.aeron.driver.ReceiverTest.setUp(ReceiverTest.java:145)
java.lang.NullPointerException
at uk.co.real_logic.aeron.driver.ReceiverTest.tearDown(ReceiverTest.java:160)
uk.co.real_logic.aeron.driver.ReceiverTest > shouldCreateRcvTermAndSendSmOnSetup FAILED
java.net.BindException: Address already in use
at sun.nio.ch.Net.bind0(Native Method)
at sun.nio.ch.Net.bind(Net.java:436)
at sun.nio.ch.DatagramChannelImpl.bind(DatagramChannelImpl.java:706)
at uk.co.real_logic.aeron.driver.ReceiverTest.setUp(ReceiverTest.java:145)
java.lang.NullPointerException
at uk.co.real_logic.aeron.driver.ReceiverTest.tearDown(ReceiverTest.java:160)
Results: FAILURE (77 tests, 53 successes, 22 failures, 2 skipped)
77 tests completed, 22 failed, 2 skipped
:aeron-driver:test FAILED
FAILURE: Build failed with an exception.
What went wrong:
Execution failed for task ':aeron-driver:test'.
There were failing tests. See the report at: file:///shared/Aeron/aeron-driver/build/reports/tests/index.html
Try:
Run with --stacktrace option to get the stack trace. Run with --info or --debug option to get more log output.
BUILD FAILED
Total time: 19.425 secs
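One of the failures above reports `Failed to set SO_RCVBUF: attempted=131072, actual=124928`. `SO_RCVBUF` is only a request to the operating system, which may grant a different size; on Linux the ceiling is usually governed by `net.core.rmem_max`. A minimal sketch, using only standard `java.nio` calls, of how the attempted and actual sizes can be compared on a given machine. The class and method names are hypothetical, and this is not the check performed inside `UdpChannelTransport`.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.StandardSocketOptions;
import java.nio.channels.DatagramChannel;

// Hypothetical helper for illustration; not part of Aeron.
final class ReceiveBufferCheck
{
    /** Requests a receive buffer size and returns the size the OS reports afterwards. */
    static int requestReceiveBuffer(final int requestedBytes)
    {
        try (DatagramChannel channel = DatagramChannel.open())
        {
            channel.setOption(StandardSocketOptions.SO_RCVBUF, requestedBytes);
            return channel.getOption(StandardSocketOptions.SO_RCVBUF);
        }
        catch (final IOException ex)
        {
            throw new UncheckedIOException(ex);
        }
    }

    public static void main(final String[] args)
    {
        final int attempted = 131072; // value taken from the failing test output
        final int actual = requestReceiveBuffer(attempted);
        System.out.println("attempted=" + attempted + ", actual=" + actual);

        if (actual < attempted)
        {
            System.out.println("The OS granted less than requested; check the system socket buffer limits.");
        }
    }
}
```

If the reported size is below the attempted one, the limit is coming from the host rather than from the test itself.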
Each Aeron frame seems to carry about 24 bytes of header. My messages are tiny, mostly around 25 bytes. To avoid wasting space on control overhead, I want to batch multiple application-level messages and send them as a single Aeron message. Is there an easy way to do this?
A slight complication is that I have multiple message producers (threads). My current plan is to have each thread claim a small buffer (say, enough for 10 or 20 messages) from Aeron using the BufferClaim interface. After claiming the space, each thread can hold on to its BufferClaim object and fill it with the messages it produces. Once the claim is full, the thread can commit it and then make another claim using the same BufferClaim object. Does that seem like something that could work?
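To make the batching idea concrete, here is a minimal sketch that packs small messages into one buffer as length-prefixed records and unpacks them on the receiving side. It uses only `java.nio` and does not touch the Aeron API: `MessageBatcher`, its methods, and the 4-byte length prefix are all hypothetical, and the 24-byte header size is simply the figure quoted in the question. The class is not thread-safe, so the assumption is one instance per producer thread.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of Aeron.
final class MessageBatcher
{
    // Assumed per-frame header size, taken from the 24 bytes quoted in the question.
    static final int ASSUMED_FRAME_HEADER_LENGTH = 24;

    private final ByteBuffer batch;

    MessageBatcher(final int capacity)
    {
        batch = ByteBuffer.allocate(capacity);
    }

    /** Appends one [int length][payload] record; returns false if the batch must be flushed first. */
    boolean tryAppend(final byte[] message)
    {
        if (batch.remaining() < Integer.BYTES + message.length)
        {
            return false;
        }

        batch.putInt(message.length);
        batch.put(message);
        return true;
    }

    /** Returns the bytes to send as a single message and resets the batch for reuse. */
    byte[] flush()
    {
        batch.flip();
        final byte[] out = new byte[batch.remaining()];
        batch.get(out);
        batch.clear();
        return out;
    }

    /** Splits a received batch back into the original messages. */
    static List<byte[]> unpack(final byte[] received)
    {
        final ByteBuffer in = ByteBuffer.wrap(received);
        final List<byte[]> messages = new ArrayList<>();
        while (in.remaining() >= Integer.BYTES)
        {
            final byte[] message = new byte[in.getInt()];
            in.get(message);
            messages.add(message);
        }
        return messages;
    }

    /** Fraction of the bytes sent that is spent on the assumed frame header. */
    static double headerOverhead(final int batchLength)
    {
        return (double)ASSUMED_FRAME_HEADER_LENGTH / (ASSUMED_FRAME_HEADER_LENGTH + batchLength);
    }
}
```

Under these assumptions, a single 25-byte message spends roughly 49% of its frame on the header (24 of 49 bytes), while ten such messages with their length prefixes occupy 290 bytes and bring the header share below 8% (24 of 314 bytes). Each producer thread would fill its own batcher and hand the result of `flush()` to whatever send call it uses, which sidesteps the question of holding a claim open while the batch fills.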
It's me again ;-) Still trying to get something running on CentOS, but the C++ build is failing:
[/shared/Aeron/cppbuild] CC=gcc CXX=g++ ./cppbuild
-- The C compiler identification is GNU 4.9.0
-- The CXX compiler identification is GNU 4.9.0
-- Check for working C compiler: /build/share/gcc/4.9.0/bin/gcc
-- Check for working C compiler: /build/share/gcc/4.9.0/bin/gcc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working CXX compiler: /build/share/gcc/4.9.0/bin/g++
-- Check for working CXX compiler: /build/share/gcc/4.9.0/bin/g++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Found PythonInterp: /usr/bin/python (found version "2.6.6")
-- Looking for include file pthread.h
-- Looking for include file pthread.h - found
-- Looking for pthread_create
-- Looking for pthread_create - not found
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE
-- Configuring done
-- Generating done
-- Build files have been written to: /shared/Aeron/cppbuild
Scanning dependencies of target gmock
[ 3%] Building CXX object aeron-common/src/main/cpp/3rdparty/gmock-1.7.0/CMakeFiles/gmock.dir/gtest/src/gtest-all.cc.o
[ 6%] Building CXX object aeron-common/src/main/cpp/3rdparty/gmock-1.7.0/CMakeFiles/gmock.dir/src/gmock-all.cc.o
Linking CXX static library ../../../../../../lib/libgmock.a
[ 6%] Built target gmock
Scanning dependencies of target gmock_main
[ 10%] Building CXX object aeron-common/src/main/cpp/3rdparty/gmock-1.7.0/CMakeFiles/gmock_main.dir/gtest/src/gtest-all.cc.o
[ 13%] Building CXX object aeron-common/src/main/cpp/3rdparty/gmock-1.7.0/CMakeFiles/gmock_main.dir/src/gmock-all.cc.o
[ 17%] Building CXX object aeron-common/src/main/cpp/3rdparty/gmock-1.7.0/CMakeFiles/gmock_main.dir/src/gmock_main.cc.o
Linking CXX static library ../../../../../../lib/libgmock_main.a
[ 17%] Built target gmock_main
Scanning dependencies of target gtest
[ 20%] Building CXX object aeron-common/src/main/cpp/3rdparty/gmock-1.7.0/gtest/CMakeFiles/gtest.dir/src/gtest-all.cc.o
Linking CXX static library ../../../../../../../lib/libgtest.a
[ 20%] Built target gtest
Scanning dependencies of target gtest_main
[ 24%] Building CXX object aeron-common/src/main/cpp/3rdparty/gmock-1.7.0/gtest/CMakeFiles/gtest_main.dir/src/gtest_main.cc.o
Linking CXX static library ../../../../../../../lib/libgtest_main.a
[ 24%] Built target gtest_main
Scanning dependencies of target aeron_util
[ 27%] Building CXX object aeron-common/src/main/cpp/util/CMakeFiles/aeron_util.dir/MemoryMappedFile.cpp.o
[ 31%] Building CXX object aeron-common/src/main/cpp/util/CMakeFiles/aeron_util.dir/CommandOption.cpp.o
[ 34%] Building CXX object aeron-common/src/main/cpp/util/CMakeFiles/aeron_util.dir/CommandOptionParser.cpp.o
[ 37%] Building CXX object aeron-common/src/main/cpp/util/CMakeFiles/aeron_util.dir/StringUtil.cpp.o
Linking CXX static library ../../../../../lib/libaeron_util.a
[ 37%] Built target aeron_util
Scanning dependencies of target utilTests
[ 41%] Building CXX object aeron-common/src/main/cpp/util/CMakeFiles/utilTests.dir/tests/testUtil.cpp.o
[ 44%] Building CXX object aeron-common/src/main/cpp/util/CMakeFiles/utilTests.dir/tests/testMemoryMappedFile.cpp.o
Linking CXX executable ../../../../../binaries/utilTests
CMakeFiles/utilTests.dir/tests/testMemoryMappedFile.cpp.o: In function `makeTempFileName()':
testMemoryMappedFile.cpp:(.text+0xbc): warning: the use of `tempnam' is dangerous, better use `mkstemp'
[ 44%] Built target utilTests
Scanning dependencies of target aeron_concurrent
[ 48%] Building CXX object aeron-common/src/main/cpp/concurrent/CMakeFiles/aeron_concurrent.dir/AtomicCounter.cpp.o
[ 51%] Building CXX object aeron-common/src/main/cpp/concurrent/CMakeFiles/aeron_concurrent.dir/CountersManager.cpp.o
Linking CXX static library ../../../../../lib/libaeron_concurrent.a
[ 51%] Built target aeron_concurrent
Scanning dependencies of target concurrentTests
[ 55%] Building CXX object aeron-common/src/main/cpp/concurrent/CMakeFiles/concurrentTests.dir/tests/testConcurrent.cpp.o
[ 58%] Building CXX object aeron-common/src/main/cpp/concurrent/CMakeFiles/concurrentTests.dir/tests/testCountersManager.cpp.o
[ 62%] Building CXX object aeron-common/src/main/cpp/concurrent/CMakeFiles/concurrentTests.dir/tests/testManyToOneRingBuffer.cpp.o
[ 65%] Building CXX object aeron-common/src/main/cpp/concurrent/CMakeFiles/concurrentTests.dir/tests/testLogAppender.cpp.o
In file included from /shared/Aeron/aeron-common/src/main/cpp/concurrent/tests/testLogAppender.cpp:19:0:
/shared/Aeron/aeron-common/src/main/cpp/3rdparty/gmock-1.7.0/gtest/include/gtest/gtest.h: In instantiation of ‘testing::AssertionResult testing::internal::CmpHelperEQ(const char*, const char*, const T1&, const T2&) [with T1 = int; T2 = long unsigned int]’:
/shared/Aeron/aeron-common/src/main/cpp/3rdparty/gmock-1.7.0/gtest/include/gtest/gtest.h:1485:30: required from ‘static testing::AssertionResult testing::internal::EqHelper<lhs_is_null_literal>::Compare(const char*, const char*, const T1&, const T2&) [with T1 = int; T2 = long unsigned int; bool lhs_is_null_literal = false]’
/shared/Aeron/aeron-common/src/main/cpp/concurrent/tests/testLogAppender.cpp:451:5: required from here
/shared/Aeron/aeron-common/src/main/cpp/3rdparty/gmock-1.7.0/gtest/include/gtest/gtest.h:1448:16: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
if (expected == actual) {
^
[ 68%] Building CXX object aeron-common/src/main/cpp/concurrent/CMakeFiles/concurrentTests.dir/tests/testLogReader.cpp.o
[ 72%] Building CXX object aeron-common/src/main/cpp/concurrent/CMakeFiles/concurrentTests.dir/tests/testBroadcastTransmitter.cpp.o
[ 75%] Building CXX object aeron-common/src/main/cpp/concurrent/CMakeFiles/concurrentTests.dir/tests/testBroadcastReceiver.cpp.o
Linking CXX executable ../../../../../binaries/concurrentTests
[ 75%] Built target concurrentTests
Scanning dependencies of target commandTests
[ 79%] Building CXX object aeron-common/src/main/cpp/command/CMakeFiles/commandTests.dir/tests/testCommand.cpp.o
In file included from /shared/Aeron/aeron-common/src/main/cpp/command/tests/testCommand.cpp:18:0:
/shared/Aeron/aeron-common/src/main/cpp/3rdparty/gmock-1.7.0/gtest/include/gtest/gtest.h: In instantiation of ‘testing::AssertionResult testing::internal::CmpHelperEQ(const char*, const char*, const T1&, const T2&) [with T1 = int; T2 = long unsigned int]’:
/shared/Aeron/aeron-common/src/main/cpp/3rdparty/gmock-1.7.0/gtest/include/gtest/gtest.h:1485:30: required from ‘static testing::AssertionResult testing::internal::EqHelper<lhs_is_null_literal>::Compare(const char*, const char*, const T1&, const T2&) [with T1 = int; T2 = long unsigned int; bool lhs_is_null_literal = false]’
/shared/Aeron/aeron-common/src/main/cpp/command/tests/testCommand.cpp:81:5: required from here
/shared/Aeron/aeron-common/src/main/cpp/3rdparty/gmock-1.7.0/gtest/include/gtest/gtest.h:1448:16: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
if (expected == actual) {
^
Linking CXX executable ../../../../../binaries/commandTests
[ 79%] Built target commandTests
Scanning dependencies of target AeronStat
[ 82%] Building CXX object aeron-samples/src/main/cpp/CMakeFiles/AeronStat.dir/AeronStat.cpp.o
Linking CXX executable ../../../../binaries/AeronStat
[ 82%] Built target AeronStat
Scanning dependencies of target aeron_client
[ 86%] Building CXX object aeron-client/src/main/cpp/CMakeFiles/aeron_client.dir/Publication.cpp.o
[ 89%] Building CXX object aeron-client/src/main/cpp/CMakeFiles/aeron_client.dir/Subscription.cpp.o
[ 93%] Building CXX object aeron-client/src/main/cpp/CMakeFiles/aeron_client.dir/ClientConductor.cpp.o
[ 96%] Building CXX object aeron-client/src/main/cpp/CMakeFiles/aeron_client.dir/Aeron.cpp.o
Linking CXX static library ../../../../lib/libaeron_client.a
[ 96%] Built target aeron_client
Scanning dependencies of target BasicPublisher
[100%] Building CXX object aeron-samples/src/main/cpp/CMakeFiles/BasicPublisher.dir/BasicPublisher.cpp.o
Linking CXX executable ../../../../binaries/BasicPublisher
../../../../lib/libaeron_client.a(ClientConductor.cpp.o): In function `aeron::common::concurrent::ringbuffer::RecordDescriptor::checkMsgTypeId(int)':
/shared/Aeron/aeron-common/src/main/cpp/concurrent/ringbuffer/RecordDescriptor.h:69: undefined reference to `aeron::common::util::strPrintf(char const*, ...)'
../../../../lib/libaeron_client.a(ClientConductor.cpp.o): In function `aeron::common::concurrent::ringbuffer::ManyToOneRingBuffer::checkMsgLength(int) const':
/shared/Aeron/aeron-common/src/main/cpp/concurrent/ringbuffer/ManyToOneRingBuffer.h:219: undefined reference to `aeron::common::util::strPrintf(char const*, ...)'
../../../../lib/libaeron_client.a(Aeron.cpp.o): In function `aeron::common::concurrent::broadcast::BroadcastBufferDescriptor::checkCapacity(int)':
/shared/Aeron/aeron-common/src/main/cpp/concurrent/broadcast/BroadcastBufferDescriptor.h:37: undefined reference to `aeron::common::util::strPrintf(char const*, ...)'
../../../../lib/libaeron_client.a(Aeron.cpp.o): In function `aeron::common::concurrent::ringbuffer::RingBufferDescriptor::checkCapacity(int)':
/shared/Aeron/aeron-common/src/main/cpp/concurrent/ringbuffer/RingBufferDescriptor.h:42: undefined reference to `aeron::common::util::strPrintf(char const*, ...)'
../../../../lib/libaeron_client.a(Aeron.cpp.o): In function `aeron::Aeron::createDriverProxy(aeron::Context&)':
/shared/Aeron/aeron-client/src/main/cpp/Aeron.cpp:70: undefined reference to `aeron::common::util::MemoryMappedFile::mapExisting(char const*)'
/shared/Aeron/aeron-client/src/main/cpp/Aeron.cpp:73: undefined reference to `aeron::common::util::MemoryMappedFile::getMemorySize() const'
/shared/Aeron/aeron-client/src/main/cpp/Aeron.cpp:73: undefined reference to `aeron::common::util::MemoryMappedFile::getMemoryPtr() const'
../../../../lib/libaeron_client.a(Aeron.cpp.o): In function `aeron::Aeron::createDriverReceiver(aeron::Context&)':
/shared/Aeron/aeron-client/src/main/cpp/Aeron.cpp:90: undefined reference to `aeron::common::util::MemoryMappedFile::mapExisting(char const*)'
/shared/Aeron/aeron-client/src/main/cpp/Aeron.cpp:92: undefined reference to `aeron::common::util::MemoryMappedFile::getMemorySize() const'
/shared/Aeron/aeron-client/src/main/cpp/Aeron.cpp:92: undefined reference to `aeron::common::util::MemoryMappedFile::getMemoryPtr() const'
../../../../lib/libaeron_client.a(Aeron.cpp.o): In function `aeron::common::concurrent::broadcast::CopyBroadcastReceiver::receive(std::function<void ()(int, aeron::common::concurrent::AtomicBuffer&, int, int)> const&)':
/shared/Aeron/aeron-common/src/main/cpp/concurrent/broadcast/CopyBroadcastReceiver.h:63: undefined reference to `aeron::common::util::strPrintf(char const*, ...)'
collect2: error: ld returned 1 exit status
make[2]: *** [binaries/BasicPublisher] Error 1
make[1]: *** [aeron-samples/src/main/cpp/CMakeFiles/BasicPublisher.dir/all] Error 2
make: *** [all] Error 2
So I re-ran with verbose turned on:
[/shared/Aeron/cppbuild] make VERBOSE=1
/shared/build/share/cmake/2.8.12.1/bin/cmake -H/shared/Aeron -B/shared/Aeron/cppbuild --check-build-system CMakeFiles/Makefile.cmake 0
/shared/build/share/cmake/2.8.12.1/bin/cmake -E cmake_progress_start /shared/Aeron/cppbuild/CMakeFiles /shared/Aeron/cppbuild/CMakeFiles/progress.marks
make -f CMakeFiles/Makefile2 all
make[1]: Entering directory `/shared/Aeron/cppbuild'
make -f aeron-common/src/main/cpp/3rdparty/gmock-1.7.0/CMakeFiles/gmock.dir/build.make aeron-common/src/main/cpp/3rdparty/gmock-1.7.0/CMakeFiles/gmock.dir/depend
make[2]: Entering directory `/shared/Aeron/cppbuild'
cd /shared/Aeron/cppbuild && /shared/build/share/cmake/2.8.12.1/bin/cmake -E cmake_depends "Unix Makefiles" /shared/Aeron /shared/Aeron/aeron-common/src/main/cpp/3rdparty/gmock-1.7.0 /shared/Aeron/cppbuild /shared/Aeron/cppbuild/aeron-common/src/main/cpp/3rdparty/gmock-1.7.0 /shared/Aeron/cppbuild/aeron-common/src/main/cpp/3rdparty/gmock-1.7.0/CMakeFiles/gmock.dir/DependInfo.cmake --color=
make[2]: Leaving directory `/shared/Aeron/cppbuild'
make -f aeron-common/src/main/cpp/3rdparty/gmock-1.7.0/CMakeFiles/gmock.dir/build.make aeron-common/src/main/cpp/3rdparty/gmock-1.7.0/CMakeFiles/gmock.dir/build
make[2]: Entering directory `/shared/Aeron/cppbuild'
make[2]: Nothing to be done for `aeron-common/src/main/cpp/3rdparty/gmock-1.7.0/CMakeFiles/gmock.dir/build'.
make[2]: Leaving directory `/shared/Aeron/cppbuild'
/shared/build/share/cmake/2.8.12.1/bin/cmake -E cmake_progress_report /shared/Aeron/cppbuild/CMakeFiles 21 22
[ 6%] Built target gmock
make -f aeron-common/src/main/cpp/3rdparty/gmock-1.7.0/CMakeFiles/gmock_main.dir/build.make aeron-common/src/main/cpp/3rdparty/gmock-1.7.0/CMakeFiles/gmock_main.dir/depend
make[2]: Entering directory `/shared/Aeron/cppbuild'
cd /shared/Aeron/cppbuild && /shared/build/share/cmake/2.8.12.1/bin/cmake -E cmake_depends "Unix Makefiles" /shared/Aeron /shared/Aeron/aeron-common/src/main/cpp/3rdparty/gmock-1.7.0 /shared/Aeron/cppbuild /shared/Aeron/cppbuild/aeron-common/src/main/cpp/3rdparty/gmock-1.7.0 /shared/Aeron/cppbuild/aeron-common/src/main/cpp/3rdparty/gmock-1.7.0/CMakeFiles/gmock_main.dir/DependInfo.cmake --color=
make[2]: Leaving directory `/shared/Aeron/cppbuild'
make -f aeron-common/src/main/cpp/3rdparty/gmock-1.7.0/CMakeFiles/gmock_main.dir/build.make aeron-common/src/main/cpp/3rdparty/gmock-1.7.0/CMakeFiles/gmock_main.dir/build
make[2]: Entering directory `/shared/Aeron/cppbuild'
make[2]: Nothing to be done for `aeron-common/src/main/cpp/3rdparty/gmock-1.7.0/CMakeFiles/gmock_main.dir/build'.
make[2]: Leaving directory `/shared/Aeron/cppbuild'
/shared/build/share/cmake/2.8.12.1/bin/cmake -E cmake_progress_report /shared/Aeron/cppbuild/CMakeFiles 23 24 25
[ 17%] Built target gmock_main
make -f aeron-common/src/main/cpp/3rdparty/gmock-1.7.0/gtest/CMakeFiles/gtest.dir/build.make aeron-common/src/main/cpp/3rdparty/gmock-1.7.0/gtest/CMakeFiles/gtest.dir/depend
make[2]: Entering directory `/shared/Aeron/cppbuild'
cd /shared/Aeron/cppbuild && /shared/build/share/cmake/2.8.12.1/bin/cmake -E cmake_depends "Unix Makefiles" /shared/Aeron /shared/Aeron/aeron-common/src/main/cpp/3rdparty/gmock-1.7.0/gtest /shared/Aeron/cppbuild /shared/Aeron/cppbuild/aeron-common/src/main/cpp/3rdparty/gmock-1.7.0/gtest /shared/Aeron/cppbuild/aeron-common/src/main/cpp/3rdparty/gmock-1.7.0/gtest/CMakeFiles/gtest.dir/DependInfo.cmake --color=
make[2]: Leaving directory `/shared/Aeron/cppbuild'
make -f aeron-common/src/main/cpp/3rdparty/gmock-1.7.0/gtest/CMakeFiles/gtest.dir/build.make aeron-common/src/main/cpp/3rdparty/gmock-1.7.0/gtest/CMakeFiles/gtest.dir/build
make[2]: Entering directory `/shared/Aeron/cppbuild'
make[2]: Nothing to be done for `aeron-common/src/main/cpp/3rdparty/gmock-1.7.0/gtest/CMakeFiles/gtest.dir/build'.
make[2]: Leaving directory `/shared/Aeron/cppbuild'
/shared/build/share/cmake/2.8.12.1/bin/cmake -E cmake_progress_report /shared/Aeron/cppbuild/CMakeFiles 26
[ 20%] Built target gtest
make -f aeron-common/src/main/cpp/3rdparty/gmock-1.7.0/gtest/CMakeFiles/gtest_main.dir/build.make aeron-common/src/main/cpp/3rdparty/gmock-1.7.0/gtest/CMakeFiles/gtest_main.dir/depend
make[2]: Entering directory `/shared/Aeron/cppbuild'
cd /shared/Aeron/cppbuild && /shared/build/share/cmake/2.8.12.1/bin/cmake -E cmake_depends "Unix Makefiles" /shared/Aeron /shared/Aeron/aeron-common/src/main/cpp/3rdparty/gmock-1.7.0/gtest /shared/Aeron/cppbuild /shared/Aeron/cppbuild/aeron-common/src/main/cpp/3rdparty/gmock-1.7.0/gtest /shared/Aeron/cppbuild/aeron-common/src/main/cpp/3rdparty/gmock-1.7.0/gtest/CMakeFiles/gtest_main.dir/DependInfo.cmake --color=
make[2]: Leaving directory `/shared/Aeron/cppbuild'
make -f aeron-common/src/main/cpp/3rdparty/gmock-1.7.0/gtest/CMakeFiles/gtest_main.dir/build.make aeron-common/src/main/cpp/3rdparty/gmock-1.7.0/gtest/CMakeFiles/gtest_main.dir/build
make[2]: Entering directory `/shared/Aeron/cppbuild'
make[2]: Nothing to be done for `aeron-common/src/main/cpp/3rdparty/gmock-1.7.0/gtest/CMakeFiles/gtest_main.dir/build'.
make[2]: Leaving directory `/shared/Aeron/cppbuild'
/shared/build/share/cmake/2.8.12.1/bin/cmake -E cmake_progress_report /shared/Aeron/cppbuild/CMakeFiles 27
[ 24%] Built target gtest_main
make -f aeron-common/src/main/cpp/util/CMakeFiles/aeron_util.dir/build.make aeron-common/src/main/cpp/util/CMakeFiles/aeron_util.dir/depend
make[2]: Entering directory `/shared/Aeron/cppbuild'
cd /shared/Aeron/cppbuild && /shared/build/share/cmake/2.8.12.1/bin/cmake -E cmake_depends "Unix Makefiles" /shared/Aeron /shared/Aeron/aeron-common/src/main/cpp/util /shared/Aeron/cppbuild /shared/Aeron/cppbuild/aeron-common/src/main/cpp/util /shared/Aeron/cppbuild/aeron-common/src/main/cpp/util/CMakeFiles/aeron_util.dir/DependInfo.cmake --color=
make[2]: Leaving directory `/shared/Aeron/cppbuild'
make -f aeron-common/src/main/cpp/util/CMakeFiles/aeron_util.dir/build.make aeron-common/src/main/cpp/util/CMakeFiles/aeron_util.dir/build
make[2]: Entering directory `/shared/Aeron/cppbuild'
make[2]: Nothing to be done for `aeron-common/src/main/cpp/util/CMakeFiles/aeron_util.dir/build'.
make[2]: Leaving directory `/shared/Aeron/cppbuild'
/shared/build/share/cmake/2.8.12.1/bin/cmake -E cmake_progress_report /shared/Aeron/cppbuild/CMakeFiles 9 10 11 12
[ 37%] Built target aeron_util
make -f aeron-common/src/main/cpp/util/CMakeFiles/utilTests.dir/build.make aeron-common/src/main/cpp/util/CMakeFiles/utilTests.dir/depend
make[2]: Entering directory `/shared/Aeron/cppbuild'
cd /shared/Aeron/cppbuild && /shared/build/share/cmake/2.8.12.1/bin/cmake -E cmake_depends "Unix Makefiles" /shared/Aeron /shared/Aeron/aeron-common/src/main/cpp/util /shared/Aeron/cppbuild /shared/Aeron/cppbuild/aeron-common/src/main/cpp/util /shared/Aeron/cppbuild/aeron-common/src/main/cpp/util/CMakeFiles/utilTests.dir/DependInfo.cmake --color=
make[2]: Leaving directory `/shared/Aeron/cppbuild'
make -f aeron-common/src/main/cpp/util/CMakeFiles/utilTests.dir/build.make aeron-common/src/main/cpp/util/CMakeFiles/utilTests.dir/build
make[2]: Entering directory `/shared/Aeron/cppbuild'
make[2]: Nothing to be done for `aeron-common/src/main/cpp/util/CMakeFiles/utilTests.dir/build'.
make[2]: Leaving directory `/shared/Aeron/cppbuild'
/shared/build/share/cmake/2.8.12.1/bin/cmake -E cmake_progress_report /shared/Aeron/cppbuild/CMakeFiles 28 29
[ 44%] Built target utilTests
make -f aeron-common/src/main/cpp/concurrent/CMakeFiles/aeron_concurrent.dir/build.make aeron-common/src/main/cpp/concurrent/CMakeFiles/aeron_concurrent.dir/depend
make[2]: Entering directory `/shared/Aeron/cppbuild'
cd /shared/Aeron/cppbuild && /shared/build/share/cmake/2.8.12.1/bin/cmake -E cmake_depends "Unix Makefiles" /shared/Aeron /shared/Aeron/aeron-common/src/main/cpp/concurrent /shared/Aeron/cppbuild /shared/Aeron/cppbuild/aeron-common/src/main/cpp/concurrent /shared/Aeron/cppbuild/aeron-common/src/main/cpp/concurrent/CMakeFiles/aeron_concurrent.dir/DependInfo.cmake --color=
make[2]: Leaving directory `/shared/Aeron/cppbuild'
make -f aeron-common/src/main/cpp/concurrent/CMakeFiles/aeron_concurrent.dir/build.make aeron-common/src/main/cpp/concurrent/CMakeFiles/aeron_concurrent.dir/build
make[2]: Entering directory `/shared/Aeron/cppbuild'
make[2]: Nothing to be done for `aeron-common/src/main/cpp/concurrent/CMakeFiles/aeron_concurrent.dir/build'.
make[2]: Leaving directory `/shared/Aeron/cppbuild'
/shared/build/share/cmake/2.8.12.1/bin/cmake -E cmake_progress_report /shared/Aeron/cppbuild/CMakeFiles 7 8
[ 51%] Built target aeron_concurrent
make -f aeron-common/src/main/cpp/concurrent/CMakeFiles/concurrentTests.dir/build.make aeron-common/src/main/cpp/concurrent/CMakeFiles/concurrentTests.dir/depend
make[2]: Entering directory `/shared/Aeron/cppbuild'
cd /shared/Aeron/cppbuild && /shared/build/share/cmake/2.8.12.1/bin/cmake -E cmake_depends "Unix Makefiles" /shared/Aeron /shared/Aeron/aeron-common/src/main/cpp/concurrent /shared/Aeron/cppbuild /shared/Aeron/cppbuild/aeron-common/src/main/cpp/concurrent /shared/Aeron/cppbuild/aeron-common/src/main/cpp/concurrent/CMakeFiles/concurrentTests.dir/DependInfo.cmake --color=
make[2]: Leaving directory `/shared/Aeron/cppbuild'
make -f aeron-common/src/main/cpp/concurrent/CMakeFiles/concurrentTests.dir/build.make aeron-common/src/main/cpp/concurrent/CMakeFiles/concurrentTests.dir/build
make[2]: Entering directory `/shared/Aeron/cppbuild'
make[2]: Nothing to be done for `aeron-common/src/main/cpp/concurrent/CMakeFiles/concurrentTests.dir/build'.
make[2]: Leaving directory `/shared/Aeron/cppbuild'
/shared/build/share/cmake/2.8.12.1/bin/cmake -E cmake_progress_report /shared/Aeron/cppbuild/CMakeFiles 14 15 16 17 18 19 20
[ 75%] Built target concurrentTests
make -f aeron-common/src/main/cpp/command/CMakeFiles/commandTests.dir/build.make aeron-common/src/main/cpp/command/CMakeFiles/commandTests.dir/depend
make[2]: Entering directory `/shared/Aeron/cppbuild'
cd /shared/Aeron/cppbuild && /shared/build/share/cmake/2.8.12.1/bin/cmake -E cmake_depends "Unix Makefiles" /shared/Aeron /shared/Aeron/aeron-common/src/main/cpp/command /shared/Aeron/cppbuild /shared/Aeron/cppbuild/aeron-common/src/main/cpp/command /shared/Aeron/cppbuild/aeron-common/src/main/cpp/command/CMakeFiles/commandTests.dir/DependInfo.cmake --color=
make[2]: Leaving directory `/shared/Aeron/cppbuild'
make -f aeron-common/src/main/cpp/command/CMakeFiles/commandTests.dir/build.make aeron-common/src/main/cpp/command/CMakeFiles/commandTests.dir/build
make[2]: Entering directory `/shared/Aeron/cppbuild'
make[2]: Nothing to be done for `aeron-common/src/main/cpp/command/CMakeFiles/commandTests.dir/build'.
make[2]: Leaving directory `/shared/Aeron/cppbuild'
/shared/build/share/cmake/2.8.12.1/bin/cmake -E cmake_progress_report /shared/Aeron/cppbuild/CMakeFiles 13
[ 79%] Built target commandTests
make -f aeron-samples/src/main/cpp/CMakeFiles/AeronStat.dir/build.make aeron-samples/src/main/cpp/CMakeFiles/AeronStat.dir/depend
make[2]: Entering directory `/shared/Aeron/cppbuild'
cd /shared/Aeron/cppbuild && /shared/build/share/cmake/2.8.12.1/bin/cmake -E cmake_depends "Unix Makefiles" /shared/Aeron /shared/Aeron/aeron-samples/src/main/cpp /shared/Aeron/cppbuild /shared/Aeron/cppbuild/aeron-samples/src/main/cpp /shared/Aeron/cppbuild/aeron-samples/src/main/cpp/CMakeFiles/AeronStat.dir/DependInfo.cmake --color=
make[2]: Leaving directory `/shared/Aeron/cppbuild'
make -f aeron-samples/src/main/cpp/CMakeFiles/AeronStat.dir/build.make aeron-samples/src/main/cpp/CMakeFiles/AeronStat.dir/build
make[2]: Entering directory `/shared/Aeron/cppbuild'
make[2]: Nothing to be done for `aeron-samples/src/main/cpp/CMakeFiles/AeronStat.dir/build'.
make[2]: Leaving directory `/shared/Aeron/cppbuild'
/shared/build/share/cmake/2.8.12.1/bin/cmake -E cmake_progress_report /shared/Aeron/cppbuild/CMakeFiles 1
[ 82%] Built target AeronStat
make -f aeron-client/src/main/cpp/CMakeFiles/aeron_client.dir/build.make aeron-client/src/main/cpp/CMakeFiles/aeron_client.dir/depend
make[2]: Entering directory `/shared/Aeron/cppbuild'
cd /shared/Aeron/cppbuild && /shared/build/share/cmake/2.8.12.1/bin/cmake -E cmake_depends "Unix Makefiles" /shared/Aeron /shared/Aeron/aeron-client/src/main/cpp /shared/Aeron/cppbuild /shared/Aeron/cppbuild/aeron-client/src/main/cpp /shared/Aeron/cppbuild/aeron-client/src/main/cpp/CMakeFiles/aeron_client.dir/DependInfo.cmake --color=
make[2]: Leaving directory `/shared/Aeron/cppbuild'
make -f aeron-client/src/main/cpp/CMakeFiles/aeron_client.dir/build.make aeron-client/src/main/cpp/CMakeFiles/aeron_client.dir/build
make[2]: Entering directory `/shared/Aeron/cppbuild'
make[2]: Nothing to be done for `aeron-client/src/main/cpp/CMakeFiles/aeron_client.dir/build'.
make[2]: Leaving directory `/shared/Aeron/cppbuild'
/shared/build/share/cmake/2.8.12.1/bin/cmake -E cmake_progress_report /shared/Aeron/cppbuild/CMakeFiles 3 4 5 6
[ 96%] Built target aeron_client
make -f aeron-samples/src/main/cpp/CMakeFiles/BasicPublisher.dir/build.make aeron-samples/src/main/cpp/CMakeFiles/BasicPublisher.dir/depend
make[2]: Entering directory `/shared/Aeron/cppbuild'
cd /shared/Aeron/cppbuild && /shared/build/share/cmake/2.8.12.1/bin/cmake -E cmake_depends "Unix Makefiles" /shared/Aeron /shared/Aeron/aeron-samples/src/main/cpp /shared/Aeron/cppbuild /shared/Aeron/cppbuild/aeron-samples/src/main/cpp /shared/Aeron/cppbuild/aeron-samples/src/main/cpp/CMakeFiles/BasicPublisher.dir/DependInfo.cmake --color=
make[2]: Leaving directory `/shared/Aeron/cppbuild'
make -f aeron-samples/src/main/cpp/CMakeFiles/BasicPublisher.dir/build.make aeron-samples/src/main/cpp/CMakeFiles/BasicPublisher.dir/build
make[2]: Entering directory `/shared/Aeron/cppbuild'
Linking CXX executable ../../../../binaries/BasicPublisher
cd /shared/Aeron/cppbuild/aeron-samples/src/main/cpp && /shared/build/share/cmake/2.8.12.1/bin/cmake -E cmake_link_script CMakeFiles/BasicPublisher.dir/link.txt --verbose=1
/build/share/gcc/4.9.0/bin/g++ -Wall -std=c++11 -fexceptions -pthread -g -m64 CMakeFiles/BasicPublisher.dir/BasicPublisher.cpp.o -o ../../../../binaries/BasicPublisher -rdynamic ../../../../lib/libaeron_concurrent.a ../../../../lib/libaeron_util.a ../../../../lib/libaeron_client.a
../../../../lib/libaeron_client.a(ClientConductor.cpp.o): In function `aeron::common::concurrent::ringbuffer::RecordDescriptor::checkMsgTypeId(int)':
/shared/Aeron/aeron-common/src/main/cpp/concurrent/ringbuffer/RecordDescriptor.h:69: undefined reference to `aeron::common::util::strPrintf(char const*, ...)'
../../../../lib/libaeron_client.a(ClientConductor.cpp.o): In function `aeron::common::concurrent::ringbuffer::ManyToOneRingBuffer::checkMsgLength(int) const':
/shared/Aeron/aeron-common/src/main/cpp/concurrent/ringbuffer/ManyToOneRingBuffer.h:219: undefined reference to `aeron::common::util::strPrintf(char const*, ...)'
../../../../lib/libaeron_client.a(Aeron.cpp.o): In function `aeron::common::concurrent::broadcast::BroadcastBufferDescriptor::checkCapacity(int)':
/shared/Aeron/aeron-common/src/main/cpp/concurrent/broadcast/BroadcastBufferDescriptor.h:37: undefined reference to `aeron::common::util::strPrintf(char const*, ...)'
../../../../lib/libaeron_client.a(Aeron.cpp.o): In function `aeron::common::concurrent::ringbuffer::RingBufferDescriptor::checkCapacity(int)':
/shared/Aeron/aeron-common/src/main/cpp/concurrent/ringbuffer/RingBufferDescriptor.h:42: undefined reference to `aeron::common::util::strPrintf(char const*, ...)'
../../../../lib/libaeron_client.a(Aeron.cpp.o): In function `aeron::Aeron::createDriverProxy(aeron::Context&)':
/shared/Aeron/aeron-client/src/main/cpp/Aeron.cpp:70: undefined reference to `aeron::common::util::MemoryMappedFile::mapExisting(char const*)'
/shared/Aeron/aeron-client/src/main/cpp/Aeron.cpp:73: undefined reference to `aeron::common::util::MemoryMappedFile::getMemorySize() const'
/shared/Aeron/aeron-client/src/main/cpp/Aeron.cpp:73: undefined reference to `aeron::common::util::MemoryMappedFile::getMemoryPtr() const'
../../../../lib/libaeron_client.a(Aeron.cpp.o): In function `aeron::Aeron::createDriverReceiver(aeron::Context&)':
/shared/Aeron/aeron-client/src/main/cpp/Aeron.cpp:90: undefined reference to `aeron::common::util::MemoryMappedFile::mapExisting(char const*)'
/shared/Aeron/aeron-client/src/main/cpp/Aeron.cpp:92: undefined reference to `aeron::common::util::MemoryMappedFile::getMemorySize() const'
/shared/Aeron/aeron-client/src/main/cpp/Aeron.cpp:92: undefined reference to `aeron::common::util::MemoryMappedFile::getMemoryPtr() const'
../../../../lib/libaeron_client.a(Aeron.cpp.o): In function `aeron::common::concurrent::broadcast::CopyBroadcastReceiver::receive(std::function<void ()(int, aeron::common::concurrent::AtomicBuffer&, int, int)> const&)':
/shared/Aeron/aeron-common/src/main/cpp/concurrent/broadcast/CopyBroadcastReceiver.h:63: undefined reference to `aeron::common::util::strPrintf(char const*, ...)'
collect2: error: ld returned 1 exit status
make[2]: *** [binaries/BasicPublisher] Error 1
make[2]: Leaving directory `/shared/Aeron/cppbuild'
make[1]: *** [aeron-samples/src/main/cpp/CMakeFiles/BasicPublisher.dir/all] Error 2
make[1]: Leaving directory `/shared/Aeron/cppbuild'
make: *** [all] Error 2
Every Publication.offer() call can potentially call the nextTerm() method, which in turn calls the static TermHelper.ensureClean() method. The ensureClean() method does not appear to be wait-free:
public static boolean ensureClean(final LogBuffer logBuffer)
{
if (CLEAN != logBuffer.status())
{
if (logBuffer.compareAndSetStatus(NEEDS_CLEANING, IN_CLEANING))
{
logBuffer.clean(); // Conductor is not keeping up so do it yourself!!!
}
else
{
// This while loop does not seem to be wait-free. What if the thread which was cleaning gets suspended itself? What if it dies after managing to CAS the status to IN_CLEANING?
while (CLEAN != logBuffer.status())
{
Thread.yield();
}
}
return false;
}
return true;
}
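One way to bound the waiting branch is to give up after a deadline instead of spinning indefinitely. The following is a minimal sketch of that idea and not Aeron's code: the status word is modelled with a plain AtomicInteger, and the class name, the timeout parameter and the simulateStalledCleaner() hook are all hypothetical. Note that it returns true whenever the buffer ends up CLEAN, which differs from the return convention of the method quoted above.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the log buffer status word; not Aeron's LogBuffer.
final class BoundedCleanSketch
{
    static final int CLEAN = 0;
    static final int NEEDS_CLEANING = 1;
    static final int IN_CLEANING = 2;

    private final AtomicInteger status = new AtomicInteger(NEEDS_CLEANING);

    int status()
    {
        return status.get();
    }

    // Returns true if the buffer is CLEAN on return, false if the deadline passed first.
    boolean ensureClean(final long timeoutNs)
    {
        if (CLEAN == status.get())
        {
            return true;
        }

        if (status.compareAndSet(NEEDS_CLEANING, IN_CLEANING))
        {
            // This thread won the race, so it does the cleaning itself.
            status.set(CLEAN);
            return true;
        }

        // Another thread holds IN_CLEANING: wait, but only up to the deadline.
        final long deadlineNs = System.nanoTime() + timeoutNs;
        while (CLEAN != status.get())
        {
            if (System.nanoTime() - deadlineNs >= 0)
            {
                return false; // the caller decides whether to retry or report an error
            }
            Thread.yield();
        }

        return true;
    }

    // Test hook: simulate a cleaner that stalled or died right after its CAS.
    void simulateStalledCleaner()
    {
        status.set(IN_CLEANING);
    }
}
```

A deadline only bounds the wait. It does not recover the buffer if the cleaning thread really has died, so the caller would still need some policy for that case.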
Steps to reproduce:
Ping application should exit and have a JVM crash and the following exceptions:
java.lang.IllegalStateException: Attempting to signal when signal has already been raised
at uk.co.real_logic.aeron.conductor.Signal.signal(Signal.java:19)
at uk.co.real_logic.aeron.conductor.ClientConductor.operationSucceeded(ClientConductor.java:252)
at uk.co.real_logic.aeron.conductor.DriverListenerAdapter.onMessage(DriverListenerAdapter.java:86)
at uk.co.real_logic.aeron.common.concurrent.broadcast.CopyBroadcastReceiver.receive(CopyBroadcastReceiver.java:68)
at uk.co.real_logic.aeron.conductor.DriverListenerAdapter.receiveMessages(DriverListenerAdapter.java:54)
at uk.co.real_logic.aeron.conductor.ClientConductor.doWork(ClientConductor.java:106)
at uk.co.real_logic.aeron.common.Agent.run(Agent.java:60)
at java.lang.Thread.run(Thread.java:744)
java.lang.IllegalStateException: Attempting to signal when signal has already been raised
at uk.co.real_logic.aeron.conductor.Signal.signal(Signal.java:19)
at uk.co.real_logic.aeron.conductor.ClientConductor.operationSucceeded(ClientConductor.java:252)
at uk.co.real_logic.aeron.conductor.DriverListenerAdapter.onMessage(DriverListenerAdapter.java:86)
at uk.co.real_logic.aeron.common.concurrent.broadcast.CopyBroadcastReceiver.receive(CopyBroadcastReceiver.java:68)
at uk.co.real_logic.aeron.conductor.DriverListenerAdapter.receiveMessages(DriverListenerAdapter.java:54)
at uk.co.real_logic.aeron.conductor.ClientConductor.doWork(ClientConductor.java:106)
at uk.co.real_logic.aeron.common.Agent.run(Agent.java:60)
at java.lang.Thread.run(Thread.java:744)
Hello,
I don't see what prevents BroadcastTransmitter::transmit from overwriting its underlying buffer in such a way that CopyBroadcastReceiver::receive returns invalid messages.
In more detail:
BroadcastTransmitter::transmit stamps each message it transmits with the tail counter (the tail stamp), then increments its tail.
CopyBroadcastReceiver::receive (via BroadcastReceiver::receiveNext) makes sure it doesn't run past the tail, then validates the message length, copies the message, and checks again that the message just copied is still valid by checking the message's tail stamp.
I see a problem if memory gets overwritten by a new message while we copy the message into the receiver's scratch buffer, in particular if the memory region holding a message's tail stamp is now part of a new message's content. I don't see what prevents this memory region from happening to contain the right tail stamp value, in which case BroadcastReceiver::validate would return true.
Did I miss something, or is this not thread safe?
Sidney
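To make the concern concrete, here is a small single-threaded model of one alternative: validating a copy against a running tail counter kept outside the ring, rather than against a stamp stored inside a region that may itself be overwritten. This is only a sketch under stated assumptions and not how Aeron's broadcast buffer is implemented. The class and method names are hypothetical, and a real concurrent version would also need the transmitter to publish its intent before overwriting, plus suitable memory ordering, both of which are omitted here.

```java
// Hypothetical single-threaded model of "copy, then validate against an external tail counter".
final class LappedCopySketch
{
    private final byte[] ring;
    private long tail; // total bytes ever written; kept outside the ring, so messages cannot overwrite it

    LappedCopySketch(final int capacity)
    {
        ring = new byte[capacity];
    }

    long tail()
    {
        return tail;
    }

    // Writes the message at the current tail, wrapping around the ring as needed.
    void transmit(final byte[] msg)
    {
        for (int i = 0; i < msg.length; i++)
        {
            ring[(int)((tail + i) % ring.length)] = msg[i];
        }
        tail += msg.length;
    }

    // Copies length bytes starting at the absolute position cursor into a scratch array.
    byte[] copy(final long cursor, final int length)
    {
        final byte[] scratch = new byte[length];
        for (int i = 0; i < length; i++)
        {
            scratch[i] = ring[(int)((cursor + i) % ring.length)];
        }
        return scratch;
    }

    // Bytes from cursor onwards are intact only while the writer is at most one capacity ahead of cursor.
    boolean validate(final long cursor)
    {
        return tail - cursor <= ring.length;
    }
}
```

With a capacity of 8, a 4-byte message at position 0 stays valid after a second 4-byte message is written, and becomes invalid as soon as a third write wraps over it. Because the check depends only on the counter, the overwritten bytes cannot make it pass by accident.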
We need to support the application of padding that is less than header size in the LogAppender.
I have the following use case: a fairly small number of sender client processes, say N processes, each on its own host, and a similar or slightly larger number of receiver processes, say M processes, again each on its own host. I have around 1024 topics (numbered 0 - 1024) and some external arbiter assigns these topics to the M receiver processes (M <<< 1024) in some out-of-band way. This arbiter setup is part of the control plane and these assignments change infrequently. I already have logic to detect these changes and act on them. All the data plane processing (I hope to use Aeron for this) is idempotent too. My message velocity through the system is moderately high (multi-millions of messages / second, split between the N sender processes).
My current plan is to still have the 1024 topics in my application process, but create a stream per receiver process (i.e. M streams). I then plan to use the arbiter-assigned (topic -> receiver host) mapping to assign each message to one of the M streams. As the arbiter changes (topic -> receiver host) mappings, my receivers can update their subscriptions with each of the N sender processes. Each sender process thus has a stream per receiver process, and each receiver is subscribed to its corresponding stream on every sender process. Does that seem like a reasonable model? Any suggestions?
I was also wondering what I could do when the number of receiving processes M gets much higher, say 100. Multiplexing these 100 receivers onto a lower number of topics would be wasteful, since it implies that each of these 100 receivers would then receive a lot of data that it does not require. Given my cluster-wide multi-million messages / second scenario, this could be a lot of waste.
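The sender-side mapping described above could be sketched roughly as follows. This is only an illustration of the routing table, assuming topic ids index a table of 1024 entries. The class name, the UNASSIGNED marker and the method names are hypothetical, and no Aeron API is used. An AtomicIntegerArray is chosen so that a control-plane thread can update an assignment while sender threads read it.

```java
import java.util.concurrent.atomic.AtomicIntegerArray;

// Hypothetical topic -> stream id table kept in each sender process.
final class TopicRouterSketch
{
    static final int TOPIC_COUNT = 1024;
    static final int UNASSIGNED = -1;

    private final AtomicIntegerArray streamIdByTopic = new AtomicIntegerArray(TOPIC_COUNT);

    TopicRouterSketch()
    {
        for (int i = 0; i < TOPIC_COUNT; i++)
        {
            streamIdByTopic.set(i, UNASSIGNED);
        }
    }

    // Called from the control plane when the arbiter changes an assignment.
    void assign(final int topic, final int streamId)
    {
        streamIdByTopic.set(topic, streamId);
    }

    // Called on the data path to pick the per-receiver stream for a message.
    int streamIdFor(final int topic)
    {
        return streamIdByTopic.get(topic);
    }
}
```

Because the processing is described as idempotent, a message routed with a slightly stale assignment during a change should be tolerable, but that is an assumption about the application rather than something this table guarantees.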
New exception for the same scenario with the latest code: rerun Ping without restarting Pong and got a bunch of the following exceptions in the MediaDriver.
This is running the test on G8 + RHEL 6.5 + OpenOnload
java.lang.RuntimeException: java.nio.channels.ClosedChannelException
at uk.co.real_logic.aeron.driver.UdpTransport.sendTo(UdpTransport.java:211)
at uk.co.real_logic.aeron.driver.ReceiveChannelEndpoint.sendNak(ReceiveChannelEndpoint.java:206)
at uk.co.real_logic.aeron.driver.ReceiveChannelEndpoint.lambda$composeNakMessageSender$3(ReceiveChannelEndpoint.java:146)
at uk.co.real_logic.aeron.driver.ReceiveChannelEndpoint$$Lambda$53/968457309.send(Unknown Source)
at uk.co.real_logic.aeron.driver.LossHandler.sendNakMessage(LossHandler.java:227)
at uk.co.real_logic.aeron.driver.LossHandler.activateGap(LossHandler.java:214)
at uk.co.real_logic.aeron.driver.LossHandler.scan(LossHandler.java:124)
at uk.co.real_logic.aeron.driver.DriverConnection.scanForGaps(DriverConnection.java:231)
at uk.co.real_logic.aeron.driver.DriverConductor.doWork(DriverConductor.java:225)
at uk.co.real_logic.aeron.common.Agent.run(Agent.java:64)
at java.lang.Thread.run(Unknown Source)
Caused by: java.nio.channels.ClosedChannelException
at sun.nio.ch.DatagramChannelImpl.ensureOpen(Unknown Source)
at sun.nio.ch.DatagramChannelImpl.send(Unknown Source)
at uk.co.real_logic.aeron.driver.UdpTransport.sendTo(UdpTransport.java:207)
... 10 more
Start the driver:
$ java -cp aeron-samples/build/libs/samples.jar -Dagrona.disable.bounds.checks=true uk.co.real_logic.aeron.samples.LowLatencyMediaDriver
Start a bunch (10-20 or so) of these:
$ java -cp aeron-samples/build/libs/samples.jar -Dagrona.disable.bounds.checks=true -Daeron.sample.frameCountLimit=1024 uk.co.real_logic.aeron.samples.BasicSubscriber &
and a bunch of these (around 10-15):
java -cp aeron-samples/build/libs/samples.jar -Daeron.sample.messageLength=40 -Daeron.sample.messages=500000000 -Dagrona.disable.bounds.checks=true uk.co.real_logic.aeron.samples.BasicPublisher &
Let them get set up and start sending/receiving messages for a few seconds, then murder all the sources and receivers at once with something like this:
$ jps |grep Basic|cut -d ' ' -f 1|xargs kill
On the driver, I get an exception similar to the following:
[35015.120825] EXCEPTION [444/444]: java.lang.IndexOutOfBoundsException(Index: 45, Size: 43)
java.util.ArrayList.rangeCheck ArrayList.java:653
java.util.ArrayList.get ArrayList.java:429
uk.co.real_logic.aeron.driver.DriverConnection.updateSubscribersPosition DriverConnection.java:504
uk.co.real_logic.aeron.driver.DriverConductor.doWork DriverConductor.java:215
uk.co.real_logic.aeron.common.AgentRunner.run AgentRunner.java:85
Same exception from another run, with slightly different Index and Size:
[35299.517062] EXCEPTION [443/443]: java.lang.IndexOutOfBoundsException(Index: 11, Size: 7)
java.util.ArrayList.rangeCheck ArrayList.java:653
java.util.ArrayList.get ArrayList.java:429
uk.co.real_logic.aeron.driver.DriverConnection.updateSubscribersPosition DriverConnection.java:504
uk.co.real_logic.aeron.driver.DriverConductor.doWork DriverConductor.java:215
uk.co.real_logic.aeron.common.AgentRunner.run AgentRunner.java:85
or why did the Media Driver flatten my laptop's battery...
If the Media Driver is started on a MacBook Pro running OS X 10.10.1 as per below, the "Energy Impact" jumps to over 200 in the Activity Monitor, even when no data is being transferred. This has the effect of reducing time on battery by 4-5x (~2hrs, down from 8).
Whilst this power usage may be acceptable in a server environment, it makes the driver impractical in a mobile context. Is there a way, or a change that could be made, to have the Media Driver work in a power-efficient way?
java -cp aeron-samples/build/libs/samples.jar -Daeron.mtu.length=16384 -Daeron.socket.so_sndbuf=16384 -Daeron.rcv.initial.window.size=2097152 -Daeron.socket.so_rcvbuf=2097152 -Daeron.rcv.buffer.size=16384 -Dagrona.disable.bounds.checks=true uk.co.real_logic.aeron.samples.LowLatencyMediaDriver
OS (uname -a): Linux 3.5.0-51-generic #77~precise1-Ubuntu SMP Thu Jun 5 00:48:28 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
Thanks for open sourcing this library. I hope to learn a lot about high performance messaging from it.
I noticed that the ReceiverTest tearDown() method was throwing NPEs when closing the receiveChannelEndpoint so I wrapped the creation of the ReceiveChannelEndpoint in the ReceiverTest's setUp() method with a try/catch that printed the exception and I got this:
Exception: java.lang.IllegalStateException: Failed to set SO_RCVBUF: attempted=131072, actual=131071
Looks like UdpChannelTransport's constructor is throwing the exception.
% sysctl net | grep 131071
net.core.rmem_max = 131071
net.core.wmem_max = 131071
Looks like there's a hard limit we're hitting up against.
Thanks!
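To see what the OS actually grants on a given machine, a quick probe along these lines can help. This is a minimal sketch using only the standard java.nio API. The class and method names are made up for illustration, and it is not taken from Aeron's UdpChannelTransport.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.StandardSocketOptions;
import java.nio.channels.DatagramChannel;

// Hypothetical probe: request a receive buffer size and report what the OS granted.
final class SocketBufferProbe
{
    static int grantedReceiveBuffer(final int requested)
    {
        try (DatagramChannel channel = DatagramChannel.open())
        {
            channel.setOption(StandardSocketOptions.SO_RCVBUF, requested);
            // The OS may clamp or adjust the value, so read it back rather than trusting the request.
            return channel.getOption(StandardSocketOptions.SO_RCVBUF);
        }
        catch (final IOException ex)
        {
            throw new UncheckedIOException(ex);
        }
    }

    public static void main(final String[] args)
    {
        final int requested = 131072;
        System.out.println("requested=" + requested + ", granted=" + grantedReceiveBuffer(requested));
    }
}
```

If the granted value comes back smaller than the requested one, either the kernel limit (net.core.rmem_max here) needs raising or the requested size needs lowering.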
Curious what platforms you've tested on. I'm seeing what look like build problems on both java and cpp side.
For Java, everything seems to build OK, but a number of tests fail with AssertionError, Wanted but not invoked, and BindException: Address already in use. This with jdk1.8.0_25.
The cppbuild is failing on "INT_MAX not declared" in ManyToOneRingBuffer.h, and "atomic is not a member of std" in testManyToOneRingBuffer.cpp. This with gcc 4.9.0.
All of the above on CentOS 6.5.
Again, would love to get an idea what platforms are known to work before I dive down the rabbit hole. TIA!
It's been identified that it is possible for the driver to delete log/state files out from under a client when a publication times out. The reaping of inactive connections can cause the driver to delete the file, or to fail to delete the file and thus orphan it.
A potential solution is the following protocol:
DriverConnection to a different array than the connections array
ON_INACTIVE_CONNECTION message to the clients.
As the driver currently has keepalives from the clients now, it can determine if it can reap these files already as normal. So, all bases should be covered if a client goes away while handling the above exchange.
Thoughts?
When running a client that has no embedded driver and no external media driver is running, an exception similar to the one below will be seen.
Exception in thread "main" java.lang.IllegalStateException: Missing file for to-clients: /var/folders/t5/_bkmpxv9155fh62xs557zzch0000gn/T/aeron/conductor/to-clients
at uk.co.real_logic.aeron.common.IoUtil.checkFileExists(IoUtil.java:274)
at uk.co.real_logic.aeron.common.IoUtil.mapExistingFile(IoUtil.java:208)
at uk.co.real_logic.aeron.Aeron$Context.conclude(Aeron.java:208)
at uk.co.real_logic.aeron.Aeron.<init>(Aeron.java:56)
at uk.co.real_logic.aeron.Aeron.connect(Aeron.java:101)
at uk.co.real_logic.aeron.examples.RateSubscriber.main(RateSubscriber.java:57)
It would be good to catch this and handle it more gracefully.
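One possible shape for a friendlier failure is to check for the file up front and produce a message that points at the likely cause. The sketch below uses only java.io.File. The class, the method names and the wording of the message are hypothetical and are not part of Aeron's API.

```java
import java.io.File;

// Hypothetical pre-flight check a client could run before mapping the conductor files.
final class DriverPresenceCheck
{
    // True if the to-clients file exists as a regular file.
    static boolean driverLooksPresent(final File toClientsFile)
    {
        return toClientsFile.isFile();
    }

    // Builds a message that names the probable cause instead of only the missing path.
    static String describeMissing(final File toClientsFile)
    {
        return "No media driver appears to be running: " + toClientsFile.getAbsolutePath() +
            " does not exist. Start a MediaDriver or use an embedded one.";
    }

    public static void main(final String[] args)
    {
        final File toClients = new File(System.getProperty("java.io.tmpdir"), "aeron/conductor/to-clients");
        if (!driverLooksPresent(toClients))
        {
            System.err.println(describeMissing(toClients));
        }
    }
}
```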
This might just be a bug in the sample app itself, but...
$ java -cp aeron-samples/build/libs/samples.jar uk.co.real_logic.aeron.driver.MediaDriver
$ java -cp aeron-samples/build/libs/samples.jar -Daeron.sample.messageLength=0 uk.co.real_logic.aeron.samples.StreamingPublisher
Streaming 1,000,000 messages of size 0 bytes to udp://localhost:40123 on stream Id 10
Exception in thread "main" java.lang.IndexOutOfBoundsException: index=0, length=8, capacity=0
at uk.co.real_logic.agrona.concurrent.UnsafeBuffer.boundsCheck(UnsafeBuffer.java:795)
at uk.co.real_logic.agrona.concurrent.UnsafeBuffer.putLong(UnsafeBuffer.java:322)
at uk.co.real_logic.aeron.samples.StreamingPublisher.main(StreamingPublisher.java:77)
I imagine that if -Dagrona.disable.bounds.checks=true was specified on the publisher, it might well crash as well.
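A guard of the following kind would turn the failure into a clear error at start-up. This is a sketch and not the sample's actual code: it assumes, as the stack trace suggests, that the publisher always writes an 8-byte long at offset 0, and it uses java.nio.ByteBuffer in place of Agrona's UnsafeBuffer so that it is self-contained. The class and method names are hypothetical.

```java
import java.nio.ByteBuffer;

// Hypothetical guard: reject message lengths too small to hold the long written at offset 0.
final class MessageLengthGuard
{
    static final int MIN_MESSAGE_LENGTH = Long.BYTES;

    static int validatedLength(final int requested)
    {
        if (requested < MIN_MESSAGE_LENGTH)
        {
            throw new IllegalArgumentException(
                "message length " + requested + " is below the minimum of " + MIN_MESSAGE_LENGTH + " bytes");
        }
        return requested;
    }

    // Allocates a message of the validated length and writes the sequence number at offset 0.
    static ByteBuffer newMessage(final int requestedLength, final long sequence)
    {
        final ByteBuffer buffer = ByteBuffer.allocate(validatedLength(requestedLength));
        buffer.putLong(0, sequence);
        return buffer;
    }
}
```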
Need to protect the shared directories.
There appears to be a cyclic dependency between packages uk.co.real_logic.aeron.driver and uk.co.real_logic.aeron.driver.cmd. Not sure if this is an issue, but it surely breaks the acyclic dependencies principle as formulated by Robert C. Martin in his seminal work "Agile Software Development, Principles, Patterns, and Practices". See the pic below, obtained by an automated ( scripted ) package dependency detection with Sparx Enterprise Architect, after having reverse-engineered Aeron into it.
Hi, are there any plans to publish Aeron (+Agrona) as maven artifacts ? Would ease integration + use amongst several developers/locations.
Hi,
I am running a test with the 'embedded' media driver option on localhost. While it's possible to start several publishers on a single machine, if I try to add a second Subscriber to a channel I get an 'Address already in use' exception, which somewhat confuses me, as I'd expect Aeron to use multicast under the hood (?), so more than one subscriber should be possible.
So in order to have several subscribed processes on a single machine, one cannot use the 'embedded driver', or am I completely off?
Given that you guys know Gradle well enough to make Agrona installable to the local Maven repository, any chance of doing the same for the aeron-common, aeron-driver and aeron-client sub-projects?
To be honest, I tried to change build.gradle and add lines such as apply plugin: 'maven', defaultTasks 'install', etc. Apparently it needs group= and version= as well. I have no talent in Gradle. :-P
Local fields of type IdleStrategy and Agent hide absolutely identical instance fields. Unnecessary reference creation, IMHO.
(Setup: your BasicXX example classes using multicast addresses [JDK 1.8_11 Ubuntu 14.04, AMD Opteron]).
Observation:
How large will the replay for the 1st subscriber become? In the case of a high-volume publisher, e.g. one started some hours before the first subscriber, the 1st subscriber might get flooded with messages. Can I suppress replay upon join somehow?
git code version: 71b0f30
Building CXX object aeron-common/src/main/cpp/concurrent/CMakeFiles/concurrentTests.dir/tests/testCountersManager.cpp.o
In file included from /var/abs/local/libaeron-git/src/Aeron/aeron-common/src/main/cpp/command/tests/testCommand.cpp:18:0:
/var/abs/local/libaeron-git/src/Aeron/aeron-common/src/main/cpp/3rdparty/gmock-1.7.0/gtest/include/gtest/gtest.h: In instantiation of ‘testing::AssertionResult testing::internal::CmpHelperEQ(const char*, const char*, const T1&, const T2&) [with T1 = int; T2 = long unsigned int]’:
/var/abs/local/libaeron-git/src/Aeron/aeron-common/src/main/cpp/3rdparty/gmock-1.7.0/gtest/include/gtest/gtest.h:1485:30: required from ‘static testing::AssertionResult testing::internal::EqHelper<lhs_is_null_literal>::Compare(const char*, const char*, const T1&, const T2&) [with T1 = int; T2 = long unsigned int; bool lhs_is_null_literal = false]’
/var/abs/local/libaeron-git/src/Aeron/aeron-common/src/main/cpp/command/tests/testCommand.cpp:81:5: required from here
/var/abs/local/libaeron-git/src/Aeron/aeron-common/src/main/cpp/3rdparty/gmock-1.7.0/gtest/include/gtest/gtest.h:1448:16: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
if (expected == actual) {
^
If we add a publication, offer messages, and then close it, the publication should close on the driver side only after the data is sent. It can be unmapped on the client side, but that is independent. So we need to mark publications as closed in the state; then, when the ref count goes to zero, close the publication once the messages are sent and we have lingered sufficiently long to handle potential loss.
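The lifecycle described here can be sketched as a small state machine. All of the names below (the states, release(), onTick() and the linger duration) are hypothetical and only illustrate the ordering of ref count reaching zero, draining unsent data, and lingering. They are not taken from the driver.

```java
// Hypothetical driver-side lifecycle for a publication: ACTIVE -> DRAINING -> LINGER -> CLOSED.
final class PublicationLifecycleSketch
{
    enum State { ACTIVE, DRAINING, LINGER, CLOSED }

    private final long lingerNs;
    private State state = State.ACTIVE;
    private int refCount = 1;
    private long lingerDeadlineNs;

    PublicationLifecycleSketch(final long lingerNs)
    {
        this.lingerNs = lingerNs;
    }

    State state()
    {
        return state;
    }

    void addRef()
    {
        refCount++;
    }

    // A client closed its handle; once the last reference goes, start draining.
    void release()
    {
        if (--refCount == 0 && state == State.ACTIVE)
        {
            state = State.DRAINING;
        }
    }

    // Called from the driver's duty cycle with the current time and whether all data has been sent.
    void onTick(final boolean allDataSent, final long nowNs)
    {
        switch (state)
        {
            case DRAINING:
                if (allDataSent)
                {
                    state = State.LINGER;
                    lingerDeadlineNs = nowNs + lingerNs;
                }
                break;

            case LINGER:
                // Keep the buffers around for a while so retransmit requests can still be served.
                if (nowNs - lingerDeadlineNs >= 0)
                {
                    state = State.CLOSED;
                }
                break;

            default:
                break;
        }
    }
}
```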
The java build fails on both OSX 10.10 and CentOS 6.5 with the following -- any ideas?
TIA...
btmacpro:temp btorpey$ git clone https://github.com/real-logic/Aeron
Cloning into 'Aeron'...
remote: Counting objects: 45598, done.
remote: Compressing objects: 100% (254/254), done.
remote: Total 45598 (delta 113), reused 0 (delta 0)
Receiving objects: 100% (45598/45598), 6.33 MiB | 4.87 MiB/s, done.
Resolving deltas: 100% (17693/17693), done.
Checking connectivity... done.
btmacpro:temp btorpey$ cd Aeron
btmacpro:Aeron btorpey$ ./gradlew
:clean UP-TO-DATE
:aeron-client:clean UP-TO-DATE
:aeron-common:clean UP-TO-DATE
:aeron-driver:clean UP-TO-DATE
:aeron-samples:clean UP-TO-DATE
:aeron-system-tests:clean UP-TO-DATE
:aeron-common:compileJava
/Users/btorpey/nyx/temp/Aeron/aeron-common/src/main/java/uk/co/real_logic/aeron/common/concurrent/logbuffer/LogBufferDescriptor.java:21: error: cannot find symbol
import static uk.co.real_logic.agrona.BitUtil.CACHE_LINE_LENGTH;
^
symbol: static CACHE_LINE_LENGTH
location: class
/Users/btorpey/nyx/temp/Aeron/aeron-common/src/main/java/uk/co/real_logic/aeron/common/concurrent/logbuffer/LogBufferDescriptor.java:97: error: cannot find symbol
TERM_STATUS_OFFSET = TERM_TAIL_COUNTER_OFFSET + CACHE_LINE_LENGTH;
^
symbol: variable CACHE_LINE_LENGTH
location: class LogBufferDescriptor
/Users/btorpey/nyx/temp/Aeron/aeron-common/src/main/java/uk/co/real_logic/aeron/common/concurrent/logbuffer/LogBufferDescriptor.java:98: error: cannot find symbol
TERM_META_DATA_LENGTH = CACHE_LINE_LENGTH * 2;
^
symbol: variable CACHE_LINE_LENGTH
location: class LogBufferDescriptor
/Users/btorpey/nyx/temp/Aeron/aeron-common/src/main/java/uk/co/real_logic/aeron/common/concurrent/logbuffer/LogBufferDescriptor.java:118: error: cannot find symbol
public static final int LOG_META_DATA_LENGTH = CACHE_LINE_LENGTH;
^
symbol: variable CACHE_LINE_LENGTH
location: class LogBufferDescriptor
4 errors
:aeron-common:compileJava FAILED
FAILURE: Build failed with an exception.
* What went wrong:
Execution failed for task ':aeron-common:compileJava'.
> Compilation failed with exit code 1; see the compiler error output for details.
* Try:
Run with --stacktrace option to get the stack trace. Run with --info or --debug option to get more log output.
BUILD FAILED
Total time: 14.275 secs
Steps to reproduce.
ExampleSubscriber
ExamplePublisher
ExampleSubscriber
After roughly 20 seconds, you will see an inactive connections notification on the subscriber and the stream will stop.
After roughly another 10 seconds, the driver will display ERROR_DELETING_FILE events.
What is happening is that the orphaned Connection from the initial subscriber is still around and the term buffer files are reused for the next connection on the new subscription.
Hi, thanks for this very interesting project! I tried to compile it (master branch, at commit 1c252f2) on Debian linux with gcc 4.9.1, and the compilation failed:
Linking CXX executable ../../../../binaries/BasicPublisher
../../../../lib/libaeron_client.a(Aeron.cpp.o): In function `std::thread::thread<aeron::common::common::AgentRunner<aeron::ClientConductor, aeron::common::common::BusySpinIdleStrategy>::start()::{lambda()#1}>(aeron::common::common::AgentRunner<aeron::ClientConductor, aeron::common::common::BusySpinIdleStrategy>::start()::{lambda()#1}&&)':
/usr/include/c++/4.9/thread:136: undefined reference to `pthread_create'
collect2: error: ld returned 1 exit status
Adding -pthread flag to CMAKE_CXX_FLAGS fixed the issue.
Cloning from ba0b5a2, the gradlew build fails as per below.
Build done on a new Mac running OS X 10.10.1 and JDK 1.8.0_25
uk.co.real_logic.aeron.driver.UdpChannelTest > shouldHandleCanonicalFormForMulticastCorrectly FAILED
java.lang.AssertionError:
Expected: is <name:en5 (en5)>
but: was <name:vmnet8 (vmnet8)>
at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:8)
at uk.co.real_logic.aeron.driver.UdpChannelTest.shouldHandleCanonicalFormForMulticastCorrectly(UdpChannelTest.java:265)
uk.co.real_logic.aeron.driver.UdpChannelTest > shouldHandleCanonicalFormForMulticastCorrectlyWithAeronUri FAILED
java.lang.AssertionError:
Expected: is <name:en5 (en5)>
but: was <name:vmnet8 (vmnet8)>
at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:8)
at uk.co.real_logic.aeron.driver.UdpChannelTest.shouldHandleCanonicalFormForMulticastCorrectlyWithAeronUri(UdpChannelTest.java:284)
uk.co.real_logic.aeron.driver.UdpChannelTest > shouldHandleIpV6CanonicalFormForMulticastCorrectly SKIPPED
Results: FAILURE (89 tests, 86 successes, 2 failures, 1 skipped)
89 tests completed, 2 failed, 1 skipped
:aeron-driver:test FAILED
In Publication.java, every offer could potentially lead to a term being rotated. The active LogAppender is tracked via the activeIndex variable, which is changed by the nextTerm() method. However, activeIndex seems to be a regular (non-volatile) field. How does the nextTerm() method, called on one thread, make its activeIndex change visible to another thread when that thread calls the offer() method?
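For reference, under the Java memory model a write to a plain field is not guaranteed to become visible to another thread unless some happens-before edge connects the two threads. A volatile field or an atomic variable provides such an edge. The sketch below shows one way an index like this could be published safely. It is not the Publication code: the class name, TERM_COUNT and rotateFrom() are assumptions made for illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical holder for the index of the active term, safe to read from any thread.
final class ActiveIndexSketch
{
    static final int TERM_COUNT = 3; // assumed number of term buffers

    private final AtomicInteger activeIndex = new AtomicInteger(0);

    // A volatile-strength read: sees the latest completed rotation.
    int activeIndex()
    {
        return activeIndex.get();
    }

    // Rotates to the next term only if the index is still the one the caller observed.
    // When several threads race to rotate from the same index, exactly one succeeds.
    boolean rotateFrom(final int observedIndex)
    {
        return activeIndex.compareAndSet(observedIndex, (observedIndex + 1) % TERM_COUNT);
    }
}
```

Whether the real code relies on some other ordering, for example a volatile write elsewhere on the same path, cannot be told from the question alone.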
I'm getting a java.lang.IllegalArgumentException: encoded message exceeds maxMessageLength of 65536 error when trying to send a large message to a UDP address. Is the max message length configurable?
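Independently of whether the limit can be raised, an application can stay under it by splitting a large payload into chunks before offering them. The following is a minimal sketch of the splitting step only. The class and method names are hypothetical, and it deliberately leaves out the framing (message id, chunk index, total length) that a receiver would need in order to reassemble the chunks.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical application-level chunking to keep each message under a maximum length.
final class ChunkingSketch
{
    static List<byte[]> split(final byte[] payload, final int maxChunkLength)
    {
        if (maxChunkLength <= 0)
        {
            throw new IllegalArgumentException("maxChunkLength must be positive: " + maxChunkLength);
        }

        final List<byte[]> chunks = new ArrayList<>();
        for (int offset = 0; offset < payload.length; offset += maxChunkLength)
        {
            // The last chunk may be shorter than maxChunkLength.
            final int end = Math.min(payload.length, offset + maxChunkLength);
            chunks.add(Arrays.copyOfRange(payload, offset, end));
        }

        return chunks;
    }
}
```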
Is there any channel for asking general questions?
I didn't find one in the Readme.
Is it fine to use issues for questions?
Thanks, bye
P.S. sorry for the non-code-related issue :)
While playing with the Ping/Pong tests I stumbled on a reproducible error:
After I run a ping/pong over the network and then try to reverse directions by running Pong on the host that was previously used for Ping, I get the following exception in the client:
Exception in thread "main" uk.co.real_logic.aeron.exceptions.RegistrationException: java.net.BindException: Cannot assign requested address
at uk.co.real_logic.aeron.ClientConductor.onError(ClientConductor.java:259)
at uk.co.real_logic.aeron.DriverListenerAdapter.onError(DriverListenerAdapter.java:119)
at uk.co.real_logic.aeron.DriverListenerAdapter.onMessage(DriverListenerAdapter.java:104)
at uk.co.real_logic.aeron.common.concurrent.broadcast.CopyBroadcastReceiver.receive(CopyBroadcastReceiver.java:68)
at uk.co.real_logic.aeron.DriverListenerAdapter.receiveMessages(DriverListenerAdapter.java:54)
at uk.co.real_logic.aeron.ClientConductor.doWork(ClientConductor.java:108)
at uk.co.real_logic.aeron.common.Agent.run(Agent.java:77)
at java.lang.Thread.run(Unknown Source)
And this exception in the MediaDriver:
[3014536.239502] EXCEPTION [546/546]: java.lang.RuntimeException(java.net.BindException: Cannot assign requested address)
uk.co.real_logic.aeron.driver.UdpTransport.<init> UdpTransport.java:185
uk.co.real_logic.aeron.driver.UdpTransport.<init> UdpTransport.java:83
uk.co.real_logic.aeron.driver.ReceiveChannelEndpoint.<init> ReceiveChannelEndpoint.java:56
uk.co.real_logic.aeron.driver.DriverConductor.onAddSubscription DriverConductor.java:468
uk.co.real_logic.aeron.driver.DriverConductor.onClientCommand DriverConductor.java:292
This happens even after I remove /dev/shm/aeron
:aeron-driver:test
uk.co.real_logic.aeron.driver.DriverConductorTest > shouldOnlyRemoveSubscriptionMediaEndpointUponRemovalOfAllSubscribers FAILED
java.lang.AssertionError
at org.junit.Assert.fail(Assert.java:86)
at org.junit.Assert.assertTrue(Assert.java:41)
at org.junit.Assert.assertNotNull(Assert.java:621)
at org.junit.Assert.assertNotNull(Assert.java:631)
at uk.co.real_logic.aeron.driver.DriverConductorTest.shouldOnlyRemoveSubscriptionMediaEndpointUponRemovalOfAllSubscribers(DriverConductorTest.java:287)
uk.co.real_logic.aeron.driver.DriverConductorTest > shouldNotTimeoutSubscriptionOnKeepAlive FAILED
java.lang.AssertionError
at org.junit.Assert.fail(Assert.java:86)
at org.junit.Assert.assertTrue(Assert.java:41)
at org.junit.Assert.assertNotNull(Assert.java:621)
at org.junit.Assert.assertNotNull(Assert.java:631)
at uk.co.real_logic.aeron.driver.DriverConductorTest.shouldNotTimeoutSubscriptionOnKeepAlive(DriverConductorTest.java:409)
uk.co.real_logic.aeron.driver.DriverConductorTest > shouldKeepSubscriptionMediaEndpointUponRemovalOfAllButOneSubscriber FAILED
java.lang.AssertionError
at org.junit.Assert.fail(Assert.java:86)
at org.junit.Assert.assertTrue(Assert.java:41)
at org.junit.Assert.assertNotNull(Assert.java:621)
at org.junit.Assert.assertNotNull(Assert.java:631)
at uk.co.real_logic.aeron.driver.DriverConductorTest.shouldKeepSubscriptionMediaEndpointUponRemovalOfAllButOneSubscriber(DriverConductorTest.java:262)
uk.co.real_logic.aeron.driver.DriverConductorTest > shouldTimeoutSubscription FAILED
Wanted but not invoked:
receiverProxy.addSubscription(, 10);
-> at uk.co.real_logic.aeron.driver.DriverConductorTest.verifyReceiverSubscribes(DriverConductorTest.java:433)
Actually, there were zero interactions with this mock.
at uk.co.real_logic.aeron.driver.DriverConductorTest.verifyReceiverSubscribes(DriverConductorTest.java:433)
at uk.co.real_logic.aeron.driver.DriverConductorTest.shouldTimeoutSubscription(DriverConductorTest.java:393)
uk.co.real_logic.aeron.driver.DriverConductorTest > shouldBeAbleToAddSingleSubscription FAILED
Wanted but not invoked:
clientProxy.operationSucceeded(1429);
-> at uk.co.real_logic.aeron.driver.DriverConductorTest.shouldBeAbleToAddSingleSubscription(DriverConductorTest.java:177)
However, there were other interactions with this mock:
-> at uk.co.real_logic.aeron.driver.DriverConductor.onClientCommand(DriverConductor.java:333)
at uk.co.real_logic.aeron.driver.DriverConductorTest.shouldBeAbleToAddSingleSubscription(DriverConductorTest.java:177)
Results: FAILURE (77 tests, 72 successes, 5 failures, 0 skipped)
77 tests completed, 5 failed
:aeron-driver:test FAILED
I am trying to figure out how to implement the following:
1). An infinite, persistable retransmission buffer:
a multicast publisher that is not limited by its receivers
receivers' "late join" handled somehow (with unicast retransmits, as for NAKs, or something custom)
persistent means that I can stop and restart the publisher application and continue publishing from where I left off.
I am willing to pay some price for that. In general, I envisage custom buffer/term management in which terms are not rotated but are written to disk asynchronously.
Are there any extension points I can play with?
2). Handling/managing cumulative receiver acks, sent after the application on the subscriber side has processed a message, in a consistent but reasonably cheap way on the publisher:
use the receiver status message?
a separate channel for acks?
Could you please advise whether this is achievable/reasonable/applicable to Aeron at all?
Or just point me in the right direction, or to the extension points where I should dig.
Thanks in advance.
This is on my dual-core MBP, when running Ping/Pong on localhost. It is not reproducible every time.
Ping stopped with:
Publishing Ping at udp://localhost:40123 on stream Id 10
Subscribing Pong at udp://localhost:40124 on stream Id 10
Warming up... 10000 messages
Warm now.
Lingering for 5000 milliseconds...
Pinging 10000 messages
Done streaming.
Lingering for 5000 milliseconds...
Exception in thread "main" java.util.concurrent.ExecutionException: java.lang.ArrayIndexOutOfBoundsException: 21576
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:192)
at uk.co.real_logic.aeron.examples.Ping.main(Ping.java:122)
Caused by: java.lang.ArrayIndexOutOfBoundsException: 21576
at org.HdrHistogram.Histogram.incrementCountAtIndex(Histogram.java:54)
at org.HdrHistogram.AbstractHistogram.recordSingleValue(AbstractHistogram.java:285)
at org.HdrHistogram.AbstractHistogram.recordValue(AbstractHistogram.java:216)
at uk.co.real_logic.aeron.examples.Ping.pongHandler(Ping.java:139)
at uk.co.real_logic.aeron.examples.Ping$$Lambda$7/1368884364.onData(Unknown Source)
at uk.co.real_logic.aeron.Connection.onFrame(Connection.java:107)
at uk.co.real_logic.aeron.Connection$$Lambda$11/906233096.onFrame(Unknown Source)
at uk.co.real_logic.aeron.common.concurrent.logbuffer.LogReader.read(LogReader.java:94)
at uk.co.real_logic.aeron.Connection.poll(Connection.java:93)
at uk.co.real_logic.aeron.Subscription$$Lambda$9/891774719.apply(Unknown Source)
at uk.co.real_logic.aeron.common.concurrent.AtomicArray.doLimitedAction(AtomicArray.java:185)
at uk.co.real_logic.aeron.Subscription.poll(Subscription.java:97)
at uk.co.real_logic.aeron.examples.Ping.runSubscriber(Ping.java:148)
at uk.co.real_logic.aeron.examples.Ping.lambda$main$4(Ping.java:75)
at uk.co.real_logic.aeron.examples.Ping$$Lambda$8/2108649164.run(Unknown Source)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Then when rerunning it without stopping either Pong or MediaDriver:
Type 'quit' to terminate : Exception in thread "aeron-driver-conductor" java.lang.NullPointerException
at uk.co.real_logic.aeron.common.concurrent.AtomicBuffer.putString(AtomicBuffer.java:1021)
at uk.co.real_logic.aeron.common.concurrent.AtomicBuffer.putString(AtomicBuffer.java:1016)
at uk.co.real_logic.aeron.common.event.EventCodec.putStackTraceElement(EventCodec.java:149)
at uk.co.real_logic.aeron.common.event.EventCodec.encode(EventCodec.java:133)
at uk.co.real_logic.aeron.common.event.EventLogger.logException(EventLogger.java:164)
at uk.co.real_logic.aeron.driver.MediaDriver$Context$$Lambda$32/1190654826.accept(Unknown Source)
at uk.co.real_logic.aeron.common.Agent.run(Agent.java:69)
at java.lang.Thread.run(Thread.java:745)
From the wiki:
private static final UnsafeBuffer BUFFER = new UnsafeBuffer(ByteBuffer.allocateDirect(256));
...
final String message = "Hello World!";
BUFFER.putBytes(0, message.getBytes());
final boolean result = publication.offer(BUFFER, 0, message.getBytes().length);
Here one allocates a buffer, and the publication then presumably copies the bytes into the log buffer. From the Strange Loop presentation, it seemed that one would instead advance the tail of the log buffer with a LOCK XADD and write the data straight into the log buffer.
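To make the two approaches concrete, here is a minimal sketch that uses only the JDK. It is not Aeron's LogAppender: the class `ClaimSketch` and its methods `claim`, `offer`, `offerAscii` and `byteAt` are hypothetical names invented for this illustration, and framing, term rotation and memory ordering for readers are all left out. `AtomicLong.getAndAdd` stands in for the LOCK XADD (HotSpot on x86 typically compiles it to that instruction). `offer` shows the copy path from the wiki snippet, while `offerAscii` claims space first and then encodes straight into the log.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration only; this is not Aeron's API or implementation.
final class ClaimSketch
{
    private final ByteBuffer log;
    private final AtomicLong tail = new AtomicLong();

    ClaimSketch(final int capacity)
    {
        log = ByteBuffer.allocateDirect(capacity);
    }

    // Atomically reserve `length` bytes by bumping the tail.
    // Returns the start offset of the reserved region, or -1 if it does not fit.
    int claim(final int length)
    {
        final long offset = tail.getAndAdd(length);
        return offset + length <= log.capacity() ? (int)offset : -1;
    }

    // Copy path: the message already lives in `src` and is copied into the log.
    int offer(final byte[] src)
    {
        final int offset = claim(src.length);
        if (offset >= 0)
        {
            final ByteBuffer view = log.duplicate(); // independent position, shared content
            view.position(offset);
            view.put(src);
        }
        return offset;
    }

    // Direct path: claim first, then encode the (ASCII) message in place,
    // so no intermediate buffer is allocated.
    int offerAscii(final String message)
    {
        final int offset = claim(message.length());
        if (offset >= 0)
        {
            for (int i = 0; i < message.length(); i++)
            {
                log.put(offset + i, (byte)message.charAt(i));
            }
        }
        return offset;
    }

    byte byteAt(final int index)
    {
        return log.get(index);
    }

    public static void main(final String[] args)
    {
        final ClaimSketch sketch = new ClaimSketch(256);
        final int first = sketch.offer("Hello ".getBytes(StandardCharsets.US_ASCII));
        final int second = sketch.offerAscii("World!");
        System.out.println("copied at offset " + first + ", written in place at offset " + second);
    }
}
```

Because each caller obtains a disjoint region from the atomic add, several threads can write concurrently without a lock. Both paths pay for the same claim; the direct path just skips the extra copy.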