coherence-community / coherence-incubator
Coherence Incubator
While running the Push Replication tests I noticed the following:
"The parent directory of the specified log file "/Users/narliss/dev/git/coherence-incubator/coherence-pushreplicationpattern/testActiveActiveCR-NY.log" does not exist; using System.out for log output instead."
Looking at the AbstractPushReplicationTest I found that we have a system log setting to use this file. This needs to be removed.
Currently we support one or the other, not both.
Supporting both would make batch processing and individual processing possible.
It can sometimes be useful to restrict the number of subscribers in a Queue destination. An example would be to create multiple queues with a single subscriber each, and hash each object to one of them (which can give a weak guarantee of in-order processing for multiple versions of the same object).
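The hashing idea above can be sketched in plain Java. This is only an illustration, not Messaging Pattern code: the `QueueRouter` class and the `queue-N` names are hypothetical, and it assumes that objects which must stay in order share a key with a stable `hashCode()`.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical router that maps a key onto one of a fixed set of single-subscriber queues. */
final class QueueRouter {
    private final List<String> queueNames = new ArrayList<>();

    QueueRouter(int queueCount) {
        if (queueCount <= 0) {
            throw new IllegalArgumentException("queueCount must be positive");
        }
        for (int i = 0; i < queueCount; i++) {
            queueNames.add("queue-" + i); // illustrative naming only
        }
    }

    /** The same key always maps to the same queue, so its versions are consumed by one subscriber. */
    String queueFor(Object key) {
        // floorMod keeps the index non-negative even for negative hash codes
        int index = Math.floorMod(key.hashCode(), queueNames.size());
        return queueNames.get(index);
    }
}
```

The guarantee is weak because it only holds while the number of queues stays fixed and each queue really has a single subscriber.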
As outlined here: https://forums.oracle.com/forums/thread.jspa?threadID=2469183&tstart=0
This is an easy fix
Create a how-to for coherence-community developers
Migrate the messaging pattern to 12.1.2
Migrate the EventDistribution pattern to 12.1.2
To redeploy a cluster member correctly, you need to detach the member from the cluster programmatically by calling CacheFactory.shutdown() when the application is undeployed. CacheFactory.shutdown() then calls the stop() methods of the distributed services that run on the leaving member.
Because the stop() method is run by a service thread, no reentrant service calls should be made inside the stop() method, to avoid deadlocks.
THE PROBLEM
CommandExecutor.stop() contains a call to CacheFactory.ensureCluster(), which is a service call within a service call (thus, a reentrant call):
public void stop() {
    if (Logger.isEnabled(Logger.DEBUG)) Logger.log(Logger.DEBUG, "Stopping CommandExecutor for %s", contextIdentifier);

    //stop immediately
    setState(State.Stopped);

    //this CommandExecutor must not be available any further to other threads
    CommandExecutorManager.removeCommandExecutor(this.getContextIdentifier());

    //unregister JMX mbean for the CommandExecutor
    Registry registry = CacheFactory.ensureCluster().getManagement(); // THIS IS THE SERVICE CALL
    if (registry != null) {
        if (Logger.isEnabled(Logger.DEBUG)) Logger.log(Logger.DEBUG, "Unregistering JMX management extensions for CommandExecutor %s", contextIdentifier);
        registry.unregister(getMBeanName());
    }

    if (Logger.isEnabled(Logger.DEBUG)) Logger.log(Logger.DEBUG, "Stopped CommandExecutor for %s", contextIdentifier);
}
If the distributed service used to support the command pattern is configured with a single thread (as it is by default), this call will produce a deadlock with a thread dump like this:
Thread[DistributedCache:DistributedCacheForCommandPattern|SERVICE_STOPPING,5,Cluster]
com.tangosol.net.CacheFactory.ensureCluster(CacheFactory.java:424)
com.oracle.coherence.patterns.command.internal.CommandExecutor.stop(CommandExecutor.java:671)
...
DIAGNOSTIC (AND POTENTIAL SOLUTION)
I've changed the CommandExecutor.stop() method to use a non-blocking call to obtain the Cluster:
    Registry registry = CacheFactory.getCluster() != null ? CacheFactory.getCluster().getManagement() : null;
Because CacheFactory.getCluster() is not a blocking service call, the deadlock is avoided.
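The general hazard described in this issue can be reproduced with plain JDK classes. The sketch below is an analogy only and uses no Coherence API: a task running on a single-threaded executor submits more work to the same executor and waits for it. The `ReentrantCallDemo` class is hypothetical, and a timeout stands in for what would otherwise be an indefinite hang.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Plain-JDK analogy of a blocking re-entrant call made from a single service thread. */
final class ReentrantCallDemo {

    /** Returns true if the nested blocking call could not complete within the timeout. */
    static boolean blockingReentrantCallStalls(long timeoutMillis) throws Exception {
        ExecutorService service = Executors.newSingleThreadExecutor();
        try {
            Future<Boolean> outer = service.submit(() -> {
                // Re-entrant call: queue more work on the same single-threaded
                // service and block until it finishes.
                Future<String> inner = service.submit(() -> "cluster");
                try {
                    inner.get(timeoutMillis, TimeUnit.MILLISECONDS);
                    return false;
                } catch (TimeoutException e) {
                    // The only worker thread is busy running this outer task,
                    // so the inner task never got a chance to start.
                    return true;
                }
            });
            return outer.get();
        } finally {
            service.shutdownNow();
        }
    }
}
```

Without the timeout the outer task would wait forever, which is the shape of the thread dump shown above. Reading state that is already available, instead of asking the service to do more work, sidesteps the wait.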
Ensure we have social media announcements for incubator 11
As reported by Richard Carless:
@Test
public void testSimpleJNDILookup() throws NamingException
{
    System.setProperty(Context.INITIAL_CONTEXT_FACTORY, "com.sun.jndi.dns.DnsContextFactory");
    Context ctx = new InitialContext();
    // Assert.assertNotNull(ctx.lookup("dns:///www.oracle.com")); /// <--- FAILS
}
Instead we should change this to use a local address or the local file system.
Also, com.sun.jndi.dns.DnsContextFactory is an internal class, so removing the reference to it would make the generated warnings go away.
For Incubator 12 we've decided to separate out the LiveObject pattern into its own module (from Commons), with separate documentation, testing etc.
This will make it much more easily consumable.
We need to make sure that #48 does not occur in other patterns. From what I can tell, it looks like the Processing Pattern is the only other place where this possibly happens.
We have a large number of tests that we should migrate over to use JUnit instead of the internal Oracle Framework.
It would be nice to have these as part of our automated builds instead of running them manually. Additionally, it would be nice for the public to have access to them (they can't access TestLogics and the testing infrastructure we have).
Convert the messaging pattern to use coherence namespaces and config functionality
From the Oracle Forum: https://forums.oracle.com/forums/thread.jspa?threadID=2486558&tstart=0
I am trying to use static class factories and I found that StaticFactoryClassSchemeBasedParameterizedBuilder.writeExternal(...) is incorrect.
/**
 * {@inheritDoc}
 */
@Override
public void writeExternal(PofWriter writer) throws IOException
{
    writer.writeObject(1, factoryClassName);
    writer.writeObject(2, factoryMethodName);
    writer.writeObject(2, parameters);
}
The last call should be writer.writeObject(3, parameters), since readExternal reads the parameters from index 3.
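One way to make this kind of mismatch harder to introduce is to share named index constants between the write and read sides. The sketch below is a stand-alone illustration and does not use the Coherence POF API: `FactoryBuilderState` is a hypothetical class, and a `Map<Integer, Object>` stands in for the serialized stream.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in showing write and read sharing the same property indices. */
final class FactoryBuilderState {
    // One named constant per property, used by both writeTo and readFrom.
    static final int FACTORY_CLASS_NAME = 1;
    static final int FACTORY_METHOD_NAME = 2;
    static final int PARAMETERS = 3;

    String factoryClassName;
    String factoryMethodName;
    Object parameters;

    void writeTo(Map<Integer, Object> out) {
        put(out, FACTORY_CLASS_NAME, factoryClassName);
        put(out, FACTORY_METHOD_NAME, factoryMethodName);
        put(out, PARAMETERS, parameters);
    }

    void readFrom(Map<Integer, Object> in) {
        factoryClassName = (String) in.get(FACTORY_CLASS_NAME);
        factoryMethodName = (String) in.get(FACTORY_METHOD_NAME);
        parameters = in.get(PARAMETERS);
    }

    // This sketch rejects a repeated index so a copy-paste slip fails fast.
    private static void put(Map<Integer, Object> out, int index, Object value) {
        if (out.containsKey(index)) {
            throw new IllegalStateException("index " + index + " written twice");
        }
        out.put(index, value);
    }
}
```

A round-trip test (write, read into a fresh instance, compare fields) would also catch the original bug.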
As discovered by Rich Carless, our current JMS dependency isn't on a jar-based maven artifact. It's simply a pom. This means that unless someone has the JMS jar installed, the EventDistributionPattern won't build.
Provide a way to programmatically determine if publishing for a cache is on or off.
This should be done in the event-distributor.
Ideally what we'd like to see is some kind of Failure Policy (callback) that provides:
a). the number of consecutive failures (thus far).
b). the amount of time over which those consecutive failures have occurred.
c). the total number of failures.
d). the amount of time over which all of those failures have occurred.
e). the number of events currently queued.
f). the exception that occurred / caused the failure.
g). an option over what to do next, those being: "continue" to distribute events, "suspend" distribution of events, "stop" distribution of events.
Using this, developers can control and intercept what should happen when failures occur.
The default FailurePolicy would be implemented much as we do now: suspend after a number of consecutive failures.
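The list above could translate into a callback along the following lines. This is a rough sketch of one possible shape, not an existing Event Distribution API: every type and member name here (`FailureAction`, `FailureContext`, `SuspendAfterConsecutiveFailures`, and the method on `FailurePolicy`) is hypothetical.

```java
import java.time.Duration;

/** (g) What the distributor should do next. */
enum FailureAction { CONTINUE, SUSPEND, STOP }

/** Items (a) to (f) handed to the policy on each failure. */
final class FailureContext {
    final int consecutiveFailures;             // (a)
    final Duration consecutiveFailureDuration; // (b)
    final long totalFailures;                  // (c)
    final Duration totalFailureDuration;       // (d)
    final int queuedEvents;                    // (e)
    final Throwable cause;                     // (f)

    FailureContext(int consecutiveFailures, Duration consecutiveFailureDuration,
                   long totalFailures, Duration totalFailureDuration,
                   int queuedEvents, Throwable cause) {
        this.consecutiveFailures = consecutiveFailures;
        this.consecutiveFailureDuration = consecutiveFailureDuration;
        this.totalFailures = totalFailures;
        this.totalFailureDuration = totalFailureDuration;
        this.queuedEvents = queuedEvents;
        this.cause = cause;
    }
}

/** Callback invoked whenever distribution of events fails. */
interface FailurePolicy {
    FailureAction onFailure(FailureContext context);
}

/** Default-style policy: suspend after a number of consecutive failures. */
final class SuspendAfterConsecutiveFailures implements FailurePolicy {
    private final int threshold;

    SuspendAfterConsecutiveFailures(int threshold) {
        this.threshold = threshold;
    }

    @Override
    public FailureAction onFailure(FailureContext context) {
        return context.consecutiveFailures >= threshold
                ? FailureAction.SUSPEND
                : FailureAction.CONTINUE;
    }
}
```

Passing an immutable context object rather than seven separate arguments would let new fields be added later without breaking existing policies.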
We need to update to support the latest github API - without this we can't publish documentation
During a code review we discovered that LifecycleAwareEvents don't provide a callback should an Event cause an error/exception within a NonBlockingFiniteStateMachine.
There are two issues:
1). The interface requires the method
2). The existing methods don't provide information about the State they are in.
This small (breaking) change resolves this issue.
Create a checkstyle project for the coherence community
We need to add the following script to the site.xml:
<script type="text/javascript">
var _gaq = _gaq || [];
_gaq.push(['_setAccount', 'UA-39051314-1']);
_gaq.push(['_trackPageview']);
(function() {
var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
})();
</script>
We should then re-release/update the 11.0.0 source code.
We need to announce the Incubator Release on the Incubator Forums
Blog about the incubator 11 release
As reported here:
https://forums.oracle.com/forums/thread.jspa?threadID=2509074&tstart=0
After doing some testing I discovered a "todo" in the XmlPreprocessingNamespaceHandler.mergeCacheConfig() method that describes how it should handle (or rather, does not handle) non-Coherence-based XML Namespaces, i.e. anything that is not a cache mapping or cache scheme.
Here's my suggestion. Probably the best place to handle this is by deferring it to the foreign (i.e. other namespace) NamespaceHandlers themselves. We could do this by introducing an optional interface for them to implement that allows them to perform customized "merging".
Migrate the Processing Pattern to 12.1.2
We've done this for a few modules, but it would be nice if all were done the same way.
That is:
As reported by Rich Carless:
I've been trying to run the samples from the incubator, however there is a bug in the application launcher code.
The problem is in Runner.java; the following line tries to print out the system properties:
System.out.printf("Using System Properties : " + System.getProperties() + "\n");
However, because printf is used, it tries to interpret the strings inside the system properties. So if there is a \s or any kind of escape sequence, the code will throw an exception. Under Windows the following system properties cause a problem:
:\Windows%
system32;
I have changed the code to use println and all is fine.
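The failure mode can be reproduced without the launcher. In `java.util.Formatter`, the `%` character starts a format specifier, so concatenating arbitrary text into the format string is fragile. The `PrintfPitfall` class and the sample path below are made up for illustration.

```java
import java.util.IllegalFormatException;

/** Shows why arbitrary text should be passed as an argument, not as part of the format string. */
final class PrintfPitfall {

    /** Mirrors the reported pattern: the text becomes part of the format string. */
    static boolean formatFails(String text) {
        try {
            String.format("Using System Properties : " + text + "\n");
            return false;
        } catch (IllegalFormatException e) {
            return true; // a stray '%' was parsed as the start of a format specifier
        }
    }

    /** Safe variant: a constant format string with the text supplied as an argument. */
    static String safeFormat(String text) {
        return String.format("Using System Properties : %s%n", text);
    }
}
```

Switching to `println`, as the reporter did, avoids format parsing entirely; keeping `printf` with a constant `"%s%n"` format is an equivalent fix.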
Convert all BackingMapListener usages to use UEM events
Using GraphViz we can easily support rendering our Finite State Machines as UML diagrams.
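As a rough idea of what such rendering involves, the sketch below emits GraphViz DOT text for a list of transitions. It is not based on the Incubator's Finite State Machine classes: the `DotRenderer` class and the `{from, to, label}` array layout are assumptions made for this example.

```java
/** Hypothetical helper that turns state transitions into GraphViz DOT source. */
final class DotRenderer {

    /**
     * @param name        graph name (assumed to be a valid DOT identifier)
     * @param transitions rows of {fromState, toState, transitionLabel}
     */
    static String toDot(String name, String[][] transitions) {
        StringBuilder dot = new StringBuilder("digraph " + name + " {\n");
        for (String[] t : transitions) {
            // Each transition becomes a labelled directed edge between two states.
            dot.append("  \"").append(t[0]).append("\" -> \"").append(t[1])
               .append("\" [label=\"").append(t[2]).append("\"];\n");
        }
        dot.append("}\n");
        return dot.toString();
    }
}
```

The resulting text can be saved to a file and rendered with the GraphViz `dot` tool, for example `dot -Tpng machine.dot -o machine.png`.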
Create automated builds for incubator 11 and 12 out of git
Convert the messaging pattern's use of BackingMapListeners to use UEM events
Without this we can't really release coherence-incubator-all to maven.java.net as it forces us to release sources and javadoc.
On some platforms (perhaps Oracle Enterprise Linux?), virtualized network infrastructure doesn't correctly respect Java network settings and thus doesn't correctly isolate Coherence clusters.
This may cause unexpected Coherence clustering to occur which may lead to test failures.
We need to make sure that we isolate every test (perhaps turn off clustering) to ensure we can build on those platforms.
For the most part this is not a problem on: Mac OS X, Ubuntu and Windows (XP or Vista) with Java 6, 7 or 8.
Convert the processing pattern initialization/lifecycle framework to depend on UEM and activation events
Re-add the coherence*web examples back into inc-11
Randy suggested that we should move to version 4 from version 3
As discovered by Reon Campell, for some reason Push Replication / Event Distribution is unnecessarily deserializing events during the replication process. There is no requirement for this unless a custom Transformer / Conflict Resolver is being used.
This issue makes it hard to use pure C++/.net-based applications as it forces developers to implement Java server-side classes.
As identified by Hysun He here:
https://forums.oracle.com/forums/thread.jspa?threadID=2503642&tstart=0
This appears to be a problem in that there's no base case to prevent infinite sums occurring. This simply requires fixing the ConflictResolver.
Given that we have support for Partition-level Transactional Events, we should add support for this when using LiveObjects.
i.e. we should provide the ability for a LiveObject to handle the transaction in which it is being committed and has been committed.
@OnCommitting
public void onCommitting(Set<BinaryEntry> entries);
@OnCommitted
public void onCommitted(Set<BinaryEntry> entries);
We need to upgrade to use Oracle Tools 1.0.0 as this provides several fixes and introduces support for improved Coherence-based JUnit Test Cluster Isolation.
Currently the LiveObject annotated methods receive the underlying LiveEvent Event, which includes all of the entries. This is OK for simple events, but for those that contain multiple entries, each LiveObject receives every entry. This is undesirable as it's next to impossible to correlate the LiveObject with its entry.
Instead we should change the signature of the annotated methods to be:
public void onEvent(BinaryEntry entry);
And then change the LiveObjectEventInterceptor to pass only the appropriate BinaryEntry to the LiveObject.
Jonathan Knight identified and provided some new tests for the ConfigurableCacheFactory that we should include in the Incubator, especially Incubator 12 that uses a different CCF implementation.
Convert the processing pattern to use coherence namespaces and config functionality
We should upgrade to use JUnit 4.10 and Hamcrest 1.3
During testing of rolling restarts with empty partitions we noticed that an NPE may be raised when using a LiveObject. This is caused by a for-loop not pre-checking for a null collection.
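The guard this kind of fix usually needs is small. The sketch below shows the general pattern only: the `NullSafeIteration` class and its method names are hypothetical, and a collection of strings stands in for the entries involved in the actual code path.

```java
import java.util.Collection;
import java.util.Collections;

/** Illustrates guarding a for-each loop against a null collection. */
final class NullSafeIteration {

    /** Substitutes an empty collection for null so callers can iterate unconditionally. */
    static <T> Collection<T> orEmpty(Collection<T> collection) {
        return collection == null ? Collections.<T>emptyList() : collection;
    }

    /** Counts entries; a null collection (e.g. nothing for an empty partition) counts as zero. */
    static int countEntries(Collection<String> entries) {
        int count = 0;
        for (String entry : orEmpty(entries)) {
            count++;
        }
        return count;
    }
}
```

An explicit `if (entries != null)` check before the loop works just as well; the helper is merely convenient when several loops need the same guard.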
We should perhaps consider splitting the Amazon EC2 Cloud support integration into a separate Maven module as it doesn't really need to be part of Coherence Common.
eg: coherence-cloud-amazon
Fix Messaging Pattern Performance and throughput by reducing contention and serialization points
Migrate Push Replication Pattern to 12.1.2