
artio's People

Contributors

alx-ut, amita396, andreynudko, ariksher-nex, ashananin, brentdouglasb1, clifford, dcullender-cb, dunkymole, guoguanglei, jpwatson, michaelszymczak, mjpt777, motocodeltd, mprusakov-rbc, nickelaway, nickward, pcdv, rhbradford, richardwarburton, romanmarkunas, tmontgomery, tom-smalls, vdaniloff, vitor-tadashi, vyazelenko, wlukowicz, wojciech-adaptive, zachbray, zamhassam


artio's Issues

Application Level Flow Control

Compute the delta between the slow consumer's position and the normal position, and offer this as application-level flow control to the Session.send method.

Group Validation

Currently the parser doesn't completely validate group messages, even with validation mode enabled.

Make the FIX Library only aware of Sessions

Session acquisition should be about sessions, and should be able to acquire sessions rather than connections. Users of the gateway should know enough information to be able to acquire sessions based upon sender and target comp id.

Connection ids might be removable from some Gateway protocol messages.

Expose the High Water Mark of sent messages

When a library or application restarts or fails over, whatever picks up the sessions in question needs to know what has been sent to the client and what hasn't.

So we should be able to expose high water marks of what messages have been sent out via TCP. Probably best for this to be a periodic tick stream of update information.

Evaluate timeouts/failures for key framer thread operations

Some framer operations (for example, acquiring a session and resetting the framer id) wait for the archival mechanism to update some state or catch up. They use idle strategy wait loops with no bounds. This could cause unbounded latency on critical path code or nasty gateway pauses.

Perhaps think about timeouts and failures or having a kind of duty cycle where these operations are checked and replied to?

Fixing this issue isn't critical as these aren't common operations.
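A minimal sketch of what a bounded wait could look like, assuming a plain polling loop. The class and method names are hypothetical and `Thread.yield()` is only a stand-in for a real idle strategy; this is not the framer's actual code.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper: not part of the gateway.
final class BoundedWait
{
    /**
     * Polls the condition until it holds or the timeout elapses.
     * Returns true if the condition was met, false on timeout.
     */
    static boolean awaitWithTimeout(final BooleanSupplier condition, final long timeoutNs)
    {
        final long deadlineNs = System.nanoTime() + timeoutNs;
        while (!condition.getAsBoolean())
        {
            if (System.nanoTime() - deadlineNs >= 0)
            {
                // The caller can reply with a failure instead of blocking forever.
                return false;
            }
            Thread.yield(); // stand-in for an idle strategy
        }
        return true;
    }
}
```

The caller decides what a timeout means, for example replying to the library with an error rather than stalling the framer thread.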

Engine acceptor does not handle initiator disconnects gracefully

I start an acceptor and connect to it with a FIX test client. The FIX session is successfully established. I then kill the test client.

The server does not handle the resulting IOException in the socket channel read gracefully. I see an infinite loop of the following:

2015-09-09T17:44:08.539: java.io.IOException(An existing connection was forcibly closed by the remote host)
sun.nio.ch.SocketDispatcher.read0(SocketDispatcher.java:-2)
sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:43)
sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
sun.nio.ch.IOUtil.read(IOUtil.java:192)
sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
uk.co.real_logic.fix_gateway.engine.framer.ReceiverEndPoint.readData(ReceiverEndPoint.java:143)
uk.co.real_logic.fix_gateway.engine.framer.ReceiverEndPoint.pollForData(ReceiverEndPoint.java:124)
uk.co.real_logic.fix_gateway.engine.framer.ReceiverEndPointPoller.pollEndPoints(ReceiverEndPointPoller.java:118)
uk.co.real_logic.fix_gateway.engine.framer.Framer.pollEndPoints(Framer.java:197)
uk.co.real_logic.fix_gateway.engine.framer.Framer.doWork(Framer.java:137)
uk.co.real_logic.agrona.concurrent.AgentRunner.run(AgentRunner.java:105)
java.lang.Thread.run(Thread.java:745)

Test against FIX 4.2

So far testing has primarily focused on FIX 4.4. We should evaluate behavioural differences with FIX 4.2 and enable acceptance tests in the fix integration project accordingly.

Intermittent failure of system test: ClusterReplicationTest

uk.co.real_logic.fix_gateway.replication.ClusterReplicationTest > shouldEstablishCluster FAILED
    java.lang.AssertionError: 1 and 3 disagree on leader expected:<-1125472483> but was:<0>
        at org.junit.Assert.fail(Assert.java:88)
        at org.junit.Assert.failNotEquals(Assert.java:834)
        at org.junit.Assert.assertEquals(Assert.java:645)
        at uk.co.real_logic.fix_gateway.replication.ClusterReplicationTest.assertAllNodesSeeSameLeader(ClusterReplicationTest.java:263)
        at uk.co.real_logic.fix_gateway.replication.ClusterReplicationTest.checkClusterStable(ClusterReplicationTest.java:237)
        at uk.co.real_logic.fix_gateway.replication.ClusterReplicationTest.shouldEstablishCluster(ClusterReplicationTest.java:64)

Add ability to catch up for anything on the indexing thread

Currently it is possible, albeit unlikely, to fail to resend a message if the resending process is far enough behind in indexing. The resender should be able to catch up with its indexing or process "in flight" messages for resending.

Productionise Latency Histogram logging

Currently we have a performance-testing-oriented latency histogram setup that can be enabled/disabled and prints the latency histogram to the command line.

We should dump out latency histograms periodically to a binary log that can be independently read/printed/monitored.

Why is transferTo causing memory mapped file allocation?

At the moment our use of transferTo in the archiver doesn't correspond to a single system call; it also involves allocating memory mapped files and copying through them. We need to identify why the JDK is doing this and convince it to use the sendfile system call.

Make FIX Libraries modal

FIX Library instances should be configured to either be an acceptor or an initiator. The gateway should verify that only one acceptor has connected at any point in time.

Performance Analysis and Tuning

We need to establish that we're meeting our performance SLAs when benchmarking on non-resource-constrained hardware. This includes adding any further monitoring needed to identify the sources of bottlenecks that affect those SLAs.

Annotate generated code with @Generated

It would be nice to annotate the generated codec classes with @javax.annotation.Generated. This will help stop IDEs from running code inspections on them.

Message Log Checksumming and Corruption Validation

At the moment we don't checksum components of the message log. This means that a disk failure or hard machine power-off could corrupt the message log without us being able to detect the corruption or ignore corrupted messages.

We should incrementally checksum the message log and be able to skip and report on corrupted log messages.
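As a rough illustration, a per-record checksum could be computed with the JDK's `java.util.zip.CRC32` and compared on read. The class name and the idea of storing one checksum per record are assumptions for this sketch, not a description of the message log format.

```java
import java.util.zip.CRC32;

// Hypothetical helper: one CRC32 per log record.
final class RecordChecksum
{
    /** Computes a CRC32 over the record bytes in the given range. */
    static long checksum(final byte[] buffer, final int offset, final int length)
    {
        final CRC32 crc = new CRC32();
        crc.update(buffer, offset, length);
        return crc.getValue();
    }

    /** Returns true if the record still matches the checksum stored at write time. */
    static boolean isIntact(
        final byte[] buffer, final int offset, final int length, final long storedChecksum)
    {
        return checksum(buffer, offset, length) == storedChecksum;
    }
}
```

A reader that finds a mismatch can skip the record and report it, rather than handing corrupted bytes to a resend.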

Should we do time validation?

Quickfix rejects connections if their time is outside of a certain latency band (both past and future). Should we also validate the sending time of messages?
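If we did validate sending time, the check itself could be as small as the sketch below. The names and the symmetric tolerance are assumptions made for illustration; they are not taken from QuickFIX or from the gateway.

```java
// Hypothetical validator: rejects sending times too far in the past or the future.
final class SendingTimeValidator
{
    /** Returns true if sendingTimeMs is within toleranceMs of nowMs, in either direction. */
    static boolean isWithinWindow(final long sendingTimeMs, final long nowMs, final long toleranceMs)
    {
        return Math.abs(nowMs - sendingTimeMs) <= toleranceMs;
    }
}
```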

Clustering and Reliability Support

We need to implement support for clustering a series of gateways in a reliable fashion. This includes:

  • Electing a leader out of a series of cluster members
  • Being able to have acknowledged writes across this cluster using the Raft consensus algorithm
  • Being able to have unacknowledged asynchronous replication of data across the cluster
  • Ability to perform administrative queries over the cluster to find out who is the leader etc
  • Testing the cluster in partitioned/timeout scenarios.

Unexpected interpretation of scale in DecimalFloat

The interpretation of scale in DecimalFloat.toString is unexpected and inconsistent with the way scale is used in classes such as BigDecimal.

I would expect new DecimalFloat(12345, 2) to have a string representation of "123.45". Instead it has "12.345", i.e. scale moves the decimal point from the left instead of from the right.
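For comparison, this is how `BigDecimal` treats the same unscaled value and scale. The wrapper class is only there to make the snippet self-contained.

```java
import java.math.BigDecimal;

final class ScaleExample
{
    /** Formats unscaledValue * 10^-scale, which is how BigDecimal defines scale. */
    static String format(final long unscaledValue, final int scale)
    {
        return BigDecimal.valueOf(unscaledValue, scale).toPlainString();
    }
}
```

`ScaleExample.format(12345, 2)` returns "123.45": the scale counts digits to the right of the decimal point.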

Split out the different SBE Schemas

We currently have the SBE schemas for the gateway messaging protocol, archive system and raft protocol all in the same message-schema.xml file. They should be decoupled into appropriate single-purpose schemas.

Configurable Sequence Number Resets

  • Implement a per-session configuration option that switches between persisting the sequence numbers and resetting them upon connect.
  • Add an API method to reset the sequence number.

See #8 for details of persistence over gateway restarts.

Implement a solution for the slow consumer problem

If a single FIX Connection is failing to read off its channel fast enough and is blocking sending to other clients, then we should cut off and disconnect the client in question. We should also log why we cut off this client.
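One possible shape for the detection, assuming we track how many bytes are queued but not yet written for each connection. The class, its threshold, and the reason string are all hypothetical; the real policy might be based on time or on failed write attempts instead.

```java
// Hypothetical per-connection tracker for a slow consumer cut-off.
final class SlowConsumerDetector
{
    private final long maxPendingBytes;
    private long pendingBytes;

    SlowConsumerDetector(final long maxPendingBytes)
    {
        this.maxPendingBytes = maxPendingBytes;
    }

    /** Called when data for this connection is queued for sending. */
    void onBytesQueued(final int length)
    {
        pendingBytes += length;
    }

    /** Called when the socket has accepted some of the queued data. */
    void onBytesWritten(final int length)
    {
        pendingBytes -= length;
    }

    /** True once the backlog exceeds the configured limit. */
    boolean shouldDisconnect()
    {
        return pendingBytes > maxPendingBytes;
    }

    /** Text suitable for logging why the client was cut off. */
    String disconnectReason()
    {
        return "slow consumer: " + pendingBytes + " bytes pending, limit " + maxPendingBytes;
    }
}
```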

Cannot launch FixEngine and FixLibrary in same process with monitor file defaults

When I launch a FixEngine and FixLibrary instance without setting monitoringFile explicitly to different values the library will not initialise because it's trying to map the same file and, at least on Windows, I get this:

Exception in thread "main" java.nio.file.FileSystemException: C:\Users\43854743\AppData\Local\Temp\fix\monitoring: The process cannot access the file because it is being used by another process.

    at sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:86)
    at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:97)
    at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:102)
    at sun.nio.fs.WindowsFileSystemProvider.implDelete(WindowsFileSystemProvider.java:269)
    at sun.nio.fs.AbstractFileSystemProvider.delete(AbstractFileSystemProvider.java:103)
    at java.nio.file.Files.delete(Files.java:1126)
    at uk.co.real_logic.agrona.IoUtil.deleteIfExists(IoUtil.java:167)
    at uk.co.real_logic.fix_gateway.MonitoringFile.<init>(MonitoringFile.java:53)
    at uk.co.real_logic.fix_gateway.GatewayProcess.initMonitoring(GatewayProcess.java:49)
    at uk.co.real_logic.fix_gateway.GatewayProcess.init(GatewayProcess.java:42)
    at uk.co.real_logic.fix_gateway.library.FixLibrary.<init>(FixLibrary.java:75)
    at com.hsbc.efx.erisk.lfix.server.Server.boot(Server.java:37)
    at com.hsbc.efx.erisk.lfix.server.Server.main(Server.java:27)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:483)
    at com.intellij.rt.execution.application.AppMain.main(AppMain.java:140)

The other process in this case is the current process. Should engine and library be able to share the same monitor file? If not, should the default monitor file path be different for EngineConfiguration and LibraryConfiguration?

Agree upon preferred API style

We need to agree upon the Parser API style.

Context
The parser is extracting information from a FIX message and needs to transfer that information to an application using the gateway. We'll be calling handlers which the application implements in order to transfer the information.

The question: what API style should the handlers use? There are two proposed options.

1. Generic Callback API
In this proposal there would be a single callback interface. You get notified when the message starts and receive a callback for each field that is parsed. Since FIX can contain repeating groups, there also need to be callbacks at the beginning of a group.

We've put together a simple sample of how an application might use the API. There are comments inline on the file to explain each step. There is also a sample acceptor which shows how you might use the generic callback based API.

Implementations could be registered against specific message types. The onField callback passes in a buffer, offset and length. We would provide a series of Flyweights over these buffers which would offer specific functionality, for example, parsing a date.

Pros

  • It's very simple, and appropriately so if you're just writing a message out in another format.
  • The API is very generic.
  • Very flexible to missing fields or improper formatting.

Cons

  • Isn't really a very semantic API.
  • Nearly all handler call sites end up being megamorphic in practice.
  • Potentially may result in a lot of dispatch code within the handler, for example a switch at every callback.
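To make the trade-off concrete, here is a rough sketch of the generic style. The interface and handler names are hypothetical, a `byte[]` stands in for whatever buffer type the gateway would pass, and only the `onField(buffer, offset, length)` shape comes from the description above. Tags 55 (Symbol) and 38 (OrderQty) are standard FIX tags.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical generic callback interface.
interface GenericFieldHandler
{
    void onStartMessage(String messageType);

    void onField(int tag, byte[] buffer, int offset, int length);

    void onGroupBegin(int tag, int numberOfElements);
}

// Example handler: note the switch needed to dispatch on every field.
final class OrderSummaryHandler implements GenericFieldHandler
{
    final StringBuilder summary = new StringBuilder();

    public void onStartMessage(final String messageType)
    {
        summary.setLength(0);
        summary.append("type=").append(messageType).append(';');
    }

    public void onField(final int tag, final byte[] buffer, final int offset, final int length)
    {
        final String value = new String(buffer, offset, length, StandardCharsets.US_ASCII);
        switch (tag)
        {
            case 55: summary.append("symbol=").append(value).append(';'); break;
            case 38: summary.append("orderQty=").append(value).append(';'); break;
            default: break; // fields we don't care about are ignored
        }
    }

    public void onGroupBegin(final int tag, final int numberOfElements)
    {
        // no repeating groups handled in this sketch
    }
}
```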

2. Dictionary Generated API

In this proposal there would be a callback interface, with a method for each message type. A Decoder class would be generated for each message. Applications would implement these callback interfaces and consume the decoder objects.

There's also a simple sample of how an application might use the API, mostly similar to the previous example. There is also a sample acceptor which shows how you might use this callback based API and is quite a bit different.

In order to allow users to configure the API for only the message types they are interested in, we would generate these interfaces from a dictionary of message types and fields. Users could customise the dictionary in order to ignore specific fields and elide the cost of parsing things like formatted dates. The dictionary follows the standard FIX XML-based dictionary format.

Pros

  • Handles the dispatch logic better.
  • Potentially faster because every callsite can be Monomorphic and the generation process allows for optimal code to be generated.
  • Follows the semantics and naming of the ubiquitous domain.

Cons

  • More complexity in the development process for us to produce.
  • Requires end user configuration of dictionary for optimal performance.
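For contrast with the generic style, a sketch of what the generated style might look like to an application. Every name here is hypothetical: the decoders are hand-written stand-ins for what a generator would emit from the dictionary.

```java
// Stand-in for a generated decoder; a real one would read from a buffer.
final class NewOrderSingleDecoder
{
    private final String symbol;
    private final long orderQty;

    NewOrderSingleDecoder(final String symbol, final long orderQty)
    {
        this.symbol = symbol;
        this.orderQty = orderQty;
    }

    String symbol()
    {
        return symbol;
    }

    long orderQty()
    {
        return orderQty;
    }
}

// Stand-in for a second generated decoder.
final class HeartbeatDecoder
{
}

// Hypothetical generated callback interface: one method per message type.
interface GeneratedMessageHandler
{
    void onNewOrderSingle(NewOrderSingleDecoder decoder);

    void onHeartbeat(HeartbeatDecoder decoder);
}

// Application code: no switch, each method sees a typed decoder.
final class QuantityTotaller implements GeneratedMessageHandler
{
    long totalQuantity;
    int heartbeats;

    public void onNewOrderSingle(final NewOrderSingleDecoder decoder)
    {
        totalQuantity += decoder.orderQty();
    }

    public void onHeartbeat(final HeartbeatDecoder decoder)
    {
        heartbeats++;
    }
}
```

The dispatch on message type happens before the handler is called, so the application only writes the per-message logic.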

Implement the admin API

This also requires usage documentation and samples. Key features include:

  • -Monitoring sequence numbers of sessions-
  • enumerating libraries from the gateway
  • enumerating sessions from the gateway
  • initiating a session starting at a given sequence number (resetting a session) and resetting sequence numbers of sessions

Generated message encoder/decoder wrong for two-character message types

For FIX message types that have two characters, the header.msgType("..."); code in the generated encoder constructor is incorrect. EncoderGenerator.generateConstructor directly converts the "packed" message type integer to a string. It needs to unpack first.

In the generated decoder the MESSAGE_TYPE_BYTES value is incorrect for the same reason.

The Message.toString method should probably also return the "proper" message type string.
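To illustrate the kind of bug described, here is a sketch of packing and unpacking a one- or two-character message type. The byte layout (first character in the low byte, second in the next byte) is an assumption made for this example and may not match what the generator actually uses.

```java
// Hypothetical packing scheme, for illustration only.
final class MessageTypePacking
{
    /** Packs a one- or two-character ASCII message type into an int. */
    static int pack(final String messageType)
    {
        int packed = messageType.charAt(0);
        if (messageType.length() == 2)
        {
            packed |= messageType.charAt(1) << 8;
        }
        return packed;
    }

    /** Reverses pack(), recovering the original message type string. */
    static String unpack(final int packed)
    {
        final char first = (char)(packed & 0xFF);
        final char second = (char)((packed >>> 8) & 0xFF);
        return second == 0 ? String.valueOf(first) : "" + first + second;
    }
}
```

Converting the packed int straight to a character happens to work for single-character types such as "D", which is why the problem only shows up for two-character types such as "AE".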
