onyx-platform / onyx

Distributed, masterless, high performance, fault tolerant data processing

Home Page: http://www.onyxplatform.org

License: Eclipse Public License 1.0

Languages: Clojure 99.72%, Shell 0.28%

Topics: batch, clojure, data, distributed, streaming

onyx's People

Contributors

bamarco, brianh, bridgethillyer, codonnell, colinhicks, danielcompton, davidrupp, devth, gardnervickers, greywolve, jqmtor, lbradstreet, leathekd, malcolmsparks, malesch, mariusz-jachimowicz-83, metasoarous, michaeldrogalis, mushketyk, nathants, oleschoenburg, owengalenjones, schmee, solatis, sundbry, superstevenz, the-alchemist, tvanhens, vijaykiran, yonatane

onyx's Issues

Throw an exception on bad catalog format

Submitting a catalog that doesn't conform to the specification of the information model throws an unhelpful assertion inside the Coordinator. The catalog should be validated using a library like Schema to obtain helpful error messages.
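A minimal sketch of what friendlier validation could look like, written with plain Clojure predicates rather than Schema so that it stands alone. The function names and the set of required keys are assumptions made for illustration, not the checks Onyx actually performs.

```clojure
;; Assumption: every catalog entry needs at least these keys.
(def required-keys #{:onyx/name :onyx/type})

(defn catalog-errors
  "Returns a vector of human-readable problems; empty when none are found."
  [catalog]
  (if-not (sequential? catalog)
    ["Catalog must be a vector of maps"]
    (vec
     (for [[i entry] (map-indexed vector catalog)
           problem   (if (map? entry)
                       (for [k (sort required-keys)
                             :when (not (contains? entry k))]
                         (str "is missing required key " k))
                       ["is not a map"])]
       (str "Catalog entry " i " " problem)))))

(defn validate-catalog!
  "Throws ex-info carrying every problem found, otherwise returns the catalog."
  [catalog]
  (let [errors (catalog-errors catalog)]
    (when (seq errors)
      (throw (ex-info "Invalid catalog" {:errors errors})))
    catalog))
```

Collecting all problems before throwing lets a user fix a bad catalog in one pass instead of one assertion at a time.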

Virtual peers can be starved on Grouping tasks

This is a particularly rough edge case. If a virtual peer receives a grouping task, it's capable of being starved from receiving the sentinel segment off the queue due to the way that HornetQ pins messages as it groups. If a consumer closes out, it might not necessarily requeue the sentinel in a server node where other consumers can reach it. Hence, the other virtual peers may deadlock and wait forever. This only affects batch mode - streaming mode is fine.

Standardize naming of lifecycle event map

There's been some confusion about the difference between the "event map", "lifecycle event map", and "context map". They are all the same thing. This should be fixed in the docs. I think I'd like to choose "lifecycle event" as the canonical term.

Alternate peer balancing strategy

As of 0.3.0, the only strategy for balancing peers across jobs and tasks is round robin/breadth-first, respectively. This ticket should break out the algorithms used for planning and coordination into functions behind multimethods, and allow for a greedy strategy. A greedy strategy will try to complete an entire job before moving on to the next.
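A rough sketch of how the two strategies could sit behind a multimethod. The job shape ({:id, :peers, :wanted}) and the names next-job and allocate are hypothetical, and "greedy" is simplified here to mean saturating one job with peers before the next job receives any.

```clojure
(defmulti next-job
  "Picks the id of the job that should receive the next free peer.
  jobs is a vector of maps in submission order."
  (fn [strategy jobs] strategy))

(defn- unsaturated
  "Jobs that can still use another peer."
  [jobs]
  (filter #(< (:peers %) (:wanted %)) jobs))

;; Round robin: the job with the fewest peers goes next.
;; sort-by is stable, so ties go to the earliest submitted job.
(defmethod next-job :round-robin [_ jobs]
  (:id (first (sort-by :peers (unsaturated jobs)))))

;; Greedy: keep feeding the earliest job until it is saturated.
(defmethod next-job :greedy [_ jobs]
  (:id (first (unsaturated jobs))))

(defn allocate
  "Hands out up to n peers one at a time; returns a map of job id to peer count."
  [strategy jobs n]
  (loop [jobs (vec jobs) remaining n]
    (let [id (when (pos? remaining) (next-job strategy jobs))]
      (if (nil? id)
        (into {} (map (juxt :id :peers)) jobs)
        (recur (mapv #(if (= (:id %) id) (update % :peers inc) %) jobs)
               (dec remaining))))))
```

Adding a third strategy would then only require another defmethod, with no change to allocate.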

Any task that follows a sequential task emits an exception

With any task preceded by a sequential task, the following exception will be thrown:

org.hornetq.api.core.HornetQInternalErrorException: HQ119000: ClientSession closed while creating session
    type: #<INTERNAL_ERROR>
org.hornetq.core.client.impl.ClientSessionFactoryImpl.createSessionInternal      ClientSessionFactoryImpl.java:  782
        org.hornetq.core.client.impl.ClientSessionFactoryImpl.createSession      ClientSessionFactoryImpl.java:  366
                               sun.reflect.NativeMethodAccessorImpl.invoke0      NativeMethodAccessorImpl.java      
                                sun.reflect.NativeMethodAccessorImpl.invoke      NativeMethodAccessorImpl.java:   57
                            sun.reflect.DelegatingMethodAccessorImpl.invoke  DelegatingMethodAccessorImpl.java:   43
                                            java.lang.reflect.Method.invoke                        Method.java:  606
                                clojure.lang.Reflector.invokeMatchingMethod                     Reflector.java:   93
                           clojure.lang.Reflector.invokeNoArgInstanceMember                     Reflector.java:  313
                                            onyx.queue.hornetq/eval20966/fn                        hornetq.clj:  201
                                                clojure.lang.MultiFn.invoke                       MultiFn.java:  231
                                       onyx.peer.operation/start-lifecycle?                      operation.clj:   55
                                           onyx.peer.transform/eval21156/fn                      transform.clj:   95
                                                clojure.lang.MultiFn.invoke                       MultiFn.java:  231
                    onyx.peer.task-lifecycle-extensions/merge-api-levels/fn      task_lifecycle_extensions.clj:   19
                                             clojure.lang.ArrayChunk.reduce                    ArrayChunk.java:   63
                                                  clojure.core.protocols/fn                      protocols.clj:   98
                                                clojure.core.protocols/fn/G                      protocols.clj:   19

The virtual peer will shut down and instantly reboot, continuing as normal. This bug is mostly harmless. It is caused by the concurrency optimizations set in the HornetQ configuration. The Session Factory is swapped out in favor of a different factory, but the new factory doesn't "stick" for new tasks. Virtual peers reuse the old Session Factory, which has already been closed, so the exception is thrown. After reboot, a fresh Session Factory is used.

Harmless, but annoying to see in the logs.

Task that can complete when only 1 input has been exhausted

Considering a feature that will let a task complete when only one (not all) of its upstream inputs has pushed the sentinel onto the input stream. This would aid use cases where a privileged kill stream is utilized.

Just a placeholder, needs more thought.

Virtual peers can fail to make progress at end of task

Reproduced with the grouping test in Onyx core by turning up the number of virtual peers. Observed that two peers can continually take the sentinel segment off the queue and re-enqueue it infinitely, neither of them able to complete the task.

Maximum peers per task option

The catalog should offer an optional onyx/max-peers parameter that takes an integer value representing the maximum number of peers that may be executing an instance of that task at any single point in time.
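A small sketch of how such a cap might be checked. The catalog entry and the helper can-accept-peer? are hypothetical; only the :onyx/max-peers key comes from the proposal above.

```clojure
;; Hypothetical catalog entry using the proposed parameter.
(def catalog-entry
  {:onyx/name :inc
   :onyx/max-peers 2})

(defn can-accept-peer?
  "True when the task may take one more peer.
  A task without :onyx/max-peers is treated as uncapped."
  [entry active-peers]
  (let [limit (:onyx/max-peers entry)]
    (or (nil? limit) (< active-peers limit))))
```

Treating a missing key as "no limit" keeps the parameter optional, as the issue asks.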

Plugin request: HDFS

This plugin should be capable of reading off the Hadoop file system and writing segments back to it. The point of input and partitioning should be a single file, and the partitioning will happen over the byte sequence representing the file distributed over blocks in the cluster.

Optimize aggregators

Aggregators are at a disadvantage, performance-wise, compared to transformers and groupers. Aggregators can only hold a single session open, which needs to be reused across pipeline iterations in the peer. The reason for this is that if multiple sessions were used, all sessions would need to be read from at the same time, since groupers will pin particular message ids to consumers. These sessions shouldn't be closed, otherwise the messages will be repinned. Further, once the sentinel is read, all other sessions will block indefinitely.

The goal of this issue is to speed up aggregators using an alternate design approach.

Peer can deadlock on task completion

Reproduced in core.async plugin tests with a high number of virtual peers. Sometimes, closing a peer will block as it tries to flush its pipeline. The pipeline will block on reading from an ingress queue. This queue should always provide the sentinel value. Something is hanging on to the sentinel as a consumer and never committing it back to the queue, hence the hang.

Restart peer on failure

When a peer dies, it should attempt to recover by rebooting itself. See core.async test for failure.

Monitoring dashboard

This issue serves as a placeholder for the creation of another repository - onyx-dashboard. This dashboard will serve as a point of monitoring the status of what's happening inside Onyx by querying ZooKeeper. The data in ZooKeeper is immutable, and compressed with Fressian.

Implement kill-job core API function

There needs to be an API function that takes a job ID and halts any peer execution of that job's tasks. The job's tasks will no longer be eligible for execution.

Add :onyx/params to catalog entry

Allow value-level parameterization through the catalog.

[{...
  :my/param 42
  :my/other-param 44
  :onyx/params [:my/param :my/other-param]}]
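A sketch of how the listed keys could be resolved into values. The helpers task-params and apply-task are hypothetical, and passing the parameters ahead of the segment is an assumption; the issue doesn't say how they reach the task function.

```clojure
(def entry
  {:my/param 42
   :my/other-param 44
   :onyx/params [:my/param :my/other-param]})

(defn task-params
  "Looks up each key named in :onyx/params, preserving order."
  [entry]
  (mapv #(get entry %) (:onyx/params entry)))

(defn apply-task
  "Calls f with the resolved parameters followed by the segment (assumed order)."
  [f entry segment]
  (apply f (concat (task-params entry) [segment])))
```

For example, (task-params entry) resolves to [42 44], and an entry without :onyx/params resolves to an empty vector.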

Run Jepsen tests against Onyx

It would be pretty great to Jepsen both HornetQ and Onyx itself. Partitioning virtual peers and coordinators fit the bill nicely.

Reintroduce task timeouts

In a very early version of Onyx, if a full batch of messages didn't accrue within a certain period of time, Onyx would stop reading and time out. This was removed due to a bug in HornetQ that didn't preserve sequential ordering. Timeouts are useful for sparse message streams, so I have found a workaround to add this back in.
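The idea can be sketched against a plain java.util.concurrent queue. This is not the HornetQ workaround the issue refers to; read-batch and its arguments are hypothetical, and the queue stands in for whatever the peer actually reads from.

```clojure
(import '(java.util.concurrent LinkedBlockingQueue TimeUnit))

(defn read-batch
  "Takes up to batch-size items from queue, returning whatever has
  accumulated once timeout-ms has elapsed in total."
  [^LinkedBlockingQueue queue batch-size timeout-ms]
  (let [deadline (+ (System/currentTimeMillis) timeout-ms)]
    (loop [batch []]
      (let [remaining (- deadline (System/currentTimeMillis))]
        (if (or (>= (count batch) batch-size) (<= remaining 0))
          batch
          ;; poll returns nil when nothing arrives within the remaining time
          (if-let [item (.poll queue remaining TimeUnit/MILLISECONDS)]
            (recur (conj batch item))
            batch))))))
```

With a sparse stream, the call returns a partial (possibly empty) batch after the timeout instead of blocking until batch-size items arrive.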

Throw an exception on bad workflow format

Submitting a workflow that doesn't conform to the specification of the information model throws an unhelpful assertion inside the Coordinator. The workflow should be validated using a library like Schema to obtain helpful error messages.

Java API for Onyx

As of release 0.3.0, Clojure is the only supported language for Onyx. Java users can use the APIs that Clojure offers to tap some of the Onyx functionality, but this becomes problematic for areas such as lifecycle extensions that rely on implementations of multimethods.

Furthermore, EDN isn't the friendliest cross-language data format to send catalogs and workflows through. Part of this issue should explore options that Java users have on this front.

Coordinator logging

The Coordinator logs very infrequently as of 0.3.0. Logging using Dire should be implemented on events like job submission, task completion, peer birth/death, etc.

Update masterless examples

Some of the examples are incorrect due to design changes, or require pictures for better explanation.

Plugin request: Kafka

A Kafka plugin should be created that offers both input and output functionality. Additionally, it should be capable of working with Kafka partitions.

Support full DAG workflows

In 0.4.0, we're going to move away from the tree/map based workflow to a vector-of-vectors. This will properly support multi input streams to any task, and continue to support multiple output streams. It will look like this:

[[:in-1 :inc]
 [:in-2 :inc]
 [:in-3 :inc]
 [:inc :out]]

Tasks:

  • Function to convert map to DAG to preserve backward-compatibility
  • Read from HornetQ queues using Go blocks
  • Reserve space in peer local atom to cache sentinel values that have been seen
  • Expand ZooKeeper sentinel reduction process to work across multiple input queues
  • Alter planning to construct queue chains based off DAGs
  • Intercept map at job submission time and convert to DAG
  • Ignore input channels that are known to be exhausted
  • Alternate input priority with round robin
  • Change aggregate to read like transform does
  • Alter validation to allow for DAG workflow
  • Update documentation about map and vector workflows
  • Update documentation to remove ack thread
  • Update documentation to specify that read-batch must return a map
  • Remove status check from pipeline
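The first task in the list, converting a map workflow to the vector-of-vectors form, could be sketched as below. The shape of the old format is an assumption (a nested map whose leaves are task keywords, such as {:in {:inc :out}}), and map->dag is a hypothetical name.

```clojure
(defn map->dag
  "Flattens a nested map workflow into a vector of [from to] edges.
  Assumes leaves are task keywords and inner nodes are maps."
  [workflow]
  (vec
   (mapcat (fn [[parent children]]
             (if (map? children)
               ;; one edge per child, then recurse into the subtree
               (concat (for [child (keys children)] [parent child])
                       (map->dag children))
               [[parent children]]))
           workflow)))
```

Under that assumption, (map->dag {:in {:inc :out}}) yields [[:in :inc] [:inc :out]]. A tree cannot express several inputs feeding one task, which is the gap the vector form closes.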

Validate workflow DAG

As mentioned in #2, the workflow should be validated so that only input tasks lack incoming edges, only output tasks lack outgoing edges, and the DAG has no cycles (the dependency library will throw an exception for you when creating the graph).
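The three checks can be sketched in plain Clojure. This version does not use the dependency library mentioned above; it detects cycles with Kahn's algorithm instead. The function names and the task-types map (task name to :input, :function, or :output) are assumptions for illustration.

```clojure
(require '[clojure.set :as set])

(defn acyclic?
  "Kahn's algorithm: repeatedly drop edges leaving tasks that have no
  incoming edges. If edges remain but no such task exists, there is a cycle."
  [edges]
  (loop [edges (set edges)]
    (if (empty? edges)
      true
      (let [targets (set (map second edges))
            roots   (set/difference (set (map first edges)) targets)]
        (if (empty? roots)
          false
          (recur (set (remove #(roots (first %)) edges))))))))

(defn workflow-errors
  "Returns a vector of problems found in a vector-of-vectors workflow."
  [edges task-types]
  (let [sources (set (map first edges))
        targets (set (map second edges))
        tasks   (set/union sources targets)]
    (vec
     (concat
      (for [t (sort (set/difference tasks targets))
            :when (not= :input (task-types t))]
        (str t " has no incoming edges but is not an input task"))
      (for [t (sort (set/difference tasks sources))
            :when (not= :output (task-types t))]
        (str t " has no outgoing edges but is not an output task"))
      (when-not (acyclic? edges)
        ["Workflow contains a cycle"])))))
```

An empty result means the workflow passed all three checks.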
