delhivery / moirai

Path prediction for Delhivery Transportation Network

CMake 2.00% C++ 97.69% Python 0.31%

moirai's Issues

Support for Ad-Hoc routes

Is your feature request related to a problem? Please describe.
Some routes in the system are generated on a one-time basis. Additionally, only objects at such a route's source are expected to be routed via this ad-hoc route.

Atropos should add support so that it can consume one time routes and generate new routing solutions for applicable objects.

Describe the solution you'd like
Expose a separate endpoint to consume ad-hoc routes. For each ad-hoc route, identify the list of objects available at the route's source.
Copy the current graph and add the ad-hoc route to the copy. For each identified object, request routing against the forked graph. Store this data against its own namespace so as not to muddle standardized routing.
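
A minimal sketch of the fork-and-route idea, assuming a simple adjacency-list graph with a scalar cost per route. The `Graph` class, `fork_with_route`, and `shortest_cost` are hypothetical names for illustration and are not taken from the moirai codebase; Dijkstra's algorithm stands in for whatever solver atropos actually uses.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <limits>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical edge type: a route to `target` with a scalar transit cost.
struct Edge {
    std::size_t target;
    double cost;
};

// Hypothetical minimal graph; nodes are identified by index.
class Graph {
public:
    explicit Graph(std::size_t node_count) : adjacency_(node_count) {}

    void add_route(std::size_t source, std::size_t target, double cost) {
        assert(source < adjacency_.size() && target < adjacency_.size());
        adjacency_[source].push_back({target, cost});
    }

    // Returns an independent copy with the ad-hoc route added.
    // The standard graph (*this) is left untouched.
    Graph fork_with_route(std::size_t source, std::size_t target, double cost) const {
        Graph copy = *this;
        copy.add_route(source, target, cost);
        return copy;
    }

    // Dijkstra: cheapest cost from source to target, infinity if unreachable.
    double shortest_cost(std::size_t source, std::size_t target) const {
        const double inf = std::numeric_limits<double>::infinity();
        std::vector<double> best(adjacency_.size(), inf);
        typedef std::pair<double, std::size_t> Item;
        std::priority_queue<Item, std::vector<Item>, std::greater<Item> > frontier;
        best[source] = 0.0;
        frontier.push(Item(0.0, source));
        while (!frontier.empty()) {
            const Item top = frontier.top();
            frontier.pop();
            const double cost = top.first;
            const std::size_t node = top.second;
            if (cost > best[node]) continue;  // stale queue entry
            if (node == target) return cost;
            for (const Edge& edge : adjacency_[node]) {
                const double candidate = cost + edge.cost;
                if (candidate < best[edge.target]) {
                    best[edge.target] = candidate;
                    frontier.push(Item(candidate, edge.target));
                }
            }
        }
        return best[target];
    }

private:
    std::vector<std::vector<Edge> > adjacency_;
};
```

Routing the identified objects against the forked copy keeps the standard graph unchanged, so regular routing results are unaffected. Copying the whole graph per ad-hoc route is the simplest option; for a large graph an overlay of extra edges may be cheaper.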

Graph isn't updated on changes to routing or node configuration

Describe the bug
While atropos listens for changes to route/node configurations, these changes aren't propagated to the underlying graph, leading to stale or incorrect outcomes.

To Reproduce
Steps to reproduce the behavior:

Start atropos
Add a new route to the system
Request route for an object which is expected to use this new route
The generated route doesn't use this route at all.

Expected behavior
The generated route should use the new route in its solution.

Processing time as a node configuration

Is your feature request related to a problem? Please describe.
Currently, processing time for nodes is stored in a flat file. Use a third-party API to build interfaces for configuring processing time at nodes, so that the node stream can provide this information directly and there is a single, manageable source of truth for all nodes.

Describe the solution you'd like
Use the third-party API to store inbound, outbound, and cutoff times for each node. Build embeddable interfaces against these so that the node service can overlay them directly in its own UI.

Date based indices for elasticsearch

Is your feature request related to a problem? Please describe.
Currently, data is pushed to elasticsearch in a single index. This leads to constantly increasing data size on the cluster. A date-based rolling index would allow for timely deletion of old data.

Describe the solution you'd like
Moirai should postfix the requested index with its creation date, thus creating rolling indices based on creation date. Additionally, it should implement an index lifecycle policy, if one does not already exist, to drop indices older than current_day - 2.
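
The naming and retention rule could be sketched as follows. This is an illustration only: the `Date` struct, the function names, and the `base-YYYY.MM.DD` suffix format are assumptions, and the expiry check merely mirrors the "older than current_day - 2" rule in application code, whereas the issue proposes enforcing it through an Elasticsearch lifecycle policy.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical calendar date (proleptic Gregorian).
struct Date {
    int year;
    int month;  // 1..12
    int day;    // 1..31
};

// Builds a rolling index name such as "paths-2024.03.01".
// The "-YYYY.MM.DD" suffix format is an assumption, not moirai's convention.
std::string dated_index(const std::string& base, const Date& created) {
    assert(created.month >= 1 && created.month <= 12);
    assert(created.day >= 1 && created.day <= 31);
    char suffix[16];
    std::snprintf(suffix, sizeof suffix, "%04d.%02d.%02d",
                  created.year, created.month, created.day);
    return base + "-" + suffix;
}

// Days since 1970-01-01, using the well-known civil-date arithmetic.
long days_from_civil(const Date& date) {
    const int y = date.year - (date.month <= 2 ? 1 : 0);
    const long era = (y >= 0 ? y : y - 399) / 400;
    const long year_of_era = y - era * 400;
    const long shifted_month = date.month > 2 ? date.month - 3 : date.month + 9;
    const long day_of_year = (153 * shifted_month + 2) / 5 + date.day - 1;
    const long day_of_era =
        year_of_era * 365 + year_of_era / 4 - year_of_era / 100 + day_of_year;
    return era * 146097 + day_of_era - 719468;
}

// True when the index was created before (today - 2 days).
bool should_drop(const Date& index_created, const Date& today) {
    return days_from_civil(today) - days_from_civil(index_created) > 2;
}
```

With this rule, an index created on 2024-02-27 becomes droppable on 2024-03-01, while one created on 2024-02-28 is kept for one more day.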

Describe alternatives you've considered
An alternative would be to delete records from elasticsearch as part of clotho, following the lifecycle of bags/shipments. However, this approach is inherently more expensive on the ES cluster.

Parallelism V1: Support for vertical scaling via threaded solver

Is your feature request related to a problem? Please describe.
Currently, the solver works concurrently for the payload in the queue. This adds thread initialization and deinitialization costs, which increase linearly with the number of tasks/threads to spawn.

We should instead move to a solution where the number of threads is tied to hardware concurrency and each thread solves a batch of requests sequentially.

Describe the solution you'd like

Get the number of available logical cores (N) via std::thread::hardware_concurrency
Spawn 2N + 1 threads: one reserved for the load consumer, one for the writer, and the remaining threads for solver instances.
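
A rough sketch of the solver side of this plan, assuming the 2N + 1 budget leaves 2N - 1 threads for solvers once the consumer and writer threads are set aside. `Request`, `Path`, `solver_thread_budget`, and `solve_in_batches` are hypothetical names; the consumer and writer threads are omitted.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical request/result types standing in for moirai's own.
struct Request { int id; };
struct Path { int cost; };

// Solver threads left from a 2N + 1 budget after reserving one thread
// each for the load consumer and the writer. hardware_concurrency() may
// return 0 when the value is not computable, so fall back to one core.
std::size_t solver_thread_budget(unsigned cores) {
    const std::size_t n = cores == 0 ? 1 : cores;
    return 2 * n + 1 - 2;
}

// Splits the requests into contiguous batches, one per thread. Each thread
// solves its batch sequentially, so thread startup cost is paid once per
// batch rather than once per request.
std::vector<Path> solve_in_batches(
    const std::vector<Request>& requests,
    const std::function<Path(const Request&)>& solve,
    std::size_t worker_count) {
    assert(worker_count > 0);
    std::vector<Path> results(requests.size());
    const std::size_t batch = (requests.size() + worker_count - 1) / worker_count;
    std::vector<std::thread> workers;
    for (std::size_t w = 0; w < worker_count; ++w) {
        const std::size_t begin = std::min(w * batch, requests.size());
        const std::size_t end = std::min(begin + batch, requests.size());
        if (begin == end) break;  // fewer requests than workers
        workers.emplace_back([&, begin, end] {
            // Each thread writes only to its own slice of `results`.
            for (std::size_t i = begin; i < end; ++i) {
                results[i] = solve(requests[i]);
            }
        });
    }
    for (std::thread& worker : workers) worker.join();
    return results;
}
```

A caller would pass `solver_thread_budget(std::thread::hardware_concurrency())` as `worker_count`. A long-lived pool fed from a queue would avoid respawning threads for every payload; this sketch only shows the batching itself.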

Parallelism V1

Is your feature request related to a problem? Please describe.
This is a meta issue to track progress of parallelism in moirai. The current approach does not scale vertically or horizontally.

Describe the solution you'd like

Support for vertical scaling via threaded solver
Support for horizontal scaling via synchronized kafka consumers

Parallelism V2: Support for horizontal scaling via synchronized kafka consumers

Is your feature request related to a problem? Please describe.
Currently, the kafka listener listens across partitions for loads. We should add the ability to spawn multiple processes, each managing its own offsets and synchronization mechanism so that they do not overlap with each other. Additionally, each consumer should attempt to pick a separate partition first before competing with another solver process to read messages on the same partition.

Describe the solution you'd like

Kafka consumer should be partition aware and report this information back to kafka server
Before attaching a consumer, check for unassigned partitions. If available, use it as the partition to listen to.
For messages that are consumed, only publish acknowledgement once a path is generated
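
The selection and acknowledgement rules above can be illustrated without any Kafka client. In the sketch below, `pick_partition` and `handle_message` are hypothetical helpers, the per-partition consumer counts are assumed to be known to the caller, and falling back to the least-loaded partition when none is free is an assumption the issue does not specify.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// consumers_per_partition[p] is the number of consumers already attached
// to partition p. Returns the first unassigned partition if one exists;
// otherwise the first least-loaded partition (assumed fallback).
std::size_t pick_partition(const std::vector<int>& consumers_per_partition) {
    assert(!consumers_per_partition.empty());
    const std::vector<int>::const_iterator least = std::min_element(
        consumers_per_partition.begin(), consumers_per_partition.end());
    return static_cast<std::size_t>(least - consumers_per_partition.begin());
}

// Acknowledges a message only after a path has been generated for it.
// `generate_path` and `acknowledge` are placeholders for the solver call
// and the offset commit of whichever Kafka client is in use.
bool handle_message(const std::string& payload,
                    const std::function<bool(const std::string&)>& generate_path,
                    const std::function<void()>& acknowledge) {
    if (!generate_path(payload)) {
        return false;  // left unacknowledged so it can be read again
    }
    acknowledge();
    return true;
}
```

Deferring the acknowledgement until a path exists gives at-least-once processing: a solver that crashes mid-request leaves the message uncommitted, at the cost of occasionally solving the same request twice.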

Routing requests should be auto-generated for stale information

Describe the bug
Currently, routing requests are only consumed for objects that generate an explicit scan. For objects that were recommended on a route but weren't routed as per the system, the proposed route becomes stale since no scan is generated on idle items.

To Reproduce
Steps to reproduce the behavior:

Request a route for an object
For the proposed route, depart the route without adding the object to it
The route for the object stays unchanged despite its infeasibility (proposed route is stale)

Expected behavior
When the proposed route departs without the corresponding object on it, an automatic request for a new route should be generated, either by matching against the departed route information or by using a scheduler that regenerates routes automatically at the time of proposed departure.

False positives for invalid routes

Describe the bug
Some routes in the system are detected as invalid (non-existent source/target), even though the sources/targets are valid and active in the system.

To Reproduce
Steps to reproduce the behavior:

Go to '...'
Click on '....'
Scroll down to '....'
See error

Expected behavior
A clear and concise description of what you expected to happen.

[Kafka] Multi-sink support for outputs

Is your feature request related to a problem? Please describe.
Currently, Atropos pushes its output directly to the Elasticsearch cluster. While this is good enough to start consuming data, it leads to older paths being overwritten by newer ones for the same object on subsequent routing requests.

Support for multiple sinks would allow us to route this output to Kafka, where it can be consumed by the Data Warehouse or by other services that want to create their own local replica.

Describe the solution you'd like
Writer should support the factory pattern to write to as many sinks as the user requests. This would allow us to reuse the same writer for Elasticsearch, Kafka, and any additional sinks if required (such as HTTP or Redis PUBSUB).
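
One possible shape for such a writer, as a sketch under assumptions: `Sink`, `MemorySink`, `SinkFactory`, and `Writer` are hypothetical names, and `MemorySink` is an in-memory stand-in where real Elasticsearch or Kafka clients would go.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sink interface: anything that can accept a serialized path.
class Sink {
public:
    virtual ~Sink() = default;
    virtual void write(const std::string& record) = 0;
};

// In-memory stand-in for a real sink (Elasticsearch, Kafka, HTTP, ...).
class MemorySink : public Sink {
public:
    void write(const std::string& record) override { records.push_back(record); }
    std::vector<std::string> records;
};

// Maps a sink name from user configuration to a function that builds it.
class SinkFactory {
public:
    typedef std::function<std::shared_ptr<Sink>()> Creator;

    void register_sink(const std::string& name, Creator creator) {
        assert(static_cast<bool>(creator));
        creators_[name] = std::move(creator);
    }

    std::shared_ptr<Sink> create(const std::string& name) const {
        const std::map<std::string, Creator>::const_iterator found =
            creators_.find(name);
        if (found == creators_.end()) {
            throw std::invalid_argument("unknown sink: " + name);
        }
        return found->second();
    }

private:
    std::map<std::string, Creator> creators_;
};

// Fans each record out to every sink the user requested.
class Writer {
public:
    Writer(const SinkFactory& factory, const std::vector<std::string>& sink_names) {
        for (const std::string& name : sink_names) {
            sinks_.push_back(factory.create(name));
        }
    }

    void write(const std::string& record) {
        for (const std::shared_ptr<Sink>& sink : sinks_) sink->write(record);
    }

    std::size_t sink_count() const { return sinks_.size(); }

private:
    std::vector<std::shared_ptr<Sink> > sinks_;
};
```

Adding a new sink type then means registering one more creator; the writer itself does not change. Error handling for a sink that fails mid-write (retry, skip, or abort) is left open here.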
