datafibers-community / df_data_service
DataFibers Data Service
Home Page: http://www.datafibers.com
License: Apache License 2.0
Flink SQL should be able to transform Avro data directly
Need to add this feature to the UI and test the new FLINKA2J transformation
From @datafibers on October 3, 2016 18:44
Able to launch Flink UDF Jar from DF
Copied from original issue: datafibers/df_data_service#8
readthedocs.org seems to offer an official way to host documentation.
We may consider releasing our final docs in this format
From @datafibers on October 11, 2016 1:42
Unit test cases as part of the code base
Travis file update for CI
Copied from original issue: datafibers/df_data_service#11
Integrate Kafka REST API for topics
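The Kafka REST API integration could start as a thin HTTP client for topic listing. A minimal sketch, assuming a Confluent-style REST Proxy that exposes GET /topics; the host/port, class, and method names are illustrative, not DF code:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Sketch: list Kafka topics through an assumed REST Proxy deployment.
public class KafkaRestTopics {
    // Build the request URI for the proxy's topic-listing endpoint.
    static URI topicsUri(String host, int port) {
        return URI.create("http://" + host + ":" + port + "/topics");
    }

    // Fetch the JSON array of topic names from the proxy.
    public static String listTopics(String host, int port) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(topicsUri(host, port))
                .header("Accept", "application/vnd.kafka.v2+json")
                .GET()
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }
}
```

The topicsUri helper is kept separate so the endpoint construction can be unit-tested without a live proxy.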
From @datafibers on October 12, 2016 13:57
CLI needs to support running a single DF application, meaning both REST and UI are served on the same host and port under different URLs
after #21
Copied from original issue: datafibers/df_data_service#22
Need to work on the prototype of DF API framework for UDF
One value-added feature is to enforce permissions on sending data to and consuming data from Kafka topics. Since we use MongoDB, we can enable it there. However, the current design commits to Mongo once the Kafka forward is complete, so we need to review this.
In addition, user login and permissions need to be discussed here as well
From @datafibers on October 11, 2016 1:51
DF needs a Kafka topic view for better management
Copied from original issue: datafibers/df_data_service#19
We need to automatically add all JARs in the folder when starting the DF environment
There are the following issues for the schema view
From @datafibers on October 11, 2016 1:42
New integration test case needed for demo/developer environment
Copied from original issue: datafibers/df_data_service#12
Create Avro Table Sink for Flink
From @datafibers on October 11, 2016 1:45
DF needs a shortcut visualization approach, e.g. through MongoDB and/or HBase with Zeppelin
Copied from original issue: datafibers/df_data_service#14
Add the documentation tasks for the monthly release
Guess 1: inputTopic_stage is reused in the Flink transformation. When the SELECT statement is updated, new data is written to the same topic, so a new consumer faces JSON data with different schemas and fails.
A generic file sink is needed for sinking Avro data
RethinkDB vs. MongoDB
This will be a sub task for #32
Use a Docker image to build the DF developer environment
Use Docker Compose to build the DF developer service
From @datafibers on September 22, 2016 0:33
After version 1.2.0, Apache Flink will offer a REST API for monitoring purposes. DF Processor can then leverage this new service to synchronize active job status, as we did for Kafka Connect.
We can also fetch the job log information into the repo.
Copied from original issue: datafibers/df_data_service#2
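The synchronization idea above can be sketched as a small poller against Flink's monitoring endpoint. This is a hedged sketch, not DF code: the /jobs/overview path follows Flink's monitoring REST API, while the class name, method names, and DF status strings are assumptions for illustration.

```java
import java.net.URI;

// Sketch: synchronize Flink job states into the DF repository, mirroring
// the existing Kafka Connect status sync. DF status strings are assumed.
public class FlinkStatusSync {
    // Flink's monitoring REST endpoint for the job overview (assumed deployment).
    static URI jobsOverviewUri(String host, int port) {
        return URI.create("http://" + host + ":" + port + "/jobs/overview");
    }

    // Map a Flink-reported job state onto a DF repository status.
    static String toDfStatus(String flinkState) {
        switch (flinkState) {
            case "RUNNING":  return "RUNNING";
            case "FINISHED": return "FINISHED";
            case "CANCELED":
            case "FAILED":   return "FAILED";
            default:         return "UNASSIGNED";
        }
    }
}
```

Keeping the URI builder and the state mapping as pure functions lets both be unit-tested without a running Flink cluster.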
From @datafibers on October 12, 2016 13:47
Delete the Jar/UDF uploaded with a transform when the transform is deleted. The default is to delete.
Copied from original issue: datafibers/df_data_service#20
We need to look at ways of getting the following things
From @datafibers on October 11, 2016 1:50
Need to show schemas in the UI
Need to edit & update schemas in the UI
Need to create new schemas in the UI
Should deletion be supported as well?
When we use the function below to map different default values to the same field entity, all entities are sent; we see both connectorConfig and connectorConfig_2 in the JSON body
myApp.config(function(RestangularProvider) {
    // Duplicate connectorConfig into connectorConfig_2 for 'posts' elements;
    // both fields then appear in the serialized JSON body.
    RestangularProvider.addElementTransformer('posts', function(element) {
        element.connectorConfig_2 = element.connectorConfig;
        return element;
    });
});
Currently, the jobId is captured from the console output in a separate thread. Exceptions are sometimes observed on the console output stream. The proposed change may come from the following solutions.
From @datafibers on October 5, 2016 12:49
Support Flink savepoints (manually triggered checkpoints) in Flink transforms.
Copied from original issue: datafibers/df_data_service#9
We need a way to track the history of data ingestion
We need to create environment setup scripts with the following features
Use a scheduler to trigger the DF client by scanning the schedule settings in Mongo
Reference
https://github.com/Coreoz/Wisp
https://github.com/diabolicallabs/vertx-cron
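The scan-and-trigger loop above hinges on one pure check: is a schedule entry due? A minimal sketch, assuming each Mongo schedule document carries a last-run timestamp and an interval (these field names are assumptions, not DF's actual schema):

```java
import java.time.Duration;
import java.time.Instant;

// Sketch of the scheduler's core decision: fire the DF client when a
// schedule entry scanned from MongoDB is due to run again.
public class ScheduleScanner {
    static boolean isDue(Instant lastRun, Duration interval, Instant now) {
        // Due when at least one full interval has elapsed since the last run.
        return !now.isBefore(lastRun.plus(interval));
    }
}
```

A Wisp or vertx-cron wrapper would then call isDue() on each scanned document and launch the DF client for the entries that are due.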
We need specific connectors, with their configurations, instead of a generic connector sink or source.
From @datafibers on September 28, 2016 19:38
Need to support dealing with Avro data when Kafka Connect enables
key.converter.schemas.enable=true
value.converter.schemas.enable=true
Copied from original issue: datafibers/df_data_service#5
We are looking for a better UI framework if possible.
The options that come to mind are as follows.
From @datafibers on October 11, 2016 1:48
Evaluate Gradle and move to it from maven
Copied from original issue: datafibers/df_data_service#16
From @datafibers on October 11, 2016 1:47
Copied from original issue: datafibers/df_data_service#15
We need a view mapping schemas to topics so that we know which topic uses which schema.
This can be added to the topic management view
#24
We also need to identify the list of data attributes, physical or logical/business, to keep in MongoDB / the Schema Registry
Currently, DF uses tasks. We need a job view in the repository
We can add a new command option, -a (admin tool), to support calling admin tools, such as
From @datafibers on October 1, 2016 12:35
"topic.for.query" only supports a single topic; make it support a list of topics.
Can we also support a list of topics for "topic.for.result"?
Copied from original issue: datafibers/df_data_service#7
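Supporting a topic list could be as simple as normalizing the property value before use. A sketch, assuming a comma-separated convention for "topic.for.query" (the delimiter choice and helper name are assumptions):

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Sketch: accept either a single topic or a comma-separated list in
// "topic.for.query" and normalize it into a trimmed, non-empty list.
public class TopicListParser {
    static List<String> parseTopics(String raw) {
        return Arrays.stream(raw.split(","))
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .collect(Collectors.toList());
    }
}
```

The same normalization would apply unchanged to "topic.for.result" if a list is allowed there too.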
We need to stop/pause/resume/cancel a job either ad hoc or on a schedule
This should be done in jobConfig
Create Avro Table Source for Flink
HelpFunc.errorMsg() is used for reporting errors. We need to polish its usage and document the error standard as follows in the DF complete guide:
error_id, error_class_name, error_method_name, error_details
From @datafibers on October 13, 2016 14:28
From @datafibers on August 10, 2016 1:20
The agent hangs without receiving responses from the server. Once the agent is restarted, it works again, so it is not an issue on the server side.
Copied from original issue: datafibers/df_demo#5
Copied from original issue: datafibers-community/df_demo#1
It is the same as #6, but focuses on the DF service level, not UI changes
We need to research the possibility of supporting data processing in Parquet format, since it is more efficient and supports schema evolution
From @datafibers on October 12, 2016 13:48
We need a wiser way to read command-line parameters, e.g.:
jar -cluster s
Copied from original issue: datafibers/df_data_service#21
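One wiser approach is to read flags by name rather than by position. A minimal stdlib sketch, assuming "-name value" pairs; the option names here are illustrative, not DF's actual option set:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: collect "-name value" pairs into a map so arguments like
// "-cluster s" no longer depend on their position on the command line.
public class CliOptions {
    static Map<String, String> parse(String[] args) {
        Map<String, String> opts = new HashMap<>();
        for (int i = 0; i < args.length; i++) {
            if (args[i].startsWith("-") && i + 1 < args.length) {
                opts.put(args[i].substring(1), args[++i]);
            }
        }
        return opts;
    }
}
```

A library such as Apache Commons CLI would add help text and validation on top of the same idea.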
Right now, connect is input by the user; however, it must be unique. In addition, we need an additional attribute that can be referenced in connect for meta-information delivery.
Since we integrate topic and schema (subject), we need to try