datasphere-oss / datasphere

DataSphere is the first open-source cloud-native data observability platform that helps you trace the whole data infrastructure in your warehouses, lakes and databases.

Java 100.00%
datasphere daas data-governance data-analytics data-management data-lake warehouse cloud-native datamesh data-observability

datasphere's Introduction

DataSphere

DataSphere is an open-source cloud-native data observability platform that helps you build digital infrastructure in your warehouses, lakes and databases.

The Cloud Native DaaS platform for Everyone



What is DataSphere?

DataSphere aims to be an open-source, cloud-native DaaS platform that helps you build digital infrastructure in your warehouses, lakes and databases.

Getting Started

Roadmap

DataSphere is currently in development and is not ready for production use. See Roadmap 2021.

License

DataSphere is licensed under Apache 2.0.

datasphere's People

Contributors

theseusyang

datasphere's Issues

Roadmap


Here's what's coming in the next few days, weeks, months, and years!

Coming within a few days

Check out our Roadmap for Core on GitHub. You'll see the features we're currently working on or about to start. You can also share your insights by adding your own issues and voting for specific features / integrations.

Coming within a few weeks / months

We understand that we're not "production-ready" for a lot of companies yet. After all, we only got started in May 2019, so we're at the beginning of the journey. Here are the main features we plan to release in the next few months:

Landing in December 2021 or so:

  • Release of DataSphere Portal Service GA.
  • Release of DataSphere Catalogue Service beta.
  • Release of DataSphere Migration Service beta.
  • Support of most popular databases as both sources and destinations.
  • Support of data lakes, including Snowflake, HashData, Greenplum, GaussDB, ClickHouse, DataFuse, and Hadoop.

Coming a bit later, by July 2022:

  • Release of DataSphere MDM Service beta.
  • Release of DataSphere Management Service beta.

Our goal is to become "production-ready" for any company, whatever its data stack, infrastructure, architecture, data volume, and connector needs.

Coming within a few quarters / years

We also wanted to share with you how we think about the high-level roadmap over the next few months and years. We foresee several high-level phases that we will try to share here.

1. Parity on data consolidation (ELT) in warehouses / databases

Our first focus is to support batch-type ELT integrations. We feel that we can provide value right away as soon as we support one of the integrations you need. Batch integrations are also easier to build and sustain. So we would rather start with that.

Before we move on to the next phase, we want to make sure we are supporting all the major integrations and that we are in a state where we can address the long tail, with the help of the community.

We also want to fully integrate with the open-source ecosystem, including Airflow, dbt, Kubernetes, Great Expectations, etc., so teams can build the full data infrastructure they need.
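As a rough illustration of the batch ELT pattern this phase refers to, the sketch below loads a batch of raw rows unchanged and only transforms them afterwards. It is not DataSphere code: the class name `BatchEltSketch`, its methods, and the sample rows are all hypothetical, and an in-memory list stands in for a warehouse table.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

// Hypothetical illustration only; none of these names come from DataSphere.
class BatchEltSketch {

    /** Extract: stand-in for reading one batch of raw rows from a source system. */
    static List<String> extract() {
        return List.of(" Alice,42 ", "bob,17", " Carol,8");
    }

    /** Load: copy the raw rows into the "warehouse" unchanged (an in-memory list here). */
    static void load(List<String> batch, List<String> rawTable) {
        rawTable.addAll(batch);
    }

    /** Transform: runs after loading, on data that is already in the warehouse. */
    static List<String> transform(List<String> rawTable) {
        return rawTable.stream()
                .map(row -> row.trim().toLowerCase(Locale.ROOT))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> rawTable = new ArrayList<>();
        load(extract(), rawTable);
        System.out.println(transform(rawTable));
    }
}
```

Keeping the raw rows untouched in the load step is what distinguishes ELT from ETL in this sketch: the transformation can be rerun or changed later without extracting from the source again.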

2. Reverse-ETL from warehouses / databases

Some integrations we have in mind are batch distribution integrations, from warehouses to third-party tools. For instance, your marketing team might want to send data back to your ad platforms to better optimize campaigns. Another use case could be syncing the consolidated data back to your CRM.

It’s not yet clear in our minds when to prioritize those additional integrations. We will have a better idea once we see the feedback we get from the community we build with data consolidation.

3. Parity on data catalogue in warehouses / databases

We will consolidate all data types into data lakes and build a data catalogue to help you search and analyse your data resources.

4. Cross-cloud data assets management to realize distributed data-business synergy

We will build a data consistency infrastructure to support data migration and synchronization into and out of the cloud.

5. Expand on all data engineering features

This is when we will start differentiating ourselves in terms of feature coverage with current cloud-based incumbents. Being open-sourced enables us to go faster, but also deeper.

Elasticsearch is not started!

None of the configured nodes are available: [{#transport#-1}{0wyZwS2gRXOPgg0818DxKQ}{localhost}{127.0.0.1:9300}]
at org.elasticsearch.client.transport.TransportClientNodesService.ensureNodesAreAvailable(TransportClientNodesService.java:352)
at org.elasticsearch.client.transport.TransportClientNodesService.execute(TransportClientNodesService.java:248)
at org.elasticsearch.client.transport.TransportProxyClient.execute(TransportProxyClient.java:57)
at org.elasticsearch.client.transport.TransportClient.doExecute(TransportClient.java:394)
at org.elasticsearch.client.support.AbstractClient.execute(AbstractClient.java:396)
at org.elasticsearch.client.support.AbstractClient.execute(AbstractClient.java:385)
at org.elasticsearch.client.support.AbstractClient$IndicesAdmin.execute(AbstractClient.java:1225)
at org.elasticsearch.client.support.AbstractClient$IndicesAdmin.exists(AbstractClient.java:1241)
at org.datasphere.mdm.search.service.impl.MappingComponentImpl.indexExists(MappingComponentImpl.java:535)
at org.datasphere.mdm.search.service.impl.MappingComponentImpl.process(MappingComponentImpl.java:182)
at org.datasphere.mdm.search.service.impl.SearchServiceImpl.process(SearchServiceImpl.java:601)
at org.datasphere.mdm.core.service.impl.IndexAuditStorageService.prepare(IndexAuditStorageService.java:104)
at java.base/java.util.ArrayList.forEach(ArrayList.java:1540)
at org.datasphere.mdm.core.module.CoreModule.start(CoreModule.java:206)
at org.datasphere.mdm.system.service.impl.ModuleServiceImpl.loadModule(ModuleServiceImpl.java:308)
at org.datasphere.mdm.system.service.impl.ModuleServiceImpl.loadModuleDependencies(ModuleServiceImpl.java:383)
at org.datasphere.mdm.system.service.impl.ModuleServiceImpl.loadModule(ModuleServiceImpl.java:257)
at org.datasphere.mdm.system.service.impl.ModuleServiceImpl.lambda$loadModules$3(ModuleServiceImpl.java:223)
at java.base/java.util.HashMap$Values.forEach(HashMap.java:976)
at org.datasphere.mdm.system.service.impl.ModuleServiceImpl.loadModules(ModuleServiceImpl.java:223)
at org.datasphere.mdm.system.service.impl.ModuleServiceImpl.loadModules(ModuleServiceImpl.java:158)
at org.datasphere.mdm.system.service.impl.PlatformStarterImpl.onApplicationEvent(PlatformStarterImpl.java:72)
at org.datasphere.mdm.system.service.impl.PlatformStarterImpl.onApplicationEvent(PlatformStarterImpl.java:47)
at org.springframework.context.event.SimpleApplicationEventMulticaster.doInvokeListener(SimpleApplicationEventMulticaster.java:172)
at org.springframework.context.event.SimpleApplicationEventMulticaster.invokeListener(SimpleApplicationEventMulticaster.java:165)
at org.springframework.context.event.SimpleApplicationEventMulticaster.multicastEvent(SimpleApplicationEventMulticaster.java:139)
at org.springframework.context.support.AbstractApplicationContext.publishEvent(AbstractApplicationContext.java:404)
at org.springframework.context.support.AbstractApplicationContext.publishEvent(AbstractApplicationContext.java:361)
at org.springframework.context.support.AbstractApplicationContext.finishRefresh(AbstractApplicationContext.java:898)
at org.springframework.context.support.AbstractApplicationContext.refresh(AbstractApplicationContext.java:554)
at org.springframework.web.context.ContextLoader.configureAndRefreshWebApplicationContext(ContextLoader.java:401)
at org.springframework.web.context.ContextLoader.initWebApplicationContext(ContextLoader.java:292)
at org.springframework.web.context.ContextLoaderListener.contextInitialized(ContextLoaderListener.java:103)
at org.apache.catalina.core.StandardContext.listenerStart(StandardContext.java:4714)
at org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5177)
at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:183)
at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:717)
at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:690)
at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:706)
at org.apache.catalina.startup.HostConfig.deployDirectory(HostConfig.java:1184)
at org.apache.catalina.startup.HostConfig$DeployDirectory.run(HostConfig.java:1925)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at org.apache.tomcat.util.threads.InlineExecutorService.execute(InlineExecutorService.java:75)
at java.base/java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:118)
at org.apache.catalina.startup.HostConfig.deployDirectories(HostConfig.java:1094)
at org.apache.catalina.startup.HostConfig.deployApps(HostConfig.java:476)
at org.apache.catalina.startup.HostConfig.start(HostConfig.java:1611)
at org.apache.catalina.startup.HostConfig.lifecycleEvent(HostConfig.java:319)
at org.apache.catalina.util.LifecycleBase.fireLifecycleEvent(LifecycleBase.java:123)
at org.apache.catalina.util.LifecycleBase.setStateInternal(LifecycleBase.java:423)
at org.apache.catalina.util.LifecycleBase.setState(LifecycleBase.java:366)
at org.apache.catalina.core.ContainerBase.startInternal(ContainerBase.java:936)
at org.apache.catalina.core.StandardHost.startInternal(StandardHost.java:843)
at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:183)
at org.apache.catalina.core.ContainerBase$StartChild.call(ContainerBase.java:1384)
at org.apache.catalina.core.ContainerBase$StartChild.call(ContainerBase.java:1374)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at org.apache.tomcat.util.threads.InlineExecutorService.execute(InlineExecutorService.java:75)
at java.base/java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:140)
at org.apache.catalina.core.ContainerBase.startInternal(ContainerBase.java:909)
at org.apache.catalina.core.StandardEngine.startInternal(StandardEngine.java:262)
at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:183)
at org.apache.catalina.core.StandardService.startInternal(StandardService.java:434)
at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:183)
at org.apache.catalina.core.StandardServer.startInternal(StandardServer.java:930)
at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:183)
at org.apache.catalina.startup.Catalina.start(Catalina.java:772)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:342)
at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:473)
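
The trace above shows the Elasticsearch transport client finding no reachable node at 127.0.0.1:9300 while the core module starts. A quick way to confirm whether anything is listening on that port before deploying is a plain TCP check. The sketch below is a hypothetical helper, not part of DataSphere: the class name `EsPortCheck` and its method are assumptions for illustration. An open port shows only that something accepts connections there, not that Elasticsearch is healthy.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not part of DataSphere: checks whether a TCP port accepts connections.
class EsPortCheck {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused, timed out, or host unresolved: treat as not reachable.
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults mirror the address in the trace above; override them via arguments.
        String host = args.length > 0 ? args[0] : "127.0.0.1";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 9300;
        boolean up = isReachable(host, port, 2000);
        System.out.println(host + ":" + port + (up ? " is accepting connections" : " is not reachable"));
    }
}
```

If the check reports that the port is not reachable, starting Elasticsearch (or correcting the configured host and port) before starting the application server is the likely next step.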
