google / data-transfer-project

The Data Transfer Project makes it easy for people to transfer their data between online service providers. We are establishing a common framework, including data models and protocols, to enable direct transfer of data both into and out of participating online service providers.

Home Page: http://datatransferproject.dev

License: Apache License 2.0

Shell 1.05% TypeScript 1.22% JavaScript 0.08% HTML 0.26% Java 97.30% SCSS 0.09%
data transfer portability data-portability

data-transfer-project's Introduction

Data Transfer Project

Overview

The Data Transfer Project makes it easy for people to transfer their data between online services. We provide a common framework and ecosystem to accept contributions from service providers to enable seamless transfer of data into and out of their service.

Who We Are

Data Transfer Project (DTP) is a collaboration of organizations committed to building a common framework with open-source code that can connect any two online service providers, enabling a seamless, direct transfer of data.

We want all individuals across the web to be in control of their data.

Current State

Data Transfer Project is in its early stages, and we are actively looking for partner organizations and individuals to contribute to the project. We are continuing to define the architecture and implementation. Since the code is in active development, please do thorough testing and verification before implementing.

Contact Info

Please contact [email protected] with any questions or comments.

About Us

The Data Transfer Project was formed in 2017 to create an open-source, service-to-service portability platform so that all individuals across the web could easily move their data between online service providers.

The partners in the Data Transfer Project believe portability and interoperability are central to innovation. Making it easier for individuals to choose among services facilitates competition, empowers individuals to try new services and enables them to choose the offering that best suits their needs.

We anticipate the Data Transfer Project solution will make a particularly big impact in global markets where downloading or uploading data is expensive and/or slow. The Data Transfer Project eliminates the need to download data at all. Instead, data is transferred directly between service providers.

DTP is early-stage open source code that is built and maintained entirely by DTP community members.

data-transfer-project's Issues

Build out unit tests

Right now we have one unit test, and it is pretty terrible. We need to expand our test coverage.

Limit OAuth scope requests to specific verticals

Right now we ask for all OAuth scopes that might be needed for a particular service.

Instead, we should ask for just the scopes needed.

E.g., if just exporting photos from Google, we shouldn't ask for mail permissions or any kind of write permission.
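
A minimal sketch of how per-vertical scope selection could look. All type names (Vertical, Direction, ScopeSelector) and the scope strings are hypothetical placeholders, not the project's actual classes or any provider's real scope identifiers.

```java
import java.util.Collections;
import java.util.EnumMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical data verticals; the real project may name these differently. */
enum Vertical { PHOTOS, MAIL }

/** Whether data is being read from (EXPORT) or written to (IMPORT) a service. */
enum Direction { EXPORT, IMPORT }

final class ScopeSelector {
  // Placeholder scope strings; a real service would list its own identifiers here.
  private static final Map<Vertical, String> READ_SCOPES = new EnumMap<>(Vertical.class);
  private static final Map<Vertical, String> WRITE_SCOPES = new EnumMap<>(Vertical.class);

  static {
    READ_SCOPES.put(Vertical.PHOTOS, "photos.readonly");
    READ_SCOPES.put(Vertical.MAIL, "mail.readonly");
    WRITE_SCOPES.put(Vertical.PHOTOS, "photos.write");
    WRITE_SCOPES.put(Vertical.MAIL, "mail.write");
  }

  private ScopeSelector() {}

  /** Returns only the scope needed for one vertical in one direction. */
  static Set<String> scopesFor(Vertical vertical, Direction direction) {
    Map<Vertical, String> table = direction == Direction.EXPORT ? READ_SCOPES : WRITE_SCOPES;
    String scope = table.get(vertical);
    if (scope == null) {
      throw new IllegalArgumentException("No scope registered for " + vertical);
    }
    return Collections.singleton(scope);
  }
}
```

With this shape, a photo export would request only the read scope for photos, and neither mail scopes nor write scopes would appear in the authorization request.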

Bug Bounty Program

Figure out how to either set up our own bug bounty program, or hook into existing MS/Google bug programs, and advertise that fact in our docs.

Travis Angular test is flaky

ng test --single-run --no-progress --browser=ChromeNoSandbox --environment local
24 01 2018 22:52:08.810:INFO [karma]: Karma v1.4.1 server started at http://0.0.0.0:9876/
24 01 2018 22:52:08.813:INFO [launcher]: Launching browser ChromeNoSandbox with unlimited concurrency
24 01 2018 22:52:08.822:INFO [launcher]: Starting browser Chrome
24 01 2018 22:52:51.057:INFO [Chromium 62.0.3202 (Ubuntu 0.0.0)]: Connected on socket onhuHgqvEOP4J9oqAAAA with id 56551463
24 01 2018 22:53:01.059:WARN [Chromium 62.0.3202 (Ubuntu 0.0.0)]: Disconnected (1 times), because no message in 10000 ms.
Chromium 62.0.3202 (Ubuntu 0.0.0) ERROR
Disconnected, because no message in 10000 ms.

Configure some type of linter on PRs

Options:

  1. Use the Gradle Checkstyle plugin with a copy of the Google Checkstyle configuration

  2. Use linthub - I haven't looked too much at this, since it requires giving read/write permissions to the GitHub repo.

  3. Rely on the Google Java linter as an extension in IntelliJ: https://github.com/google/google-java-format

  4. Others?

I think a combination of 1 and 3 is the way to go. This way we can add it either as a step in our command-line build script or as a step in our Travis configuration. Quick gist of the setup: https://gist.github.com/seehamrun/5fa0d2d0f00a8ce9930f28a6042a0da1

Note that there are currently lots of errors when I run this. I haven't tried the extension yet, so I'm not sure whether it will suggest fixes. The command-line output of the Gradle plugin is really hard to read, which makes it difficult to pin down what needs fixing; my guess is that the extension will be better (I'm going to try it out and will update).

Modularize web server

Right now we are using the built-in Java web server, which probably isn't ideal.

We should make the web server implementation modular so that:
a) we can easily replace it with a more scalable one
b) other implementations can easily be swapped in.
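
One possible shape for this, as a sketch only: callers depend on a small interface, and the JDK's built-in com.sun.net.httpserver.HttpServer becomes just one implementation of it. The WebServer and JdkWebServer names and the /health endpoint are hypothetical and not taken from the codebase.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

/** Hypothetical abstraction; the rest of the code would depend only on this. */
interface WebServer {
  void start() throws IOException;
  void stop();
  int port();
}

/** One implementation, backed by the JDK's built-in HTTP server. */
final class JdkWebServer implements WebServer {
  private final int requestedPort;
  private HttpServer server;

  /** Pass 0 to let the operating system pick a free port. */
  JdkWebServer(int requestedPort) {
    this.requestedPort = requestedPort;
  }

  @Override
  public void start() throws IOException {
    server = HttpServer.create(new InetSocketAddress(requestedPort), 0);
    // Illustrative endpoint only; real handlers would be registered by the caller.
    server.createContext("/health", exchange -> {
      byte[] body = "ok".getBytes(StandardCharsets.UTF_8);
      exchange.sendResponseHeaders(200, body.length);
      try (OutputStream out = exchange.getResponseBody()) {
        out.write(body);
      }
    });
    server.start();
  }

  @Override
  public void stop() {
    if (server != null) {
      server.stop(0);
    }
  }

  @Override
  public int port() {
    return server.getAddress().getPort();
  }
}
```

A more scalable server could then be added as a second WebServer implementation without touching the calling code.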

IntelliJ reports error with duplicate AutoValue generated classes

Run ./gradlew clean build
Run IntelliJ - build/rebuild project

Error:(9, 8) java: duplicate class: org.dataportabilityproject.spi.cloud.types.AutoValue_LegacyPortabilityJob
Error:(9, 8) java: duplicate class: org.dataportabilityproject.spi.cloud.types.AutoValue_JobAuthorization
Error:(9, 8) java: duplicate class: org.dataportabilityproject.spi.cloud.types.AutoValue_PortabilityJob

The workaround is to run ./gradlew clean (not build) before an IntelliJ build or tests.

GCP: Add Stackdriver setup to GCP/GKE setup script for #41

We will provide a generic Java interface for metrics writing for #41. For GCP, we will hook this up to write to Stackdriver.

Stackdriver includes logging and monitoring agents for VMs. These monitor some metrics, such as CPU utilization, for free.

We should script the creation of a Stackdriver account (or point to an existing one) in our GCP project setup script.

We should also add a VM startup script:
https://cloud.google.com/compute/docs/startupscript

and include Stackdriver agents in the VM start up script:

To install the Stackdriver monitoring agent:

$ curl -sSO https://repo.stackdriver.com/stack-install.sh
$ sudo bash stack-install.sh --write-gcm

To install the Stackdriver logging agent:

$ curl -sSO https://dl.google.com/cloudagents/install-logging-agent.sh
$ sudo bash install-logging-agent.sh

Add metrics

Different developers/hosts may want different libraries/platforms for metrics.

We should probably make this an interface and include a GCP implementation to start.
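
As a rough sketch of what such an interface might look like, with an in-memory implementation standing in for a real backend. The MetricRecorder and InMemoryMetricRecorder names are hypothetical. A GCP implementation would forward the same calls to Stackdriver and is not shown here.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical metrics interface that each host or cloud could implement. */
interface MetricRecorder {
  void increment(String name);

  void add(String name, long delta);
}

/** Simple in-memory implementation, useful for tests and local runs. */
final class InMemoryMetricRecorder implements MetricRecorder {
  private final Map<String, LongAdder> counters = new ConcurrentHashMap<>();

  @Override
  public void increment(String name) {
    add(name, 1);
  }

  @Override
  public void add(String name, long delta) {
    // computeIfAbsent creates the counter on first use in a thread-safe way.
    counters.computeIfAbsent(name, key -> new LongAdder()).add(delta);
  }

  /** Returns the current value, or 0 if the metric was never recorded. */
  long value(String name) {
    LongAdder counter = counters.get(name);
    return counter == null ? 0L : counter.sum();
  }
}
```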

Introduce Gradle build and module structure

This will involve quite a bit of work that will be done in steps via pull requests.

The first step will be to create a parallel project structure using Gradle that we can discuss and iterate on. Initially, this structure will contain the principal modules as well as a limited number of interfaces and types.

As the structure stabilizes and people are happy with it, we will make the Gradle modules available through the Maven build so that the new interfaces can be implemented throughout the codebase.

When the existing code has been migrated we can do a cut-over.

Tracker for frontend client changes

General catch-all tracker for any front end client changes. Any front end changes require a new static content deployment to cloud (build script under /config/gcp/build_and_deploy_static_content.sh) as well as a new binary build and rollout.

Modularize Logging

Right now we have LogUtils.

We should change that to an interface and put implementations in the cloud providers.
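
A minimal sketch of the interface idea, assuming nothing about what LogUtils currently contains. TransferLogger and StreamLogger are hypothetical names. A cloud provider module would supply its own implementation in place of the stream-based one shown here.

```java
import java.io.PrintStream;

/** Hypothetical logging interface that cloud provider modules could implement. */
interface TransferLogger {
  void info(String message);

  void error(String message, Throwable cause);
}

/** Default implementation that writes to any PrintStream, such as System.out. */
final class StreamLogger implements TransferLogger {
  private final PrintStream out;

  StreamLogger(PrintStream out) {
    this.out = out;
  }

  @Override
  public void info(String message) {
    out.println("INFO " + message);
  }

  @Override
  public void error(String message, Throwable cause) {
    out.println("ERROR " + message + ": " + cause);
  }
}
```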

Race condition: 2 workers poll the same job

API:
2018-01-26 20:49:36,196 [http-server-16] INFO org.dataportabilityproject.job.PortabilityJobFactory - Creating new PortabilityJob, id: ed03f0ba-0b8c-4d55-bce0-70c9c747f35e

Worker 1:

2018-01-26 20:50:12,244 [JobPollingService RUNNING] DEBUG org.dataportabilityproject.worker.JobPollingService - Polled pollForUnassignedJob, found id: ed03f0ba-0b8c-4d55-bce0-70c9c747f35e
2018-01-26 20:50:12,943 [JobPollingService RUNNING] DEBUG org.dataportabilityproject.worker.JobPollingService - Completed updateJobStateToAssignedWithoutAuthData, publicKey: 162

2018-01-26 20:50:33,051 [JobPollingService RUNNING] DEBUG org.dataportabilityproject.worker.JobPollingService - Polled lookupAssignedWithAuthDataJob, found auth data, id: ed03f0ba-0b8c-4d55-bce0-70c9c747f35e
2018-01-26 20:50:33,056 [main] DEBUG org.dataportabilityproject.worker.WorkerImpl - Begin processing jobId: ed03f0ba-0b8c-4d55-bce0-70c9c747f35e
Jan 26, 2018 8:50:33 PM com.google.common.util.concurrent.UncaughtExceptionHandlers$Exiter uncaughtException
SEVERE: Caught an exception in Thread[main,5,main]. Shutting down.
java.lang.RuntimeException: Failed to invoke public org.dataportabilityproject.shared.auth.AuthData() with no args
at com.google.gson.internal.ConstructorConstructor$3.construct(ConstructorConstructor.java:107)
at com.google.gson.internal.bind.ReflectiveTypeAdapterFactory$Adapter.read(ReflectiveTypeAdapterFactory.java:162)
at com.google.gson.Gson.fromJson(Gson.java:803)
at com.google.gson.Gson.fromJson(Gson.java:768)
at com.google.gson.Gson.fromJson(Gson.java:717)
at com.google.gson.Gson.fromJson(Gson.java:689)
at org.dataportabilityproject.worker.WorkerImpl.deSerialize(WorkerImpl.java:108)
at org.dataportabilityproject.worker.WorkerImpl.processJob(WorkerImpl.java:92)
at org.dataportabilityproject.worker.WorkerImpl.processJob(WorkerImpl.java:69)
at org.dataportabilityproject.worker.WorkerMain.main(WorkerMain.java:31)
Caused by: java.lang.InstantiationException
at sun.reflect.InstantiationExceptionConstructorAccessorImpl.newInstance(InstantiationExceptionConstructorAccessorImpl.java:48)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at com.google.gson.internal.ConstructorConstructor$3.construct(ConstructorConstructor.java:104)

Worker 2:

2018-01-26 20:50:12,382 [JobPollingService RUNNING] DEBUG org.dataportabilityproject.worker.JobPollingService - Polled pollForUnassignedJob, found id: ed03f0ba-0b8c-4d55-bce0-70c9c747f35e
Jan 26, 2018 8:50:12 PM com.google.common.util.concurrent.UncaughtExceptionHandlers$Exiter uncaughtException
SEVERE: Caught an exception in Thread[main,5,main]. Shutting down.
java.lang.IllegalStateException: Expected the service JobPollingService [FAILED] to be TERMINATED, but the service has FAILED
at com.google.common.util.concurrent.AbstractService.checkCurrentState(AbstractService.java:328)
at com.google.common.util.concurrent.AbstractService.awaitTerminated(AbstractService.java:293)
at com.google.common.util.concurrent.AbstractScheduledService.awaitTerminated(AbstractScheduledService.java:434)
at org.dataportabilityproject.worker.WorkerImpl.pollForJob(WorkerImpl.java:75)
at org.dataportabilityproject.worker.WorkerImpl.processJob(WorkerImpl.java:60)
at org.dataportabilityproject.worker.WorkerMain.main(WorkerMain.java:31)
Caused by: java.lang.IllegalArgumentException: Job not found
at com.google.common.base.Preconditions.checkArgument(Preconditions.java:122)
at org.dataportabilityproject.job.JobDao.updateJobState(JobDao.java:224)
at org.dataportabilityproject.job.JobDao.updateJobStateToAssignedWithoutAuthData(JobDao.java:173)
at org.dataportabilityproject.worker.JobPollingService.pollForUnassignedJob(JobPollingService.java:83)
at org.dataportabilityproject.worker.JobPollingService.runOneIteration(JobPollingService.java:57)
at com.google.common.util.concurrent.AbstractScheduledService$ServiceDelegate$Task.run(AbstractScheduledService.java:188)
at com.google.common.util.concurrent.Callables$4.run(Callables.java:122)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
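
In the logs above, both workers find the same unassigned job within about 140 ms of each other (20:50:12,244 and 20:50:12,382), and Worker 2 then fails in updateJobState with "Job not found". One way to rule out this kind of double pick-up is to make claiming a job a single atomic compare-and-set, so that exactly one worker succeeds and the other simply polls again. The sketch below illustrates the idea with an in-memory map. InMemoryJobStore, JobState, and tryClaim are hypothetical names, not the project's JobDao API, and a persistent store would need its own equivalent, such as a transaction or conditional update.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical job states, loosely modeled on the names in the logs. */
enum JobState { UNASSIGNED, ASSIGNED_WITHOUT_AUTH_DATA }

/** In-memory stand-in for a job store; for illustration only. */
final class InMemoryJobStore {
  private final Map<String, JobState> jobs = new ConcurrentHashMap<>();

  void create(String jobId) {
    jobs.put(jobId, JobState.UNASSIGNED);
  }

  /**
   * Atomically moves a job from UNASSIGNED to ASSIGNED_WITHOUT_AUTH_DATA.
   * Returns true for exactly one caller; every other caller gets false
   * instead of an exception and can go back to polling.
   */
  boolean tryClaim(String jobId) {
    // replace(key, expected, updated) only succeeds if the current value equals expected.
    return jobs.replace(jobId, JobState.UNASSIGNED, JobState.ASSIGNED_WITHOUT_AUTH_DATA);
  }

  JobState stateOf(String jobId) {
    return jobs.get(jobId);
  }
}
```

A worker would call tryClaim after polling and treat a false result as "someone else got there first" rather than as a fatal error.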

Checked or Unchecked exceptions

The issue of checked vs. unchecked exceptions has been raised on Slack. Based on the outcome of that discussion, changes will need to be made to harmonize the codebase.
