transitland / distributed-mobility-feed-registry

a JSON-based data schema to catalog mobility/transit/transportation data feeds

Home Page: https://dmfr.transit.land/

gtfs gtfs-realtime gbfs mds transit mobilitydata transitland

Distributed Mobility Feed Registry (DMFR)

Introduction

This is a set of guidelines for data publishers that provide machine-readable lists of their feeds, and for data aggregation platforms that provide machine-readable lists of their feed contents to each other. This project is rooted in publishing and sharing lists of GTFS feeds for fixed-route public-transit networks. It's also applicable to real-time transit, bike-share, e-scooter, and other mobility datasets that take the form of "feeds" published at stable URLs.

Goals

  1. Publishers provide their own small registries: To provide data creators (e.g., transit agencies and data vendors) a means of posting a list of their public feeds online. The format should be lightweight (no server required to power an API). The registry should also be machine-readable, making it simple for data aggregation platforms to automatically recognize and consume newly added feeds.
  2. Aggregator platforms share their registries: To provide data aggregation platforms (e.g., Transitland, OpenMobilityData, Navitia) a means of sharing their feed registries with each other. Each platform may have a particular focus in terms of the functionality it provides on top of its feed registry. By distributing feed lists among any and all platforms, open data is shared more widely and the burden of data curation is (hopefully) reduced for each platform.
  3. Related feeds are linked: Different feed types reference each other (e.g., GTFS-realtime references a static GTFS feed, an MDS e-scooter feed references a GBFS bike-share feed). This registry format will provide a lightweight means for data publishers and aggregator platforms to identify these linkages.
  4. Put it into practice and experiment:
  • The more contributors to these guidelines, the better! Let's consider many options and discuss the pros/cons of each of the registry specifics. Let's also be pragmatic. Our goal at Transitland will be to implement this registry format both for incoming feed submissions (to complement the existing Transitland Feed Registry "add a feed" process) and for outputting lists of known feeds (the Datastore API feeds endpoint).
  • DMFR now powers the new Transitland Atlas, which is the source of truth for both Transitland v1 and Transitland v2's Feed Registry.

Related Work

This project stands on the shoulders of:

Basic Examples

Single static GTFS feed:

{
  "feeds": [
    {
      "spec": "gtfs", // enum: ["gtfs", "gtfs-rt", "gbfs", "mds"]
      "id": "XXXX", // IDs are internally unique, but not necessarily globally unique
      "urls": { // "Transitland style URL" to support nested zip archives
        "static_current": "",
        "static_historic": [""],
        "static_planned": [""]
      },
      "languages": ["en-US"], // IETF language tags, see https://tools.ietf.org/html/bcp47
      "license": { // license covering the contents of the feed
        "spdx_identifier": "", // see https://spdx.org/licenses/
        "url": "",
        "use_without_attribution": "yes", // enum: ["yes", "no", "unknown"]
        "create_derived_product": "yes", // enum: ["yes", "no", "unknown"]
        "redistribute": "yes", // enum: ["yes", "no", "unknown"]
        "attribution_text": ""
      }
    }
  ],
  "license_spdx_identifier": "CC0-1.0" // license covering the DMFR file itself; see https://spdx.org/licenses/
}

Single GTFS-realtime feed:

{
  "feeds": [
    {
      "spec": "gtfs-rt", // enum: ["gtfs", "gtfs-rt", "gbfs", "mds"]
      "id": "XXXX", // unique ID for this feed record; may be a Onestop ID or your own ID scheme
      "urls": {
        "realtime_vehicle_positions": "",
        "realtime_trip_updates": "",
        "realtime_alerts": ""
      },
      "languages": ["en-US"], // IETF language tags, see https://tools.ietf.org/html/bcp47
      "license": {
        "spdx_identifier": "", // see https://spdx.org/licenses/
        "url": "",
        "use_without_attribution": "yes", // enum: ["yes", "no", "unknown"]
        "create_derived_product": "yes", // enum: ["yes", "no", "unknown"]
        "redistribute": "yes", // enum: ["yes", "no", "unknown"]
        "attribution_text": ""
      }
    },
    {
      "spec": "gtfs", // enum: ["gtfs", "gtfs-rt", "gbfs", "mds"]
      "id": "XXXX", // unique ID for this feed record; may be a Onestop ID or your own ID scheme
      // ...
    }
  ],
  "license_spdx_identifier": "CC0-1.0" // required to meet this spec
}

Group together multiple feeds using an operator:

{
  "$schema": "https://dmfr.transit.land/json-schema/dmfr.schema-v0.3.0.json",
  "feeds": [
    {
      "spec": "gtfs",
      "id": "f-9q9-bart",
      "urls": {
        "static_current": "http://www.bart.gov/dev/schedules/google_transit.zip"
      },
      "license": {
        "url": "http://www.bart.gov/schedules/developers/developer-license-agreement",
        "use_without_attribution": "yes",
        "create_derived_product": "unknown",
        "redistribute": "yes"
      },
      "tags": {
        "gtfs_data_exchange": "airbart"
      }
    },
    {
      "spec": "gtfs-rt",
      "id": "f-bart~rt",
      "urls": {
        "realtime_alerts": "http://api.bart.gov/gtfsrt/alerts.aspx",
        "realtime_trip_updates": "http://api.bart.gov/gtfsrt/tripupdate.aspx"
      }
    }
  ],
  "license_spdx_identifier": "CDLA-Permissive-1.0",
  "operators": [
    {
      "onestop_id": "o-9q9-bart",
      "tags": {
        "us_ntd_id": "90003",
        "omd_provider_id": "bart",
        "wikidata_id": "Q610120",
        "twitter_general": "sfbart",
        "twitter_service_alerts": "SFBARTalert"
      },
      "name": "Bay Area Rapid Transit",
      "short_name": "BART",
      "associated_feeds": [
        {
          "feed_onestop_id": "f-bart~rt"
        },
        {
          "feed_onestop_id": "f-9q9-bart"
        }
      ]
    }
  ]
}
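
As an illustration of how a consumer might follow the operator grouping above, here is a minimal Python sketch. The function name `feeds_for_operator` is hypothetical, and the sketch assumes each `feed_onestop_id` in `associated_feeds` matches the `id` of a feed in the same file, as it does in the BART example.

```python
def feeds_for_operator(dmfr, operator_onestop_id):
    """Return the feed records listed under one operator's associated_feeds.

    Assumption: feed_onestop_id values refer to feed "id" values in the
    same DMFR document. References that do not resolve are skipped.
    """
    feeds_by_id = {feed["id"]: feed for feed in dmfr.get("feeds", [])}
    for operator in dmfr.get("operators", []):
        if operator.get("onestop_id") != operator_onestop_id:
            continue
        refs = operator.get("associated_feeds", [])
        return [
            feeds_by_id[ref["feed_onestop_id"]]
            for ref in refs
            if ref.get("feed_onestop_id") in feeds_by_id
        ]
    return []


# A trimmed copy of the BART example, already parsed into a dict.
dmfr = {
    "feeds": [
        {"spec": "gtfs", "id": "f-9q9-bart"},
        {"spec": "gtfs-rt", "id": "f-bart~rt"},
    ],
    "operators": [
        {
            "onestop_id": "o-9q9-bart",
            "associated_feeds": [
                {"feed_onestop_id": "f-bart~rt"},
                {"feed_onestop_id": "f-9q9-bart"},
            ],
        }
    ],
}

for feed in feeds_for_operator(dmfr, "o-9q9-bart"):
    print(feed["id"], feed["spec"])
```

Feeds come back in the order the operator lists them, not the order they appear under `feeds`.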

Fields

IDs

Feed IDs can be any strings that are unique within a given DMFR file. These feed IDs can be Onestop IDs, although that is not required by the DMFR spec. In the Transitland Atlas repository, DMFR files are required to use Onestop IDs.
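
A quick way to check the uniqueness rule on a parsed DMFR file is sketched below. The helper name `duplicate_feed_ids` is made up for this example and is not part of any DMFR tooling.

```python
from collections import Counter


def duplicate_feed_ids(dmfr):
    """Return a sorted list of feed IDs that appear more than once."""
    counts = Counter(feed["id"] for feed in dmfr.get("feeds", []))
    return sorted(feed_id for feed_id, count in counts.items() if count > 1)


dmfr = {
    "feeds": [
        {"spec": "gtfs", "id": "f-9q9-bart"},
        {"spec": "gtfs-rt", "id": "f-bart~rt"},
        {"spec": "gtfs", "id": "f-9q9-bart"},  # deliberate duplicate
    ]
}
print(duplicate_feed_ids(dmfr))
```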

Extended URLs

For static feeds contained in a zip archive, ideally the feed files are all in the root directory of the archive. However, this is not always the case.

Transitland Feed Registry supports an extended URL format that can reference files nested within a subdirectory. The extended URL format can also reference a zip file nested within another zip file.

https://github.com/septadev/GTFS/releases/download/v201810010/gtfs_public.zip#google_bus.zip
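
The following Python sketch shows one way a consumer could handle an extended URL like the one above. It assumes the part after `#` is the path of the inner zip file inside the downloaded archive; the function names are hypothetical, and only the nested-zip case is covered (a fragment pointing at a subdirectory would need separate handling).

```python
import io
import zipfile
from urllib.parse import urldefrag


def split_extended_url(url):
    """Split an extended URL into (download_url, inner_path or None)."""
    download_url, fragment = urldefrag(url)
    return download_url, (fragment or None)


def open_feed_archive(archive_bytes, inner_path=None):
    """Open the zip archive that holds the feed files.

    archive_bytes is the content already downloaded from download_url.
    If inner_path is given, it is assumed to name a zip file stored
    inside the outer archive.
    """
    outer = zipfile.ZipFile(io.BytesIO(archive_bytes))
    if inner_path is None:
        return outer
    return zipfile.ZipFile(io.BytesIO(outer.read(inner_path)))


url = (
    "https://github.com/septadev/GTFS/releases/download/"
    "v201810010/gtfs_public.zip#google_bus.zip"
)
download_url, inner_path = split_extended_url(url)
print(download_url)
print(inner_path)
```

Downloading is left out on purpose, so the sketch stays independent of any particular HTTP client.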

Optional Stanzas

License

Based on Transitland's approach to handling open data licenses in all their variety.

      "license": {
        "spdx_identifier": "", // see https://spdx.org/licenses/
        "url": "",
        "use_without_attribution": "yes", // enum: ["yes", "no", "unknown"]
        "create_derived_product": "yes", // enum: ["yes", "no", "unknown"]
        "redistribute": "yes", // enum: ["yes", "no", "unknown"]
        "attribution_text": ""
      }
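
As a rough sketch, a consumer could check the three permission fields against the `yes`/`no`/`unknown` enum like this. The helper name is hypothetical, and treating absent fields as acceptable is an assumption based on the examples in this document that omit them.

```python
PERMISSION_FIELDS = (
    "use_without_attribution",
    "create_derived_product",
    "redistribute",
)
ALLOWED_VALUES = {"yes", "no", "unknown"}


def license_problems(license_stanza):
    """List permission fields whose value is outside the enum.

    Assumption: a missing field is fine; only present values are checked.
    """
    problems = []
    for field in PERMISSION_FIELDS:
        value = license_stanza.get(field)
        if value is not None and value not in ALLOWED_VALUES:
            problems.append(f"{field}: unexpected value {value!r}")
    return problems


print(license_problems({"redistribute": "yes", "create_derived_product": "maybe"}))
```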

Authentication

Requiring authentication for public data feeds is typically not a good idea. However, it's reasonable to require an API key for GTFS-realtime endpoints and other feeds that involve active queries.

    "authorization": {
      "type": "", // enum: ["header", "basic_auth", "query_param"]
      "param_name": "",
      "info_url": ""
    }
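
The sketch below shows how a client might turn an `authorization` stanza into request details, using only the Python standard library. The reading of `param_name` is an assumption (header name for `header`, query parameter name for `query_param`), as are the function name and the `secret`/`username` arguments; credentials themselves are not stored in the DMFR file.

```python
import base64
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def apply_authorization(url, authorization, secret, username=""):
    """Return (url, headers) with the credential applied.

    Assumptions: param_name names the header or query parameter that
    carries the secret; basic_auth uses username plus secret as password.
    """
    headers = {}
    auth_type = authorization.get("type")
    param_name = authorization.get("param_name", "")
    if auth_type == "header":
        headers[param_name] = secret
    elif auth_type == "query_param":
        parts = urlsplit(url)
        query = parse_qsl(parts.query, keep_blank_values=True)
        query.append((param_name, secret))
        url = urlunsplit(parts._replace(query=urlencode(query)))
    elif auth_type == "basic_auth":
        token = base64.b64encode(f"{username}:{secret}".encode()).decode()
        headers["Authorization"] = f"Basic {token}"
    else:
        raise ValueError(f"unsupported authorization type: {auth_type!r}")
    return url, headers


# example.com and the key value are placeholders.
print(apply_authorization(
    "https://example.com/feed?format=pb",
    {"type": "query_param", "param_name": "api_key"},
    "SECRET",
))
```

Types outside the enum raise an error here, so a stanza such as `"type": "ip_address"` would have to be handled separately.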

Tags

Tags allow extra information to be added to feeds and operators. Keys and values must both be strings.

  "operators": [
    {
      "onestop_id": "o-9q9-bart",
      "tags": {
        "us_ntd_id": "90003",
        "omd_provider_id": "bart",
        "wikidata_id": "Q610120",
        "twitter_general": "sfbart",
        "twitter_service_alerts": "SFBARTalert"
      }
    }
  ]
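
The string-only rule for tags is easy to check mechanically. This is a minimal sketch with a made-up helper name, not part of any official validator.

```python
def non_string_tags(tags):
    """Return the tag entries whose key or value is not a string."""
    return {
        key: value
        for key, value in tags.items()
        if not isinstance(key, str) or not isinstance(value, str)
    }


tags = {
    "us_ntd_id": "90003",
    "wikidata_id": "Q610120",
    "fleet_size": 669,  # hypothetical tag; invalid because the value is a number
}
print(non_string_tags(tags))
```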

distributed-mobility-feed-registry's Issues

feed auth can require registering an IP address with a transit agency

For example:

    {
      "id": "f-foothilltransit~rt",
      "spec": "gtfs-rt",
      "urls": {
        "realtime_vehicle_positions": "https://foothilltransit.rideralerts.com/myStop/GTFS-Realtime.ashx?Type=VehiclePosition",
        "realtime_trip_updates": "https://foothilltransit.rideralerts.com/myStop/GTFS-Realtime.ashx?Type=TripUpdate",
        "realtime_alerts": "https://foothilltransit.rideralerts.com/myStop/GTFS-Realtime.ashx?Type=Alert"
      },
      "license": {
        "url": "http://foothilltransit.org/about/developer-resources/developer-license-agreement/"
      },
      "authorization": {
        "type": "ip_address",
        "info_url": "http://foothilltransit.org/about/developer-resources/"
      }
    }

how to specify unzipped GTFS feeds?

How do I mark up GTFS feeds not stored in one .zip archive?

Serving the individual .csv/.txt files via HTTP has great benefits, including following web semantics & efficiency (caching, downloading only necessary files).

I currently run vbb-gtfs.jannisr.de as a "special treatment" for my need to work with Berlin data, but I'd like to a) follow best practices to make the "endpoint" as reusable as possible, and b) make it as portable as possible by using semantic markup such as DMFR/DCAT-AP/DataPackage.

evaluate DCAT (Data Catalog Vocabulary)

@LeoFrachet has suggested an interesting comparison: the DCAT (Data Catalog Vocabulary) spec, which is used by open data catalogs: https://www.w3.org/TR/vocab-dcat/

Rather than reinvent a new format to describe mobility-specific feeds, why not use this more general purpose format to describe datasets that are connected to each other? It's an interesting question and worth considering.

A particular follow-up question is how DCAT can be used in JSON: ckan/ckanext-dcat#146
