Comments (7)
In Europe, the NAP (National Access Point) model has solved this issue in a handful of countries so far. The political effort that went into clarifying the source of truth in each of these is commendable and a huge achievement. It might be helpful to provide guidance on how that has been achieved in places like Austria, the Netherlands, and Norway.
Meanwhile, many NAPs, including Germany's, are hosting numerous overlapping datasets. Other NAPs are offline entirely or lack transit data altogether. In addition to what @evansiroky shared, this all seems to indicate we shouldn't depend solely on centralized management being achieved universally, whether from a regulatory or a resourcing standpoint.
from transit.
I don’t know about calling this a best practice, as there are factors that may not make this the recommended course of action, and I’m skeptical that advocating for the use of an agency’s official URL would do much to guarantee more stability. Whether for organizational or funding reasons, sometimes an agency’s URL does not match their name (e.g., “ECCOG” vs “Outback Express”); agencies rebrand and change their name or URL, procure new websites, or merge with others. An agency may choose not to publish GTFS with their agency domain for a variety of reasons. Trillium publishes feeds at data.trilliumtransit.com, oregon-gtfs.com, etc., many for small agencies or cities that don’t have the capability to publish data at their own domain or whose website content management system might pose barriers (e.g., an unavoidable automation that changes the suffix every time a new file is uploaded). Establishing the use of agency domains as a best practice seems a bit restrictive given the breadth of circumstances that might steer an agency toward a different approach. I can understand advocating for as much agency control as possible over how and where their data is published, but that doesn’t really seem to be the topic of this discussion.
Looking at the user story…
…avoid having to update our database of which URL to download a transit agency’s feeds from,
So that I can consistently download each transit agency’s most up-to-date data even if they change their internal GTFS publishing process
The direct factor in avoiding having to constantly update a database’s fetch URLs is simply that those URLs don’t change, regardless of what the domain might be. But this is already a best practice; those agencies (and vendors) with constantly changing URLs are just not following it. Apart from these cases, though, URLs still change for a variety of legitimate reasons. So is there an alternative to mitigate this pain point other than by creating an additional best practice? Perhaps this is where the establishment of something like https://database.mobilitydata.org/ as a single source of truth could come into play…?
I would also be interested to hear from other consumers on this pain point.
Does not make sense to me. Since the agency does not have to be the initiator of the GTFS publication in the first place. What you want is called a "National Access Point", where data providers are required to register their datasets with the available metadata. It works well in Europe and makes sense in the rest of the world.
+1 This will be very useful!
Since the agency does not have to be the initiator of the GTFS publication in the first place.
That's why it would be a best practice, not a requirement. I agree that this would be useful not only for URL stability issues but also for many of the issues I've heard about with making sure you are using the "official" schedule that the agency wants you to use, since quite a few agencies have more than one floating around.
I would like to avoid having to update our database of which URL to download a transit agency's feeds from,
So that I can consistently download each transit agency's most up-to-date data even if they change their internal GTFS publishing process.
My reasoning is that your user-story should be resolved in a better way, not by scraping agency websites.
Hello @skinkie. Thanks for your feedback. In this case, my organization (the State of California) maintains something similar to what you describe as a national access point. We maintain our own list of GTFS datasets, which we publish here: https://data.ca.gov/dataset/cal-itp-gtfs-ingest-pipeline-dataset/resource/e4ca5bd4-e9ce-40aa-a58a-3a6d78b042bd
We manually maintain those URLs as best we can because this is our only option at this time. We frequently end up with outdated data because transit agencies are not required to notify us when they update their data. While there is now a mandate in our country for most transit agencies to provide their URLs, it covers only GTFS Schedule data and not realtime. Agencies also report this at most once a year to a federal agency. Furthermore, we are uncertain whether we will have access to these URLs from the federal agency.
At this time, the creation of a mandate to have URLs reported is outside of our control. And even if there were a mandate, there may be transit agencies that forget to provide their most up-to-date URL when they change vendors. Or they may only report it once a year as required, creating a potentially large gap between when they change their URLs and when they report them. On top of that, there may be an additional gap between when the data is reported and when the agency receiving the reports makes the URLs available to other organizations such as ours.
Given all of this, I still recommend creating this best practice to aid feed aggregators and entities producing a national access point, as well as direct data consumers.