Comments (17)

barbeau commented on May 14, 2024

After talking with someone who has more experience managing .protos long-term, I suggest that we deprecate expired experimental fields using [deprecated=true] - so the field would look like:

EXPERIMENTAL_FIELD_THAT_NOBODY_WANTED = 1 [deprecated=true];

Here's why:

  1. It would allow any existing consumers or producers that have implemented support to leave their code as-is. If we used RESERVED instead, consumers or producers would need to remove the implemented code, as the field would no longer be available the next time the stub code was generated by the protocol buffer compiler. With [deprecated=true], the stub code may flag the field as deprecated in code annotations, but it would still be usable.
  2. It allows easier discovery of past experimental fields when someone wants to implement something similar (again) in the future.

The above would allow us to easily "un-deprecate" a field as well. For example, let's say that the 2-year window expired for an experimental field, and several consumers implemented support but no one is publishing the data yet. The field would get marked as [deprecated=true] in the .proto, but early-adopting consumers could leave code that supports the feature in their product. Then, if a publisher started publishing the data, we could re-adopt the field easily, again without the early-adopting consumers needing to change anything. And then, having the requisite producer and consumer, we could vote to officially adopt the field.

The primary downside to using [deprecated=true] instead of RESERVED is that the .proto would be more cluttered over time if old expired experimental fields pile up. But I think the above benefits outweigh the negatives.
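
To make the trade-off concrete, here is a minimal sketch of what a deprecated experimental field could look like in a proto2 file (the syntax version gtfs-realtime.proto uses). The package name, the message name `ExampleEntity`, the field types, and the field number 2 are hypothetical and chosen only for illustration; the `[deprecated = true]` option is the mechanism under discussion.

```protobuf
syntax = "proto2";

package example;  // hypothetical package, for illustration only

message ExampleEntity {
  optional string id = 1;

  // Expired experimental field, kept in the schema but flagged as deprecated.
  // Generated code may carry a deprecation annotation, yet existing readers
  // and writers of field number 2 continue to compile and work.
  optional string experimental_field_that_nobody_wanted = 2 [deprecated = true];
}
```

Under this sketch, "un-deprecating" the field later would amount to deleting the `[deprecated = true]` option; the field number and wire format stay the same.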

Thoughts?

from transit.

gcamp commented on May 14, 2024

@barbeau I removed it because I wanted to clean up the timeline of your PR, but that didn't work and I couldn't revert it. That's why it's not showing up. 😬

I would make the time window for experimental fields 1 year. Hopefully most fields should take less time than this. Nothing else significant to add. A strong majority vote still makes sense to me.

barbeau commented on May 14, 2024

Currently, the GTFS-realtime change process doesn't require producers and consumers for adoption of proposed fields as experimental OR production fields.

@gcamp As part of these changes, what would you think of introducing a requirement for an actual producer and consumer before an experimental field could be officially adopted as a production field? This would align GTFS-realtime with GTFS.

gcamp commented on May 14, 2024

@barbeau I would agree in principle, but in practice we've had big problems integrating producers into the discussion on GitHub. I've had that with 2 producers who wanted changes to the spec but didn't follow up on GitHub or put energy into it (#87 and #111).

Maybe that's a symptom of a bigger problem with the producers we are working with, so I would agree.

barbeau commented on May 14, 2024

@gcamp Ok, thanks. One other question here - after the expiration window passes, and assuming an experimental field doesn't get adopted, how exactly does it get deprecated?

So, given a field like:

EXPERIMENTAL_FIELD_THAT_NOBODY_WANTED = 1

...it looks like our options are, in increasing order of severity:

  1. EXPERIMENTAL_FIELD_THAT_NOBODY_WANTED = 1 [deprecated=true];
    • Protobuf docs say “code may be annotated as @Deprecated by pb compilers” - I interpret this as just being a hint to the tools, but in theory the field could still be used.
  2. RESERVED 1;
    • Protobuf docs say “The protocol buffer compiler will complain if any future users try to use these field identifiers.” Is this a compiler build failure? How exactly does it "complain"?

Any thoughts based on experience with protobufs are welcome!
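
For comparison, here is a minimal sketch of option 2 in proto2 syntax. Note that the keyword is lowercase `reserved` in a .proto file. The package name, the message name `ExampleEntity`, the field types, and the field number 2 are hypothetical and used only for illustration. Reusing a reserved number or name is expected to surface as a compile-time error from protoc rather than a warning, though that is worth confirming with the protoc version in use.

```protobuf
syntax = "proto2";

package example;  // hypothetical package, for illustration only

message ExampleEntity {
  optional string id = 1;

  // The expired experimental field is removed; its number and name are
  // reserved so they cannot be reassigned with a different meaning.
  reserved 2;
  reserved "experimental_field_that_nobody_wanted";

  // Uncommenting the next line would reuse a reserved number, which the
  // compiler is documented to complain about:
  // optional string some_new_field = 2;
}
```

With this approach the field disappears from newly generated stub code, so any producer or consumer code that still references it would need to be removed or updated at the next regeneration.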

gcamp commented on May 14, 2024

I would prefer option 1 because, even though some fields might turn out not to be necessary, it would be nice to have the option to go back and actually use EXPERIMENTAL_FIELD_THAT_NOBODY_WANTED.

abyrd commented on May 14, 2024

I agree with @barbeau that we should require at least one producer and one consumer for any new feature before it is formally adopted. This is what keeps the spec compact and focused on the concrete domain of passenger information, and prevents ideas from rushing ahead of implementations.

I can imagine, as @gcamp said, that it's difficult to get producers to engage in discussion on GitHub. But I don't see how this affects the requirement. Requiring real-world implementation before formally modifying the spec seems to be a separate requirement from including the producer in the design discussion.

I am also in favor of leaving fields in the experimental phase for a significant period of time until we can judge whether they are being adopted by multiple parties.

I have no preference on the experimental field deprecation mechanism. It seems to me that both methods would allow the field to be re-activated in a future version of the spec in the event that the abandoned field is revived a few years later.

jakluk commented on May 14, 2024

Sadly I can't provide any specific suggestions on how to deal with the process, but from my viewpoint the main problem with producers is that they're mostly busy with keeping their live feeds running. I can believe that in most agencies the feeds are maintained by one or two people, whose main responsibilities are completely different. In my case it's solely my free time activity - I was interested in eventually implementing #87, but it was only last week when we were finally ready to publish our data to the world since communication with the internal bus tracking system is not the easiest one.

It means that it's quite demanding to implement experimental changes into the feeds, keep track of changes during the approval process, and distribute the extension proto files to the consumers. Another problem is finding consumers who are actually willing to work with the protobuf format; for many developers interested only in data from one agency, it's much easier to process plain JSON, which they're used to. Extensions or experimental fields just bring complications, and not only for them.

So basically I guess that our only consumers of protobuf format would be Google Maps and eventually a few more services with worldwide coverage, who can also take quite some time to implement the changes.

abyrd commented on May 14, 2024

Thanks for your comments @jakluk. I do see what you and @gcamp mean about the difficulty of getting producers to contribute to a discussion on the specification or modify their feeds.

However, I don't really understand how lack of time or resources on the producer side constitutes an argument for or against the requirement to have at least one working producer and consumer before changes to the spec are finalized. It seems only tangentially related to me.

If someone wants the spec to forge ahead quickly with ways to represent all sorts of situations, ahead of people actually implementing them, then they might find that the one-producer-one-consumer requirement slows down progress. But this is precisely the goal of the requirement - to prevent the spec from evolving at a faster rate than existing practice. It is common for people and organizations to generate ideas much faster than they allocate resources to actually understand and test them. The world already has plenty of exhaustive, voluminous data models that no one uses. GTFS benefits from avoiding that path.

gcamp commented on May 14, 2024

@abyrd My experience so far with producers has been that they are willing to implement something only once it's accepted in the spec, even if it's something that they are asking for. Basically their time is so scarce (as @jakluk said) that they don't want to spend it implementing something that might be rejected later on.

On top of that, the complexity of maintaining two .proto files simultaneously is quite high, especially for consumers but also for producers. You'll need two proto files: one to create a compliant GTFS-rt feed and one to create an experimental GTFS-rt feed.

abyrd commented on May 14, 2024

I imagine that's probably the case for most producers - they can't pay external contractors or vendors to build things that may be thrown away. It's completely legitimate for them to wait. But there are a few large or forward thinking agencies with their own internal technical capacity, who often write their own GTFS export or processing code. These agencies already add nonstandard fields to their feeds and push for changes to the spec, and GTFS was basically created and popularized by such agencies.

We won't have that much trouble getting a useful extension added to a feed from TriMet, the Netherlands, or maybe NYC or Boston (among other regions). It might take six months or even longer, but prototyping, testing, and revising a new proposal has a cost (in both effort and money), and until someone decides to take on that cost I think it's prudent to consider proposals as provisional.

Ideally the reference implementations that justify inclusion in the spec should be open source, which would cut the cost of follow-up implementation by other parties.

paulswartz commented on May 14, 2024

Here @mbta, we have some internal extensions, but we only publish them in JSON formats. While the space savings for protobuf files are substantial, there's added complexity to make sure that all clients who want a new field are using the custom .proto definition. It's also hard to make changes to an existing field in a protobuf, even an existing experimental field.

I don't have any good ideas here, unfortunately. Maybe a requirement to have a working producer and a consumer but not necessarily as a part of a protobuf? We're much more likely to experiment with a field in our JSON data than as something to add to our PB files.

barbeau commented on May 14, 2024

@paulswartz Just to clarify, the current GTFS-realtime change process encourages producers/consumers to add new fields to the main .proto instead of implementing custom extensions, as long as the field seems generally useful, because of implementation issues. Just wanted to make sure that was clear for anyone following the discussion.

barbeau commented on May 14, 2024

A proposal for defining the GTFS-realtime experimental field consensus process, based on this discussion, is at #126.

We can tackle the limited window on experimental fields separately.

abyrd commented on May 14, 2024

@paulswartz wrote: "Here @mbta, we have some internal extensions, but we only publish them in JSON formats. While the space savings for protobuf files are substantial, there's added complexity to make sure that all clients who want a new field are using the custom .proto definition. It's also hard to make changes to an existing field in a protobuf, even an existing experimental field."

This is an interesting point. Protocol buffers are an interesting technology for their compactness, the ability to generate bindings in multiple languages, etc. But if they're getting in the way of people adding fields and evolving the spec then maybe they're not the best choice. General purpose compression like gzip is remarkably efficient on text containing lots of repetitive keys and punctuation like JSON. There is a significant additional cost of expanding the data into a giant string and parsing that string, but I'm not sure how often GTFS-RT needs to be consumed directly by low-powered embedded systems.

@paulswartz also wrote: "I don't have any good ideas here, unfortunately. Maybe a requirement to have a working producer and a consumer but not necessarily as a part of a protobuf? We're much more likely to experiment with a field in our JSON data than as something to add to our PB files."

GTFS-RT does not currently mention or allow replacing Protobuf with alternative serialization formats, so realtime data in a non-protobuf format is not GTFS-RT. It is exactly this extra effort of implementation and adoption within Protobuf-based GTFS-RT, which is known to be more complicated, that we would be awaiting before the specification is expanded to include a new field.

If implementation is so prohibitive that no one is willing to do it, then either:
a) New features are not important or useful enough for anyone to invest in creating and maintaining them, so should not exist; or
b) The Protobuf system is too inflexible for an evolving specification so GTFS-RT should be changed to use some more flexible text-based format (as GTFS-static does); or
c) We need to reassure producers by pre-approving new fields, stating that they will definitely become part of the spec as soon as they are implemented.

The third option might be useful: rather than voting to add an experimental field and voting again to add it to the spec once it's implemented, we could vote only once to include the experimental field in the spec automatically on some future date, conditional on it being implemented by a specific producer and consumer who have volunteered to do so. It would be up to those producers and consumers to coordinate among themselves to get the job done by the deadline, at which point the experimental field would automatically be part of the spec. This lessens the risk for the implementers.

barbeau commented on May 14, 2024

A proposal for the producer/consumer requirement and a limited time window for experimental fields has been opened at #140.

barbeau commented on May 14, 2024

#140 is now approved and merged, and #126 was previously approved and merged, so this should cover the topics discussed in this issue. Thanks all!
