Coder Social home page Coder Social logo

Comments (5)

Pike avatar Pike commented on May 21, 2024 3

I've looked at the implementation in #159, and I like it.

I like that we can relax the acceptance of the parser without reducing the readability of the spec.

I also agree that we should continue to serialize with indention.

I'd also stick to serializing with trailing-} indent for now. Mostly because that allows us to not immediately push all implementations to production ;-) I'm happy to have a separate conversation about that in a quarter or so, though.

from fluent.

zbraniecki avatar zbraniecki commented on May 21, 2024

I like it.

There's one thing I'm wondering about, both from the parser, and from the human eye perspective and it taps to the long standing debate about how the reader spots new message while scanning a file.

One opinion is that it happens when the reader sees a line starting with an identifier followed by = mark. Another is that it happens when the reader sees a line starting with an identifier.

Let's take a (slightly messy) example. Try to quickly count how many Messages are in the following sample:

this-is-my-key = This is my value
  .this-attribute1 = This is an attribute
this-is-another-key =
  This is another value for
  this-is-my-key and friends
this-is-a-third-key =
  This is a multiline message for
  which always has = after the message id
this-is-a-fourth-key = Yet another message
  with another line of text
this-is-a-fifth-key =
  .this-attribute1 = Yet another attribute

If you're in camp 1, you likely quickly scanned for = signs and backpaddled to see if the = is after an identifier which should be at the beginning of a line.
If you're in camp 2, you just tried to see an identifier character at the beginning of the line, and roughly assumed that it should be a beginning of the message.

I was always in camp 2, and it made me struggle to see the value of enforced = because I didn't use it in examples like that, but for FTL lists that had a lot of messages with just attributes and no value (like this-is-a-fifth-key) it brought noise.

Anyway, the reason I'm bringing it up is that in the current proposal, you relax the indentation rules around a different concept. One, that is not aligned with my human reading patterns despite several months of working with = enforced syntax.

From the extreme examples you provided:

key1 = {
$num ->
*[one]
 One
}

key2 =
.attr = Attribute

key3 = {
key1
}

key4 = {
DATETIME
(
$date,
weekday
:
"short"
)
}

key1 works because the line beginnings are $ and * which I know to be sigils, and the only line that could be confusing, conveniently starts with capital letter and a space. While a single space is not a lot, it's enough that combined with capital letter, I take a deeper look to "parse" the message and confirm that One is not a new key.

key2 works in a similar fashion - I see a . which tells me it's not a new message.

key3 is the one that really gets me. It feels like there's a broken message inside of a message. I understand that lack of = should help me, but it doesn't work at all.

key4 has mostly sigils to start lines except of all-caps DATETIME, which is not confusing mostly because it is all-caps and my brain don't associate message ids to be all-caps. The weekday line gets me again.

So, what did I learn about my processing? I learned that in order to mimic it in the syntax, we could say that a a-z after new line is a new message and forbid it anywhere else.
This means that:

  • error recovery is trivial. If it's a-z at the beginning of new line, it's a new message.
  • readability is high for all in camp2, but I assume that camp1 wouldn't have a problem with simpler recognition either
  • in syntax all we have to forbid is a-z showing up in patterns without indent.

While the logic of this proposal would change, the outcome would change only in a couple extreme cases:

  • The key3 example would require the variable to be indented
  • message indentation would only be required if the second+ line of the message starts with a-z.

As for caps A-Z we could either limit Identifiers to not start with them, or add them to the limitation. I haven't seen anyone trying to use caps in identifiers in a long time in any programming language, and since we want to use them in built-ins, I think it would be ok to not use them in ids.

I know my comment is long, but the proposal alteration is not that significant. Just instead of relax the indentation requirement for everything other than text content we'd use relax the indentation requirement for everything except of identifier-start.

wdyt?

from fluent.

stasm avatar stasm commented on May 21, 2024

Thanks for sharing your thoughts, @zbraniecki!

Why do you think that this should be enforced by the grammar, rather than by linting and style guides? I honestly don't suppose anyone will write their FTL the way key4 is written. It's an extreme example of what's possible but definitely not of what's recommended. It's similar to how other languages operate: if you write obfuscated and minified JS by hand, you don't expect people to instantly get it.

We can't protect users from their sense of aesthetics (or the lack thereof). If, as a localizer or developer, you want readable translation files, Fluent should give you every tool you need to do so. But you can be a bit sloppy, too, and things should mostly work, too. That's the Tolerance design principle at work.

we'd use relax the indentation requirement for everything except of identifier-start.

This is much harder to implement in practice. Or at least I'm not sure how to go about it. The way I'm thinking about the relaxation allows the parser to switch into a completely whitespace-insensitive mode as soon as it enters a placeable. The way you propose it, there would still be restrictions that apply depending on what's inside of the placeable.

Also, would you allow {\n-3.14} or require {\n -3.14}?

from fluent.

zbraniecki avatar zbraniecki commented on May 21, 2024

My only point is that error recovery (find the next start of the message) which partially informs the decision here, can (and I argue that it should) be aligned with human readability - because both, the parser and the human are tasked with the same task - "find the start of the message".

And what we're trying to do here is find what we have to forbid in order for that task to be easy/possible.

As I said, I'm not interested in blocking your proposal - just provided an alternative. :) I'm happy to go with your approach.

from fluent.

stasm avatar stasm commented on May 21, 2024

Thanks, @zbraniecki. I went ahead and merged the PR.

from fluent.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.