Comments (7)
What is your preference?
I was planning to use flex's start conditions to handle this, but I can understand why context-sensitive stuff could be inconvenient.
I would like to reorganize and augment your list of solutions with some other ideas:
- solutions that work without changing the current grammar (I see all of those choices as implementation details left to the choice of the developer, although I agree parsing would be easier if it did not require advanced features or ugly hacks.)
  - context-sensitive scanner (e.g. flex's start conditions)
  - scan everything between Acceptance: and the next HEADER: or --BODY-- as a single token, then call a separate parser. This is tricky and makes error reporting a pain.
  - scanning [IF] as well as [IF][0-9]+ as some specific token types for use in the Acceptance: grammar rule, adjusting the implementation of the other rules to allow these tokens in addition to identifiers, and decoding [IF][0-9]+ only when appropriate in the parser. (I'm not calling those tokens keywords, because this solution is just one way to implement the given grammar.)
- solutions that change the grammar
  - require space between [FI] and [0-9]+, making F and I keywords, and fixing the specification to also allow F and I everywhere IDENTIFIER is used. This makes the implementation details discussed in the previous solution explicit in our document, somehow forcing the implementation. The only win is that the number can be decoded in the scanner.
  - replace F and I by characters that cannot be mistaken as identifiers (what about + and -?)
  - force acceptance set numbers to be enclosed in braces, as in Acceptance: 3 F{0} & (F {1} | F!{2}) & I{2}. This way the scanner can match [IF][[:space:]]*!?{[0-9]+} as a token, and this won't conflict with an identifier. As braces are also used to specify acceptance sets in automata, there is some homogeneity.
I like the last two solutions better.
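To make the conflict concrete, here is a minimal sketch in Python (not flex) that checks which pattern accepts a given lexeme. The IDENTIFIER regexp is the one quoted later in this thread; the brace pattern is the one proposed above, with [[:space:]] written as \s and the braces escaped. The names classify and ACC_BRACED are hypothetical, used only for this illustration.

```python
import re

# IDENTIFIER regexp as quoted later in this thread.
IDENTIFIER = re.compile(r"([a-eg-su-zA-Z_][0-9a-zA-Z_-]*|[tf][0-9a-zA-Z_-]+)")
# Brace-enclosed acceptance token from the proposal above
# ([[:space:]] becomes \s, braces are escaped for Python).
ACC_BRACED = re.compile(r"[IF]\s*!?\{[0-9]+\}")

def classify(lexeme):
    """Return the names of the patterns that match the whole lexeme."""
    names = []
    if IDENTIFIER.fullmatch(lexeme):
        names.append("IDENTIFIER")
    if ACC_BRACED.fullmatch(lexeme):
        names.append("ACC_BRACED")
    return names

for lexeme in ["F0", "F{0}", "F {1}", "F!{2}"]:
    print(lexeme, classify(lexeme))
```

F0 is accepted by the IDENTIFIER pattern, which is the conflict under discussion, while the brace forms are accepted only by the dedicated pattern.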
For t and f I would either:
- leave the grammar as is, maybe with a warning that t and f are also valid identifiers (dealing with this doesn't really seem very difficult)
- declare that t and f are BOOLEAN constants, cannot be used as identifiers, but can appear in header lines as in HEADERNAME (INT|STRING|IDENTIFIER|BOOLEAN)*.
- state that t and f can only be used in guard as (f) or (t), and have the scanner recognize those tokens instead. This looks clumsy if we consider aliases as well. (I don't like this.)
from hoaf.
For t and f, my preference would be solution 2 (BOOLEAN).
For the acceptance condition, I have a preference for 2.ii (+0, -!2), even though it doesn't look that pretty. I'm not a fan of multiple tokenization (such as F{!0} with separate tokenization afterwards), as this restricts the places where comments or whitespace can be placed. But I wouldn't object to needing a context-sensitive lexer, if there is a preference for allowing F0 and I0 for aesthetic reasons.
There is another question that occurred to me just now: What should be the interpretation of acc-name: Ra--ABORT--?
Is this just an identifier? I would be fine with that, as one can prepend a blank before --ABORT-- to abort an arbitrary stream without risking being inside an identifier. Another option would be to switch to ==BODY==, ==END== and ==ABORT==, then there is no ambiguity with identifiers. Either way, I don't think there is a way to insert a --ABORT-- into a non-parsed stream at an arbitrary location, as the stream could be in the middle of a comment or "quoted string". But that's probably not a big deal.
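A small Python sketch of the Ra--ABORT-- question, using the IDENTIFIER regexp quoted elsewhere in this thread. Because - is allowed inside identifiers, a longest-match scanner swallows the marker unless a blank precedes it. The helper name first_token is hypothetical.

```python
import re

# IDENTIFIER regexp as quoted in this thread; '-' is a legal identifier character.
IDENTIFIER = re.compile(r"([a-eg-su-zA-Z_][0-9a-zA-Z_-]*|[tf][0-9a-zA-Z_-]+)")

def first_token(text):
    """Return the longest identifier at the start of text, or None."""
    m = IDENTIFIER.match(text)
    return m.group(0) if m else None

print(first_token("Ra--ABORT--"))   # the whole string is one identifier
print(first_token("Ra --ABORT--"))  # the blank ends the identifier after "Ra"
```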
from hoaf.
Branch issue/32 contains two patches marking t/f as BOOLEAN and stating that --ABORT-- should be a separate token.
So we are left with the F and I stuff. I agree + and - aren't pretty. We should probably also aim for a solution that allows future extensions. For instance, what if in the future we want to add support for "occurrence acceptance" where a transition or state has to be visited at least once?
How about this: use F and I as functions with parentheses, as in F(0) & (F (1) | F(!2)) & I(2). The grammar would use identifiers instead of F or I:
    acceptance-cond ::= IDENTIFIER "(" "!"? INT ")"
                      | (acceptance-cond)
                      | acceptance-cond & acceptance-cond
                      | acceptance-cond | acceptance-cond
Our document would only specify the meaning of the identifiers F and I.
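To check that this grammar needs no context-sensitive scanning, here is a minimal recursive-descent sketch in Python. Several things are assumptions of the sketch rather than part of the proposal: & is taken to bind tighter than | (the grammar as written leaves precedence open), the identifier pattern is a simplified one, and the names tokenize, Parser and parse_acceptance as well as the tuple-based result are hypothetical.

```python
import re

# One regexp per token kind; INT is tried before ID, operators last.
TOKEN = re.compile(
    r"\s*(?:(?P<INT>[0-9]+)|(?P<ID>[a-zA-Z_][0-9a-zA-Z_-]*)|(?P<OP>[()!&|]))"
)

def tokenize(text):
    """Split text into (kind, value) pairs, skipping whitespace."""
    text, pos, out = text.rstrip(), 0, []
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise SyntaxError(f"unexpected character at offset {pos}")
        out.append((m.lastgroup, m.group(m.lastgroup)))
        pos = m.end()
    return out

class Parser:
    def __init__(self, tokens):
        self.toks, self.i = tokens, 0

    def peek(self):
        return self.toks[self.i] if self.i < len(self.toks) else (None, None)

    def eat(self, kind, value=None):
        k, v = self.peek()
        if k != kind or (value is not None and v != value):
            raise SyntaxError(f"expected {value or kind}, got {v!r}")
        self.i += 1
        return v

    def parse_or(self):
        # Assumption: '|' has the lowest precedence.
        node = self.parse_and()
        while self.peek() == ("OP", "|"):
            self.eat("OP", "|")
            node = ("|", node, self.parse_and())
        return node

    def parse_and(self):
        node = self.parse_atom()
        while self.peek() == ("OP", "&"):
            self.eat("OP", "&")
            node = ("&", node, self.parse_atom())
        return node

    def parse_atom(self):
        if self.peek() == ("OP", "("):
            self.eat("OP", "(")
            node = self.parse_or()
            self.eat("OP", ")")
            return node
        # IDENTIFIER "(" "!"? INT ")"
        name = self.eat("ID")
        self.eat("OP", "(")
        negated = self.peek() == ("OP", "!")
        if negated:
            self.eat("OP", "!")
        number = int(self.eat("INT"))
        self.eat("OP", ")")
        return (name, negated, number)

def parse_acceptance(text):
    parser = Parser(tokenize(text))
    node = parser.parse_or()
    if parser.peek() != (None, None):
        raise SyntaxError(f"trailing input: {parser.peek()[1]!r}")
    return node

print(parse_acceptance("F(0) & (F (1) | F(!2)) & I(2)"))
```

Since the parser accepts any identifier before the parenthesis, giving a meaning to names other than F and I later would only touch the semantic layer, not the scanner or the grammar.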
from hoaf.
I like the F(0) & (F (1) | F(!2)) & I(2) syntax, that's a good, extensible solution.
In regard to IDENTIFIER: ([a-eg-su-zA-Z_][0-9a-zA-Z_-]*|[tf][0-9a-zA-Z_-]+), I would prefer keeping the old regexp and just stating that t and f are not identifiers. That's usually easy to ensure by the order of the lexical rules for the lexer.
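As a rough illustration of the rule-order argument, the Python sketch below imitates the way flex resolves matches: the longest match wins, and on a tie the rule listed first wins. The "old" IDENTIFIER regexp is not quoted in this thread, so the pattern used here (the quoted one without the t/f exclusion) is an assumption; RULES and scan are hypothetical names.

```python
import re

# Rules in priority order: BOOLEAN is deliberately listed before IDENTIFIER.
RULES = [
    ("BOOLEAN", re.compile(r"[tf]")),
    # Assumed form of the old IDENTIFIER regexp (no special case for t and f).
    ("IDENTIFIER", re.compile(r"[a-zA-Z_][0-9a-zA-Z_-]*")),
]

def scan(text):
    """Tokenize text with longest-match, first-rule-wins-on-tie semantics."""
    tokens, pos = [], 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        best = None
        for name, pattern in RULES:
            m = pattern.match(text, pos)
            # Only a strictly longer match replaces an earlier rule's match.
            if m and (best is None or m.end() > best[1].end()):
                best = (name, m)
        if best is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append((best[0], best[1].group(0)))
        pos = best[1].end()
    return tokens

print(scan("t f true t0 foo"))
```

A lone t or f ties between both rules and goes to BOOLEAN, while true, t0 and foo are longer matches for IDENTIFIER, so the identifier regexp itself needs no exclusion.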
from hoaf.
OK, I've reworked these patches not to change the IDENTIFIER regexp, and also changed all F(x) and I(x). Can you check those changes and see if you agree?
from hoaf.
Great, I fixed a small typo, otherwise this looks fine. As a next step, I would then extract the example automata into an examples subdirectory, so we can use them for testing parser implementations. OK?
from hoaf.
Thanks for the proofreading. I've put those patches on master and deleted the branch.
from hoaf.