chrivers / transwarp
Transwarp compiler - a python3 implementation of a Simple Type Format parser and renderer
License: GNU General Public License v3.0
(This is an attempt at finalizing the syntax. Going to pull this out of issue #1, so it's easier to follow)
I've made an attempt at cleaning up the syntax for the STF files. Take a look here:
https://github.com/chrivers/isolinear-chips/tree/syntax-varation-alpha
(the compiler doesn't handle it yet.. I don't want to spend time updating it, until we agree somewhat on the syntax)
Here are the changes:
A:
object Creature(bitmask=3)
    ....
B:
!bitmask=3
object Creature
    ....
C:
object Creature
    !bitmask = 3
What do you think? This, or something else?
Here are some concrete changes for isolinear chips:
enum<MainScreenView, u32> syntax is gone! Because we can make references, we now have the simpler form: enums::MainScreenView<u32>. "enums" in this context means the enums file. This can be checked for validity by the compiler.
ships: sizedarray<structs::Ship, 8>. This, of course, refers to the "structs" file.
In general, I think the code is quite a lot easier to read now. It should also be significantly easier to parse and run templates on, since there are many, many fewer special cases. Death to non-generic sections!
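To illustrate why the generic form is easier to parse, here is a minimal sketch of a recursive parser for type expressions such as enums::MainScreenView<u32>. The names TypeRef and parse_type are hypothetical and not taken from the transwarp compiler; the sketch only assumes the name<arg, arg> shape shown above.

```python
from dataclasses import dataclass, field


@dataclass
class TypeRef:
    """A type expression: a name plus zero or more type arguments."""
    name: str                                  # e.g. "enums::MainScreenView"
    args: list = field(default_factory=list)   # nested TypeRef values


def parse_type(text: str) -> TypeRef:
    """Parse `name` or `name<arg, arg, ...>`; each arg is itself a type."""
    ref, rest = _parse(text.strip())
    if rest.strip():
        raise ValueError(f"unexpected trailing text: {rest!r}")
    return ref


def _parse(text: str):
    # Read the name up to the first delimiter character.
    i = 0
    while i < len(text) and text[i] not in "<>,":
        i += 1
    name = text[:i].strip()
    if not name:
        raise ValueError("missing type name")
    rest = text[i:]
    args = []
    if rest.startswith("<"):
        rest = rest[1:]
        while True:
            arg, rest = _parse(rest.lstrip())
            args.append(arg)
            rest = rest.lstrip()
            if rest.startswith(","):
                rest = rest[1:]
            elif rest.startswith(">"):
                rest = rest[1:]
                break
            else:
                raise ValueError("expected ',' or '>'")
    return TypeRef(name, args), rest
```

Note that the parser attaches no meaning to names like u32 or sizedarray; it only checks the shape, which leaves interpretation to whatever consumes the result.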
ping @NoseyNick @mrfishie - thoughts? ;-)
(continued from artemis-nerds/protocol-docs#50)
All section names (enum, object, parser, etc) are going to be completely free-form! This means that we can just use the ones we think are the most descriptive.
By this, do you mean that sections don't have a type?
No, rather that I think we will only need 1 section syntax. For example, you can very clearly see the similarities between "struct" and "object". If we make the arguments optional (or find another way to represent the data), there is literally no difference.
Then, the only difference is the name. This is used so the templates can say "give me the enum named foo" or "give me all object definitions". The following would be valid:
flumf BingiBongi
    This
    That
    stuff: really<there, are, no, restrictions>
    well: except_for<the_grammar_rules>
(ok, I might be tired, but I think you get the idea).
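A rough sketch of what "only one section syntax" could mean for a parser. It assumes that a section starts with an unindented "kind Name" line and that indented lines form its body; the indentation rule and the function name parse_sections are assumptions for illustration, not the compiler's actual behaviour. It handles only bare names and name: type lines.

```python
def parse_sections(text):
    """Split free-form sections into dicts of kind, name and fields.

    The section kind is kept as an opaque string, so "enum", "object"
    or "flumf" are all treated the same way.
    """
    sections = []
    current = None
    for raw in text.splitlines():
        if not raw.strip():
            continue  # skip blank lines
        if not raw[0].isspace():
            # Unindented line: a new section header, "kind Name".
            kind, name = raw.split()
            current = {"kind": kind, "name": name, "fields": []}
            sections.append(current)
            continue
        if current is None:
            raise ValueError(f"body line outside of a section: {raw!r}")
        line = raw.strip()
        if ":" in line:
            # "name: type" -- split on the first colon only, so a type
            # such as enums::Foo<u32> stays intact.
            field_name, field_type = line.split(":", 1)
            current["fields"].append((field_name.strip(), field_type.strip()))
        else:
            current["fields"].append((line, None))
    return sections
```

A template could then ask for "all sections of kind object" with a plain filter on the kind string, without the parser knowing what an object is.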
Since this is only markup, we can decide on a convention we like for the artemis protocol spec. Other projects might decide on other conventions, and so on. It's neither our duty nor place to make any such restrictions.
To clarify: This is NOT how the system works right now, but it's an idea I've been toying with from the beginning. I'm going to try it out.
The compiler doesn't care which stf file a definition comes from, so grouping it into different files is just to make it easier for us.
Yeah, I thought that would be the case, it's just that all the files seem to be named by the types of sections they contain, and then there's this random enum.
True - it sticks out a bit. If we had free-form section names, we could name it something more appropriate. Like:
canonicalnames FrameType
    valueFloat = 0x0351a5ac
    shipSystemSync = 0x077e9f3c
    clientConsoles = 0x19c6e2d4
    gmButton = 0x26faacb9
    ...
Just curious, how does the parser decide which stf files to open? Does it just open all of the ones in a folder?
Yes, they're all collected into one big data structure, which is (at the moment) a tree of all defined sections.
Conceptually, this would be:
+ root
  |
  + enums
  | | AlertStatus
  | | AudioCommand
  | \ ...
  |
  + structs
  | \ Ship
  |
  + packets
  | + ClientPacket
  | | ...
  | \ ServerPacket
  |   ...
  |
  \ objects
    + Anomaly
    \ Base
      ...
This extends all the way into the fields, but I think this is enough ascii-art to get the idea across :)
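As a sketch of how such a tree could be walked, the nested dict below mirrors the diagram, and a reference like structs::Ship is resolved one path component at a time. The dict layout and the name resolve are illustrative assumptions; the real compiler's data structure may look quite different.

```python
# Hypothetical in-memory form of the tree above: file name -> section name.
TREE = {
    "enums": {"AlertStatus": {}, "AudioCommand": {}},
    "structs": {"Ship": {}},
    "packets": {"ClientPacket": {}, "ServerPacket": {}},
    "objects": {"Anomaly": {}, "Base": {}},
}


def resolve(root, path):
    """Follow a '::'-separated path from the root and return the node."""
    node = root
    for part in path.split("::"):
        if part not in node:
            raise KeyError(f"cannot resolve {path!r}: no {part!r}")
        node = node[part]
    return node
```

This is also the check the compiler could run on every reference: if resolve raises, the reference points at nothing.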
I put it in parser.stf, because that's where it is used
But AFAIK it's not actually used by the STF files at all, but instead read by the parser, which is why I think it should be a different type (all other enums are used by the program) - but I suppose this doesn't matter anymore with the free-form sections.
Sorry, "used" was a poor term here. It's where a human would read it, when writing that section ;-)
It's (textually) where it is referenced, so it made maintenance easier. But as I said, unless we plan on giving the files namespaces themselves (perhaps that's not a bad idea!), then it doesn't matter to the output.
but there's really no need for the stf itself to be. It's really just a markup language.
Yeah - I suppose my stance on this project has been that we should make the structure language specifically to describe the Artemis formats, as it allows us to simplify some things in the documentation. But of course, there are pros and cons to each side.
I think there's very little (if any at all) to gain from taking Artemis-specific shortcuts. I'm not even sure what it would be?
I agree that if it constrains us from reaching a goal, we could revise the position. But I don't think that will be the case, since it works, today :)
Can you send me an email, and we can coordinate further?
Sent :)
Received!
You know, that bugs me too. I think the best option would be to find a short, memorable name. That's what I tried to do with the other Artemis-related projects I made. Suggestions welcome, it's still in beta :)
If you're wanting to publicise this and encourage people to use it for other projects, IMO a descriptive name for the format would probably be best (or at least a name that somewhat gives away what the format is for) - e.g. for a Star Trek layman, Isolinear Chips probably doesn't mean a whole lot (although in this case that's probably fine as it's not really meant as a major public project).
Yeah, I allowed myself a little nerding out on the naming there, since the target audience is still the Artemis community.
Then again, except as an example, there are no ties from isolinear chips to the compiler, so I don't imagine people who are not looking for the artemis spec will bump into it, in the future.
So this would be:
object Base:
    BITMASK_SIZE = 2

    # Name (bit 1.1, string)
    #
    # The name assigned to this base. In standard, non-custom
    # scenarios, base names will be unique, but there is no
    # guarantee that the same will be true in custom scenarios.
    name: string

    # Shields (bit 1.2, float)
    #
    # The current strength of the base's shields.
    front_shields: f32
Oh gosh, now we have equals signs and colons in the same sections... I get that it's to differentiate between constants and types, but it also seems very easy to mix these up while writing - perhaps there's a better way to separate these? (e.g. surround a section of constants with %, although this might be a bit ugly for enums)
Well, one simple rule could keep this controlled: All constants must come before all types. That would catch the majority of oops-my-typing kind of errors.
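The "constants before types" rule is cheap to enforce. A minimal sketch, assuming constants are written as name = value, typed fields as name: type, and comments start with #; the function name check_constants_first is made up for this example.

```python
def check_constants_first(body_lines):
    """Raise ValueError if a constant appears after a typed field."""
    seen_field = False
    for number, line in enumerate(body_lines, start=1):
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comments do not count
        # A constant has "=" with no ":" to the left of it.
        is_constant = "=" in line and ":" not in line.split("=", 1)[0]
        if is_constant:
            if seen_field:
                raise ValueError(
                    f"line {number}: constant after typed field: {line!r}"
                )
        else:
            seen_field = True
```

For the Base example, BITMASK_SIZE = 2 followed by name: string passes, while swapping the two lines is reported as an error.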
We really need some way to decorate the sections with non-body information, but here's another possibility:
typename SectionName(param1=value1, param2=value2)
    foo: u32
This could work, too. I'm afraid it could get unwieldy though, and we lose the generality of it. It's going to be quite hard to parse this form in more than one line, and it could lead to excruciatingly long lines.
We also can't forgo the names (like we do on types), since we want optional values. For example:
struct Ship210
    _max_version = 210
struct Ship240
    _min_version = 240
This would be a fairly clean way to add versioning information to sections.
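A template or parser could then pick sections by protocol version. The sketch below assumes that both bounds are inclusive and that a missing bound means "no limit"; neither is settled in this thread, and the names applies_to and select are hypothetical.

```python
def applies_to(constants, version):
    """True if `version` falls within the section's optional bounds."""
    low = constants.get("_min_version")    # None means no lower bound
    high = constants.get("_max_version")   # None means no upper bound
    return (low is None or version >= low) and (high is None or version <= high)


def select(sections, version):
    """Return the names of all sections that apply to `version`."""
    return [name for name, consts in sections.items() if applies_to(consts, version)]


# The two sections from the example above, reduced to their constants.
SECTIONS = {
    "Ship210": {"_max_version": 210},
    "Ship240": {"_min_version": 240},
}
```

With these definitions, select(SECTIONS, 200) picks Ship210 and select(SECTIONS, 240) picks Ship240, while a version between 210 and 240 matches neither, which the tooling would probably want to flag.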
there's no way to tell if, for instance, "f32" is the name of a struct, a built-in primitive, or something completely different?
Couldn't you just disallow using those names? It seems to me this is a bit like using a function in a language that's defined in the standard API, as opposed to one you've defined - sections could just be thought of as types defined in the program. I guess this depends on how you parse the format.
But that's the point - right now there are no defined types!
The type parser literally does not care what you write, as long as it is within the syntax. This allows the templates (and thus, the end-user project) to come up with a type description they like.
To clarify - we certainly could add a list of standard type names (u8, u16, u32, f32, string, etc.. ) and then ban those, but it doesn't really solve the problem.
For example, "ConsoleStatus" could refer either to an enum, or to ServerPacket::ConsoleStatus.
We have to find a nice unambiguous way to point to places in the namespace.
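To make the ambiguity concrete, here is a small sketch that searches a namespace tree for every place a bare name could refer to. The tree contents are a cut-down illustration based on the ConsoleStatus example, and find_paths is a hypothetical helper.

```python
# Cut-down namespace: ConsoleStatus exists both as an enum and as a packet.
NAMESPACE = {
    "enums": {"ConsoleStatus": {}},
    "packets": {"ServerPacket": {"ConsoleStatus": {}}},
}


def find_paths(node, name, prefix=()):
    """Return every '::'-joined path in the tree whose last part is `name`."""
    matches = []
    for key, child in node.items():
        path = prefix + (key,)
        if key == name:
            matches.append("::".join(path))
        # Keep searching below this node as well.
        matches.extend(find_paths(child, name, path))
    return matches
```

A bare name that yields more than one path cannot be resolved without extra information, which is the argument for writing the full path in the first place.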
I agree that the current solution isn't optimal, but stripping away just "struct" seems odd, and quite arbitrary. It also in a very real way makes the templates more complicated to write. Either that, or we need to have opinions about what constitute "standard types", but I don't like that.
It turns out the artemis protocol is so bloody inconsistent that this isn't true!
Oh wow, okay... It definitely wasn't like that back when I was working on my old parser, but clearly I haven't kept up-to-date with the documentation. After I read this I was considering pitching the idea of having enums specify a 'base type' and then places where the enum is used that don't use that type would provide their own, but I think that'd get far too annoying to work on and probably complicate the compiler considerably.
Agreed! The current solution is not perfect, but it's the least-bothersome one I could find on short notice :)
If you don't mind me asking, do you intend to implement a templating system/some kind of generator as well? Certainly, more implementations are better than fewer, but isn't it a little bit of reinventing the wheel we just made? :)
Potentially, I'm not yet sure - I may end up making a packet parser that reads these structure files and 'interprets' them on-the-fly (i.e. no code gen necessary). Also, since we want other people working on Artemis-related projects to be able to use the format, I thought it'd be a good test to make sure people other than you can write a parser that makes sense ;P
That's certainly an admirable goal - perhaps we should split the grammar and parsing portions into a separate project once we agree on a version 1.0 syntax.
Regarding the compiler, I can only say I was surprised by how long it took to go from working compiler (which didn't take long at all), to polished ready-to-run tool. I'm thrilled to see where we can take this next, and I hope we can all work to improve the syntax and the tools for everybody :)