rojo-rbx / rbx-dom
Roblox DOM and (de)serialization implementation in Rust
License: MIT License
rbx_dom_weak's RbxTree API and related items aren't very idiomatic.
Drop the Rbx prefix from everything? RbxTree -> Tree, RbxInstanceProperties -> InstanceProperties
RbxInstance -> RbxInstanceProperties
Rename RbxInstance methods to remove the get_ prefix: get_id() -> id()
RbxInstance references?
Change RbxInstanceProperties to be a builder with private fields?
rbx_dom_weak should have an API to fudge the type of a value over to another value if possible. This is useful for increasing content compatibility in deserialization, as well as enabling backward compatibility when the types of properties change.
A baseline API was introduced in #32, but it should be refined and documented.
Necessary for Kampfkarren/selene#59.
This idea formed when we got types like ColorSequence that are useful to reason about on their own.
It'd be nice to have actual types for values like Color3 and Vector2 so that we don't have to continually talk about their representations, [f32; 3] and [f32; 2].
This would break existing consumers of rbx_dom_weak, but would be a pretty nice improvement.
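As a rough sketch of what dedicated value types could look like, the block below wraps the two array representations in small structs with conversions in both directions. The struct layouts and the From impls are assumptions made for illustration, not the API rbx_dom_weak ships.

```rust
/// Hypothetical named type replacing the bare [f32; 3] representation.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Color3 {
    pub r: f32,
    pub g: f32,
    pub b: f32,
}

/// Hypothetical named type replacing the bare [f32; 2] representation.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Vector2 {
    pub x: f32,
    pub y: f32,
}

// Conversions from the old array form keep migration cheap for consumers.
impl From<[f32; 3]> for Color3 {
    fn from([r, g, b]: [f32; 3]) -> Self {
        Color3 { r, g, b }
    }
}

impl From<Color3> for [f32; 3] {
    fn from(color: Color3) -> Self {
        [color.r, color.g, color.b]
    }
}

impl From<[f32; 2]> for Vector2 {
    fn from([x, y]: [f32; 2]) -> Self {
        Vector2 { x, y }
    }
}

impl From<Vector2> for [f32; 2] {
    fn from(vector: Vector2) -> Self {
        [vector.x, vector.y]
    }
}
```

Keeping the array conversions around would let existing consumers move over one call site at a time.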
Right now, empty BinaryString values serialize like this:
<BinaryString name="Tags"><![CDATA[]]></BinaryString>
but they should serialize like this instead:
<BinaryString name="Tags"></BinaryString>
This is safe because it lines up with how Roblox serializes these values.
The errors given by rbx_xml and rbx_binary are terrible.
Using an rbxlx version of the open source version of Miner's Haven as a benchmark.
$ cat decode.lua
remodel.readPlaceFile("minershaven.rbxlx")
$ time remodel decode.lua
real 0m24.496s
user 0m0.000s
sys 0m0.000s
$ du -sh minershaven.rbxl minershaven.rbxlx
7.6M minershaven.rbxl
143M minershaven.rbxlx
We might be able to recover some time by switching to quick-xml, according to some profiling done by @Kampfkarren. Further investment in the binary format is the real long-term solution to this, however.
MaxPlayers serializes as MaxPlayersInternal
PreferredPlayers serializes as PreferredPlayersInternal
Sometimes classes aren't in the reflection database, for example if they were added after the reflection database was generated or if you're using a custom Studio build.
You'll get an error here: rbx-dom/rbx_dom_lua/src/init.lua, line 11 in f1dfe97
Right now, each instance in WeakDom contains a Vec<Ref> for children. Instead, we can make each node keep the referent of its first child and its next sibling. This would remove the need to allocate a children list for each node.
This approach is commonplace in DOM implementations, including (I believe) many web browsers' HTML DOM implementations.
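A minimal sketch of the first-child/next-sibling layout follows. The Ref, Node, and Dom types here are stand-ins invented for the example, and the extra last_child link is an added assumption so that appending a child stays O(1); the real WeakDom may want a different trade-off.

```rust
/// Hypothetical stand-in for the DOM's referent type (an index into `nodes`).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Ref(usize);

struct Node {
    name: String,
    first_child: Option<Ref>,
    // Assumption: tracking the last child makes appends O(1).
    last_child: Option<Ref>,
    next_sibling: Option<Ref>,
}

pub struct Dom {
    nodes: Vec<Node>,
}

impl Dom {
    pub fn new(root_name: &str) -> (Self, Ref) {
        let root = Node {
            name: root_name.to_string(),
            first_child: None,
            last_child: None,
            next_sibling: None,
        };
        (Dom { nodes: vec![root] }, Ref(0))
    }

    /// Appends a new child under `parent` without allocating a child list.
    pub fn insert(&mut self, parent: Ref, name: &str) -> Ref {
        let id = Ref(self.nodes.len());
        self.nodes.push(Node {
            name: name.to_string(),
            first_child: None,
            last_child: None,
            next_sibling: None,
        });

        let previous_last = self.nodes[parent.0].last_child;
        match previous_last {
            Some(last) => self.nodes[last.0].next_sibling = Some(id),
            None => self.nodes[parent.0].first_child = Some(id),
        }
        self.nodes[parent.0].last_child = Some(id);
        id
    }

    /// Walks the sibling chain starting at the parent's first child.
    pub fn children(&self, parent: Ref) -> impl Iterator<Item = Ref> + '_ {
        let mut next = self.nodes[parent.0].first_child;
        std::iter::from_fn(move || {
            let current = next?;
            next = self.nodes[current.0].next_sibling;
            Some(current)
        })
    }

    pub fn name(&self, id: Ref) -> &str {
        &self.nodes[id.0].name
    }
}
```

The cost of this layout is that indexing the nth child becomes a linear walk, so it suits consumers that mostly iterate.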
This might not make total sense since it'll result in massive monomorphization with varying depths of &mut reference and might cause lots of indirection.
Some properties like MeshPart.MeshId are marked as read-write by our input data, but in practice cannot be written to. We should be able to write back the same value that the property already has in order to detect problems here without needing to manually patch.
Similarly, we currently skip trying to collect default property info for properties that throw errors when accessed. Instead, we can downgrade their scriptability status.
It'd be great to make rbx-dom the go-to public implementation of the Roblox DOM and file formats. This would help cement that idea.
It should be possible to configure both the serialization and deserialization methods in rbx_xml.
The only big option I think we need right now is the ability to turn off reflection-driven serialization.
This is a good opportunity to break the API to pick better names. I like the idea of picking names inspired by serde_json's API (but with a better name for the options struct):
rbx_xml::from_str(tree: &mut RbxTree, id: RbxId, source: &str, options: DeOptions)
rbx_xml::to_writer(output: &mut W, tree: &RbxTree, ids: &[RbxId], options: SeOptions)
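One possible shape for the deserialization options, sketched under assumptions: the name DecodeOptions, the builder-style setter, and the default of reflection being enabled are all invented here. Only the idea of a switch for reflection-driven behavior comes from the text above.

```rust
/// Hypothetical options struct standing in for `DeOptions`.
#[derive(Debug, Clone, PartialEq)]
pub struct DecodeOptions {
    use_reflection: bool,
}

impl Default for DecodeOptions {
    fn default() -> Self {
        // Assumption: reflection-driven behavior stays on unless disabled.
        DecodeOptions { use_reflection: true }
    }
}

impl DecodeOptions {
    pub fn new() -> Self {
        Self::default()
    }

    /// Builder-style setter so new options can be added without breaking callers.
    pub fn reflection(mut self, enabled: bool) -> Self {
        self.use_reflection = enabled;
        self
    }

    pub fn reflection_enabled(&self) -> bool {
        self.use_reflection
    }
}
```

Keeping the field private means further options can be added later without another breaking change.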
A good example of this is BasePart.Color, which should serialize to a property called Color3uint8 of type Color3uint8. Instead, rbx_xml currently serializes it as a property named Color3uint8 with the type Color3; it missed the conversion!
This happens because we pull the type we want from the canonical property descriptor instead of the serialized name descriptor. If serialized_name() is Some, we should look up that property descriptor and convert to its type instead.
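The lookup described above might be sketched as follows. The PropertyDescriptor struct, the ValueType enum, and the table built by base_part_properties are simplified stand-ins rather than the real rbx_reflection types, and keying the canonical color property as "Color" is an assumption.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq)]
enum ValueType {
    Color3,
    Color3uint8,
}

/// Hypothetical, simplified property descriptor.
struct PropertyDescriptor {
    value_type: ValueType,
    /// Name used on disk, when it differs from the canonical name.
    serialized_name: Option<&'static str>,
}

/// Returns the name and type a property should be written with.
/// When a serialized name exists, the type comes from *that* descriptor,
/// not from the canonical one.
fn serialized_form(
    properties: &HashMap<&'static str, PropertyDescriptor>,
    canonical_name: &'static str,
) -> Option<(&'static str, ValueType)> {
    let canonical = properties.get(canonical_name)?;
    match canonical.serialized_name {
        Some(name) => properties.get(name).map(|d| (name, d.value_type)),
        None => Some((canonical_name, canonical.value_type)),
    }
}

/// Tiny illustrative table for the color example.
fn base_part_properties() -> HashMap<&'static str, PropertyDescriptor> {
    let mut properties = HashMap::new();
    properties.insert(
        "Color",
        PropertyDescriptor {
            value_type: ValueType::Color3,
            serialized_name: Some("Color3uint8"),
        },
    );
    properties.insert(
        "Color3uint8",
        PropertyDescriptor {
            value_type: ValueType::Color3uint8,
            serialized_name: None,
        },
    );
    properties
}
```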
I am currently getting this error:
Message("don\'t know how to decode this prop type")
This is a pretty bad error; it doesn't even say the property type.
Depends on #60.
Right now, refs serialized to XML are transformed to match how Roblox generates its GUID-like values:
Instead of doing this, we should probably be keeping a ref map and rewriting them to IDs as instances are deserialized, since we don't treat the instance's ref as meaningful!
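A ref map along these lines could be sketched as below. InstanceId, RefMap, and the referent strings used with it are hypothetical; the sketch also assumes IDs can be handed out the first time a referent is seen, which is what lets forward references resolve.

```rust
use std::collections::HashMap;

/// Hypothetical ID type assigned by the tree, unrelated to the file's referent.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct InstanceId(u32);

/// Maps referent strings found in the file to IDs chosen during decoding.
#[derive(Default)]
struct RefMap {
    by_referent: HashMap<String, InstanceId>,
    next: u32,
}

impl RefMap {
    /// Returns the ID for a referent, allocating one the first time it is seen.
    /// The same call serves both instance definitions and Ref properties.
    fn resolve(&mut self, referent: &str) -> InstanceId {
        if let Some(&id) = self.by_referent.get(referent) {
            return id;
        }
        let id = InstanceId(self.next);
        self.next += 1;
        self.by_referent.insert(referent.to_owned(), id);
        id
    }
}
```

Because the referent text is only ever used as a map key, its exact format stops mattering to the decoder.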
I was going to use rbx_reflection to generate a .luacheckrc-like file, and while methods are described in detail in the API dump, they're not accessible at all in rbx_reflection.
One big advantage of using Rust as the backbone for a DOM crate like this is that it can target WebAssembly without lugging around a runtime or libc like Emscripten.
It should be pretty easy to create a binding library around the rbx-dom ecosystem using wasm-bindgen. This would open the door to having a browser-based model viewer and editor, which would be awesome!
Roblox's binary format doesn't distinguish between strings meant for humans to read and "strings" that are just binary blobs.
This makes Rust angry, since the current version of rbx_binary tries to jam both into String, which is required to be valid UTF-8. We can fix this with reflection guidance for rbx_binary, but there may be a quicker fix (if the string isn't valid UTF-8, just throw it in as a BinaryString?).
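The quicker fix could look roughly like this. The Value enum is a cut-down stand-in for the real value type, so treat the variant names as assumptions.

```rust
/// Hypothetical two-variant value type for the example.
#[derive(Debug, PartialEq)]
enum Value {
    String(String),
    BinaryString(Vec<u8>),
}

/// Keeps valid UTF-8 as a String and falls back to BinaryString otherwise.
fn decode_string(bytes: Vec<u8>) -> Value {
    match String::from_utf8(bytes) {
        Ok(text) => Value::String(text),
        // into_bytes hands back the original buffer without copying.
        Err(err) => Value::BinaryString(err.into_bytes()),
    }
}
```

One caveat with this heuristic: a binary blob that happens to be valid UTF-8 still ends up as a String, which is why reflection guidance remains the more robust option.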
Right now, when rbx_xml fails to convert a value to the right type when serializing or deserializing, it silently lets the value continue as-is.
We should consider making it fail, or at least fail by default.
Right now the reflection code runs at a higher security level than most code consuming rbx_dom_lua.
We can fix this by either changing where we put the generated plugin, or porting the tool to use https://github.com/LPGhatguy/run-in-roblox.
There might be value in running the generation code at multiple security levels and merging the results in order to collect security information and paint a more complete picture.
rbx_xml's errors are better, but rbx_binary's are still really bad.
It looks like Roblox's XML BinaryString decoder chokes on input that isn't wrapped at the 72 byte mark. I don't think it's an email client, but we should give it what it wants anyways.
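A small sketch of wrapping already-encoded base64 text at a fixed width. The function name is made up, and using a bare "\n" as the line separator is an assumption about what Roblox accepts.

```rust
/// Splits base64 text into lines of at most `width` characters.
/// Chunking by bytes is safe because base64 output is pure ASCII.
/// `width` must be non-zero.
fn wrap_base64(encoded: &str, width: usize) -> String {
    encoded
        .as_bytes()
        .chunks(width)
        .map(|chunk| std::str::from_utf8(chunk).expect("base64 output is ASCII"))
        .collect::<Vec<_>>()
        .join("\n")
}
```

The serializer would call this with a width of 72 before writing the BinaryString body.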
BillboardGui's MaxDistance is INF and this presumably causes issues. I assume this happens for nan too.
Got this error while writing a program:
113 | match dbg!(default.try_convert_ref(value.get_type())) {
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ `rbx_dom_weak::value::RbxValueConversion` cannot be formatted using `{:?}` because it doesn't implement `std::fmt::Debug`
Currently the reflection database is entirely public fields. We should replace those with getters, since everything is read-only anyways.
This would be a good time to fix up the names of these structs as well to mention words like "reflection" and "descriptor" instead of just prefixing everything with Rbx.
Despite #33, INF/NAN does not work in Vector2 (and others with the same thought process) likely because they use the direct float parse methods instead of the one used for Float32/Float64.
https://github.com/LPGhatguy/rbx-dom/blob/master/rbx_xml/src/types/vectors.rs#L37
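One way to share the special-case handling is to route every component through a single helper, sketched below. The function names are hypothetical, and the spellings INF, -INF, and NAN are assumed to be what appears in the XML.

```rust
use std::num::ParseFloatError;

/// Parses one float component, accepting the assumed special spellings
/// before falling back to the standard parser.
fn parse_f32(text: &str) -> Result<f32, ParseFloatError> {
    match text.trim() {
        "INF" => Ok(f32::INFINITY),
        "-INF" => Ok(f32::NEG_INFINITY),
        "NAN" => Ok(f32::NAN),
        other => other.parse(),
    }
}

/// Vector types reuse the same helper instead of calling `parse` directly.
fn parse_vector2(x: &str, y: &str) -> Result<[f32; 2], ParseFloatError> {
    Ok([parse_f32(x)?, parse_f32(y)?])
}
```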
For Fire, Size serializes as size_xml, and another property I can't remember has a weird name too. Need to add an XML<->Canonical name mapping for Fire.
Perhaps this could be an optional feature?
The signature of RbxTree::remove_instance seems like it would just return None if the instance doesn't exist in the tree, but instead it panics because of the way it calls orphan_instance.
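The non-panicking behavior the signature suggests might be sketched like this. Tree and Instance are simplified stand-ins keyed by plain integers, not the real RbxTree, and this version returns only the removed root rather than a detached subtree.

```rust
use std::collections::HashMap;

struct Instance {
    name: String,
    parent: Option<u32>,
    children: Vec<u32>,
}

/// Hypothetical, simplified tree keyed by integer IDs.
struct Tree {
    instances: HashMap<u32, Instance>,
    next_id: u32,
}

impl Tree {
    fn new() -> Self {
        Tree { instances: HashMap::new(), next_id: 0 }
    }

    fn insert(&mut self, name: &str, parent: Option<u32>) -> u32 {
        let id = self.next_id;
        self.next_id += 1;
        self.instances.insert(
            id,
            Instance { name: name.to_string(), parent, children: Vec::new() },
        );
        if let Some(p) = parent.and_then(|p| self.instances.get_mut(&p)) {
            p.children.push(id);
        }
        id
    }

    /// Removes an instance and its descendants.
    /// Returns None instead of panicking when the ID is unknown.
    fn remove_instance(&mut self, id: u32) -> Option<Instance> {
        let instance = self.instances.remove(&id)?;

        // Detach from the parent's child list, if the parent still exists.
        if let Some(parent) = instance.parent.and_then(|p| self.instances.get_mut(&p)) {
            parent.children.retain(|&child| child != id);
        }

        // Drop all descendants so nothing is left orphaned in the map.
        let mut pending = instance.children.clone();
        while let Some(child) = pending.pop() {
            if let Some(removed) = self.instances.remove(&child) {
                pending.extend(removed.children);
            }
        }
        Some(instance)
    }
}
```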
The only piece of metadata in the engine is ExplicitAutoJoints right now. For the best compatibility and least-surprising behavior, we should always write that value to true in the metadata section of all formats.
Since there might be more metadata added to the engine related to content versioning, rbx_dom_weak and related packages should probably gear up for handling more metadata. This can happen either as a property of a tree or as a parameter to serialization.
One way to approach this would be to introduce a new member to WeakDom named metadata, perhaps of type HashMap<String, String>.
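A sketch of the metadata member, assuming it lives on the tree and is pre-seeded with ExplicitAutoJoints set to true. The accessor names are invented, and this WeakDom is an empty stand-in that carries nothing but the metadata.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for WeakDom that only models the metadata member.
pub struct WeakDom {
    metadata: HashMap<String, String>,
}

impl WeakDom {
    pub fn new() -> Self {
        let mut metadata = HashMap::new();
        // Seed the one piece of metadata the engine currently defines.
        metadata.insert("ExplicitAutoJoints".to_owned(), "true".to_owned());
        WeakDom { metadata }
    }

    /// Read-only view that serializers would write to the metadata section.
    pub fn metadata(&self) -> &HashMap<String, String> {
        &self.metadata
    }

    /// Lets deserializers and consumers record additional entries.
    pub fn set_metadata(&mut self, key: &str, value: &str) {
        self.metadata.insert(key.to_owned(), value.to_owned());
    }
}
```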
I think the tests we've got for the serialization libraries are insufficient!
Rojo adopted the 'insta' crate for this and it's worked very well. I think it would help us be more confident with changes like #76.
This should be pretty straightforward in the context of RbxValue::try_convert_ref. I'm not sure if we want to bother trying to go the other direction since it's lossy.
I'm not sure if I'm using it wrong, but rbx-xml doesn't seem to add anything other than the initial root node you create. Here was the code I was using:
lazy_static! {
    #[rustfmt::skip]
    static ref HATS_TREE: RbxTree = {
        let mut tree = RbxTree::new(RbxInstance {
            name: "Hats".to_string(),
            class_name: "Hats".to_string(),
            properties: HashMap::new(),
        });
        let root_id = tree.get_root_id();
        rbx_xml::decode(
            &mut tree,
            root_id,
            fs::File::open("./test/roblox.rbxmx").expect("Couldn't read roblox.rbxmx"),
        ).expect("rbxmx failed to parse");
        tree
    };
}
Followed by:
println!("{:?}", HATS_TREE.descendants(HATS_TREE.get_root_id()).collect::<Vec<_>>());
When this is used, it only shows the root node and no descendants.
[RootedRbxInstance { instance: RbxInstance { name: "Hats", class_name: "Hats", properties: {} }, id: RbxId(Uuid([15, 19, 159, 200, 130, 135, 75, 151, 140, 168, 23, 144, 220, 6, 153, 57])), children: [], parent: None }]
rbx-binary on the other hand shows everything as expected.
I'm pretty confident that the reflection database generated in rbx_reflection is pretty solid, and in the cases it isn't we can apply temporary fixups.
In order to make rbx-dom more compatible with Roblox, we should be able to refactor both rbx_xml and rbx_binary to use the reflection database instead of blindly deserializing and serializing every property.
I think I'll start with rbx_binary since it needs attention anyways.
This looks like it was an oversight.
It might be a small breaking change to fix this iterator, but I'm not super worried about it since it's generally only useful for debugging. It may affect Rojo.
There are a few minor heuristics around the reflection database that exist right now. We should expand tests in a couple ways:
Strings are already marked as an ambiguous case, since they can resolve to either a string or an enum variant.
We should also allow strings to resolve to Content, and potentially also BinaryString depending on how hip we're feeling.
The Faces type is only used in the Handles instance.
The Ray type is only serialized in the RayValue instance, but would still be useful for us to implement.
Right now, we have a weakly-typed DOM implementation. It'd be cool to have a strongly-typed one, too.
Right now, unimplemented properties waffle between failing the operation in progress and panicking using unimplemented!(). We should migrate all of the cases where rbx-dom libraries panic and turn them into proper errors instead.
Opacity -> opacity_xml
Size -> size_xml
RiseVelocity -> riseVelocity_xml
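A name mapping like the one listed above could start as a plain lookup table. The table and function names are invented for this sketch, and it ignores class names entirely; a real mapping would presumably be keyed per class.

```rust
/// Canonical name paired with the name used in XML files.
/// Assumption: a flat table is enough for illustration.
const XML_NAME_OVERRIDES: &[(&str, &str)] = &[
    ("Opacity", "opacity_xml"),
    ("Size", "size_xml"),
    ("RiseVelocity", "riseVelocity_xml"),
];

/// Canonical -> XML; names without an override pass through unchanged.
fn to_xml_name(canonical: &str) -> &str {
    XML_NAME_OVERRIDES
        .iter()
        .find(|(name, _)| *name == canonical)
        .map(|(_, xml)| *xml)
        .unwrap_or(canonical)
}

/// XML -> canonical; the inverse lookup over the same table.
fn to_canonical_name(xml_name: &str) -> &str {
    XML_NAME_OVERRIDES
        .iter()
        .find(|(_, xml)| *xml == xml_name)
        .map(|(name, _)| *name)
        .unwrap_or(xml_name)
}
```

Driving both directions from one table keeps the serializer and deserializer from drifting apart.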
It'd be useful to make RbxValue operate similarly to std::borrow::Cow, where they can contain owned or borrowed data. Adding a single lifetime parameter to RbxValue and seeing where necessary changes fall out should be insightful!
Most stuff can move from what is currently RbxValue to RbxValue<'static>, but some things like Rojo's instance snapshot system could benefit pretty greatly from having borrowed RbxValue objects!
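To make the idea concrete, here is a cut-down value enum with a single lifetime parameter. The variants shown and the into_owned helper are assumptions for illustration; the real RbxValue has many more variants.

```rust
use std::borrow::Cow;

/// Hypothetical value type that can hold owned or borrowed data.
#[derive(Debug, Clone, PartialEq)]
enum Value<'a> {
    String(Cow<'a, str>),
    BinaryString(Cow<'a, [u8]>),
    Bool(bool),
}

impl<'a> Value<'a> {
    /// Copies any borrowed data so the value can outlive its source,
    /// giving the `'static` form that most existing code would use.
    fn into_owned(self) -> Value<'static> {
        match self {
            Value::String(text) => Value::String(Cow::Owned(text.into_owned())),
            Value::BinaryString(bytes) => Value::BinaryString(Cow::Owned(bytes.into_owned())),
            Value::Bool(flag) => Value::Bool(flag),
        }
    }
}
```

A snapshot system could then build borrowed values straight from its source buffers and only pay for a copy when a value is actually stored in a tree.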
The Axes type is only used in the ArcHandles instance.
The fact that rbx_reflection is a generated Rust module is a neat trick, but ultimately problematic for bugs and implementation maintenance. We also have to maintain a parallel generator for rbx_dom_lua, and sometimes feature parity is messed up.
Additionally, it becomes difficult to update the reflection information out-of-band. It'd be nice to distribute new reflection information automatically that applications could update to, but baking it into the executable makes that more difficult than it should be.
I think there is a reasonable set of steps to solve this problem:
There may be a small performance cost to switching from generated code to serialized modules, but if we choose the right format, it should be minimal.
We should be able to create a shared string dictionary per tree. When moving instances between trees, it should be possible to add entries from the shared string table.
I'm not super sure how the entries should be cleaned up, especially since it's possible for consumers of rbx_dom_weak to mutate properties arbitrarily. Maybe that'll be left up to consumers? Maybe there could be some ref-counting nonsense?
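One ref-counting approach, sketched under assumptions: the dictionary holds only weak handles, values hold strong ones, and an explicit prune pass drops entries nobody references anymore. SharedStrings and its methods are hypothetical, and keying by the full contents is a simplification; a real table would more likely key by a hash.

```rust
use std::collections::HashMap;
use std::rc::{Rc, Weak};

/// Hypothetical per-tree dictionary of shared byte strings.
#[derive(Default)]
struct SharedStrings {
    // Weak handles: the dictionary alone never keeps an entry alive.
    entries: HashMap<Vec<u8>, Weak<[u8]>>,
}

impl SharedStrings {
    /// Returns a shared handle, reusing a live entry with identical contents.
    fn intern(&mut self, data: &[u8]) -> Rc<[u8]> {
        if let Some(existing) = self.entries.get(data).and_then(|weak| weak.upgrade()) {
            return existing;
        }
        let shared: Rc<[u8]> = Rc::from(data);
        self.entries.insert(data.to_vec(), Rc::downgrade(&shared));
        shared
    }

    /// Drops entries whose last strong handle has gone away.
    fn prune(&mut self) {
        self.entries.retain(|_, weak| weak.strong_count() > 0);
    }

    fn len(&self) -> usize {
        self.entries.len()
    }
}
```

With this design consumers can mutate properties freely; cleanup only needs an occasional prune call, for example before serialization.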