saladdais / hippolyzer Goto Github PK

View Code? Open in Web Editor NEW

13.0 13.0 5.0 5.67 MB

Intercepting proxy + analysis toolkit for Second Life compatible virtual worlds

License: GNU Lesser General Public License v3.0

Python 100.00%

analysis gridproxy proxy pyogp python second-life secondlife sl

hippolyzer's People

Contributors

Stargazers

Watchers

Forkers

trendingtechnology beyonddiana beqjanus scatterp2 callcolor

hippolyzer's Issues

Add decode-only fast path for ObjectUpdateCompressed

When flying across the mainland, the main thing occupying proxy time is decoding ObjectUpdateCompressed packets. The proxy gets a flood of them all at once whenever a new object enters draw distance, unlike other ObjectUpdates that come in at a reasonable pace. Depending on how high the network throttle is set, this can cause a lot of packets to drop and trigger re-sends. The declarative serialization API we use for describing most structures like the ObjectUpdateCompressed Data field is intentionally not heavily optimized, so it's not ideal for this case.

It would be worth writing a decode-only fast path for ObjectUpdateCompressed that the ObjectManager class could use. ObjectManager._normalize_object_update_compressed() takes a Block and returns a dict of fields to apply as an update to any existing Object, so it could be slotted in there.

Better handling of objects not known to the proxy due to viewer object cache hits

Whenever the viewer enters a sim, the sim will send cache probes with ObjectUpdateCached containing a LocalID and CRC. If the LocalID isn't cached or there's a CRC mismatch the viewer will request it with RequestMultipleObjects. Right now the proxy doesn't know about any objects that the viewer didn't request due to them being present in its own viewer object cache.

Right now we hack around this in a few places by always requesting missing objects when they're referenced in an ObjectSelect packet, but it can cause problems if you're trying to look at avatar positions and the avatar sits on an object the proxy doesn't know about. Object positions are always relative to their parent, so not knowing the parent's position makes it hard to determine the avatar's position.

Possible approaches:

Always send VOCACHE_IS_EMPTY flag in RegionHandshakeReply so cache probes don't get sent, and the sim will always send full ObjectUpdates instead. This is the least brittle, but most wasteful because the sim always has to send ObjectUpdates for objects your viewer may have cached.
Parse the viewer's vocache and pull objects into ObjectManager on ObjectUpdateCached LocalID + CRC match. Might be brittle, and I don't know if vocache supports concurrent reads & writes well. Discovering viewer cache locations is also a problem.
Have the proxy maintain it's own vocache. Least likely to cause problems with the viewer, but would involve the most code for cache bookkeeping.

LEAP bridge and associated std(in/out) -> TCP forwarding agent

SL's viewer has had automation support through a subprocess + std(in/out) communication scheme for quite a while now. Recently it's been used for the Puppetry work, but it's generally useful for a number of other things, including basic UI automation and high-level event subscription.

Using LEAP the official way is annoying because you're bound by what interfaces are exposed over LEAP, and fiddling with the scripts usually means restarting the viewer over and over to get things right. You also can't use stdin or stdout in your scripts for obvious reasons.

It'd be nice to allow proxied viewers to be automated through hot-reloaded addons or the REPL via the LEAP API. Easiest thing to do would be to make a netcat-style std(in/out) -> TCP forwarding agent. Hippolyzer could receive inbound connections from those agents and be able to control multiple LEAP-automated viewers at once. LEAP connections could be associated with Session objects through data extracted via the LEAP API.

Windows viewers load assets very slowly

Seems to either be caused by using the HTTP(S)_PROXY env vars or some weird HTTP handling code only used on Windows. Doesn't manifest when hitting Hippolyzer in a Windows VM with a Linux viewer. Should check if it still happens when manually defining the HTTP proxy in the viewer settings.

Support reliable LLUDP message sending

Would be nice to support LLUDP's reliability mechanism for our injected messages. Messages should be kept in a list with the retry count and periodically re-sent until acked or the retry limit has been reached. Need to be careful to change outgoing StartPingChecks to take our injected packets into account for the OldestUnacked field.

Generate typed message classes based on message template

These should be generated with a script and put in the repo so people can take advantage of their IDE's autocomplete writing messages. The classes should all inherit from the base Message class so un-typed access via __getitem__ is still possible and should only be a thin, typed wrapper around that API. Same thing for any Block subclasses, need to think about how specifying the unpacked form of a var's value should work for those (maybe a special wrapper class for packed values so isinstance() can be used to detect it rather than the _ suffix for names being used right now?)

Create PyInstaller or cx_Freeze Windows bundle on release events

Will have to write some code to strip out unused Qt DLLs for bluetooth and webview because they're huge.

Add support for unknown new block in AvatarAppearance

There appears to be a new block at the end of AvatarAppearance in some server versions, figure out what it is and add it to the message templates (doesn't appear to be in LL's message template?)

Improve performance

Add runtime ObjectManager conformance tests based on newview VOCache

Since we can hook into newview's object cache, we know what newview thinks the state of the scene graph is when entering the region, and when the region dies. Should take advantage of this to make sure Hippolyzer's ObjectManager has the same final result as the viewer does by re-reading the VOCache just after the viewer writes it and comparing to ObjectManager's contents.

Can probably expect some drift in position and rotation due to Hippolyzer not doing velocity interpolation, everything else should be the same.

We want Hippolyzer's object handling to be as close as possible to newview's so clients can diff their object state against Hippolyzer's to find bugs in their object handling code.

Implement (De)serialization for LayerData packets

Needed to get details about terrain, water, wind, etc. It's a nightmare format based on DCT blocks. Probably need an actual bitstream implementation to read them.

Add message log loading / saving

Depends on #19 since importing should open a new log window. Would be helpful for complex issues that span multiple messages.

For ex you could load a message log dump programmatically, then replay inbound ObjectUpdates through an ObjectManager to reconstruct the client's understanding of the scene graph at a particular point in time.

Better EventQueue event injection support

Right now an event can only be injected when we actually receive an event over the real EQ due to how we're intercepting the response. That means injected events can be delayed by up to 30 seconds. Not nice.

It might make sense to switch to a strategy where we hold onto the flow ID for event queue requests with pending server responses and preempt the server response by injecting our own response. It's not clear to me if mitmproxy has support for closing the server half of the proxied connection for an in-flight request and injecting a response, but that's the obvious choice.

Write more docs, split readme up

Teleports sometimes cause a disconnect when attempted right before Event Queue's long-polling timeout

The sim assumes that once it's sent the TeleportFinish event over the event queue, it can kill the event queue cap and that the viewer has been handed off to the new region. If viewer times out the EQ connection just as the TeleportFinish is sent, then the proxy will have read the TeleportFinish response, but the viewer won't have. This should be covered by the EventQueue's explicit acking mechanism, but it doesn't seem to work properly. It appears the server considers an event acked so long as the response bytes were sent off and immediately discards them. It should only discard messages once the viewer polls with an id that's not greater than the ack value POSTed by the viewer, but it discards them unconditionally. I'm not sure if this is intentional or if it's always been like this.

Since the viewer won't know it was sent the TeleportFinish it will keep trying to read the event queue CAP, which will never re-serve the TeleportFinish. CrossedRegion probably has the same problem, I haven't tested.

This seems to be a general problem with SL that's made worse when using an HTTP proxy, since the proxy may leave its connection to the server open and consume the event after the client timed out their connection. We can hack around that by always storing the last EQ response for a sim if there were events, along with the client's ack value in the request.

The sim's EQ implementation will need to be changed to actually make use of the ack value that gets posted and discard events that haven't been acked for this to be fully fixed.

Update docs to mention how to work around Firestorm proxy issues

Firestorm didn't pull https://bitbucket.org/lindenlab/viewer/commits/454c7f4543688126b2fa5c0560710f5a1733702e into their latest release, so manually specifying a proxy is broken. Mention LinHippoAutoProxy or the debug settings or something. I don't particularly care about CEF / Dullahan proxy support.

Add support for Python 3.11

At least the recordclass dependency is going to need an upgrade, may be others as well.

Support multiple log windows

Useful to narrow down on specific sessions per-window and compare. Would require decoupling some object ownership from the log window itself and moving it to the app in proxy_gui.py.

Rework Object handling to better map to reality

Right now objects are handled at the region level, which breaks handling of sim->sim object crossings. The proxy also assumes that whatever region the objectupdate was received from is the owner of the object it's updating, which isn't correct. indra only goes off of the RegionHandle in the body of each ObjectUpdate type, and that's what triggers object handoff. Getting an Object's global pos isn't possible either because they don't know what region they belong to.

Need to:

Store region handle on objects so they know what region they belong to and can calculate their global position
Switch to doing internal lookups by FullID. LocalID + Handle is only ever used to lookup the FullID in indra. Any update could potentially change an object's localid
Move ObjectManager to Session. That model works better for region handoff. Keep something simple on the region so lookups by localid can still be done?
Handle region handoff correctly and add tests

Prevent GUI paints from blocking proxy activity and vice-versa

Right now all code other than the mitmproxy wrapper runs in the same process and thread. This leads to issues where GUI paints can block proxy activity and vice-versa, mostly noticeable when a lot of messages are being logged at once.

On the one hand I don't normally notice the perf hit, and having everything on one thread allows writing relatively simple GUI code for addons without needing to use signals / slots (like the blueish object list.) On the other hand, a few people have told me that they disable the message log for perf reasons.

In the short term, it'd make sense to batch up additions to the message log list and only try to draw every 0.1 seconds, like we did before.

Longer-term, I'll have to come up with cross-thread implementations of the AddonManager.UI APIs using signals and slots, and figure out how to run UIs like the blueish object list in the UI thread.

LLMesh -> Collada -> LLMesh conversion code to allow for in-proxy mesh upload

This would be a good first step to writing mesh upload code totally independent of the viewer, allowing prototyping of new importer features like glTF support (via https://github.com/SaladDais/impasse,) as well as file watchers for the local mesh feature.

Having code to allow round-tripping LLMesh -> Collada -> LLMesh is probably the easiest way to ensure we haven't gotten confused about LLMesh or Collada semantics in our import code.

We should make an example .dae that uses joint positions, vertex weights, multiple materials and multiple instances of the same mesh data, and then log the LLMesh-serialized version the viewer's uploader sends off to make sure that:

Our own conversion code gives the same result as the viewer when converting the .dae to LLMesh (assuming no normal or LoD generation)
Our LLMesh upload data -> Collada converter generates a .dae that's semantically equivalent to the original input .dae

Probable bug in Circuit InjectionTracker's drop tracking

Attempting to re-send previously dropped 1523:AgentUpdate, did we ack?
Attempting to re-send previously dropped 1607:ViewerEffect, did we ack?
Attempting to re-send previously dropped 1720:ViewerEffect, did we ack?

Hippolyzer/hippolyzer/lib/proxy/circuit.py

Line 75 in a1bbfbf

# was_dropped needs the unmodified packet ID

Nothing should have dropped packets of those types, so packet IDs must be getting re-used. Likely packet IDs are getting out of sync somewhere and making the proxy think there's a resend happening.

First noticed this when using the turbo object inventory addon, so it's possible there's some faulty logic in message.take() or MessageHandler subscription stuff too.

Better handling of the various disparate inventory representations

Depending on how the inventory is requested and what is requesting it, inventory contents and events relating to them may have 5 or 6 different incompatible representations.

InventoryAPIv3: Ok, you get an LLSD object back, U32s are binary LLSD fields and type fields are strings.
FetchInventory2: Old and bad, but still used! You get a sort of similar LLSD object back except U32s aren't binary fields, they're just converted to S32 integers. Also all of the asset type / inv type fields are numeric rather than the string form of the types.
Login inventory skeleton: Pretty similar to the above, I guess, but not LLSD because it's in the XML-RPC payload.
BulkUpdateInventory, UpdateCreateInventoryItem, ...: Old templated message things that sometimes get sent as LLSD over the EQ, sometimes binary over UDP. Being that they're templated messages their structures are flat and fixed, but they're relatively easy to reason about if you remember the block names for whatever particular message.
Object inventories: Weird old proprietary textual schema similar to the one that skins and shapes use. Nasty to parse and serialize unambiguously, so other uses have moved away from it.
Inventory cache: Sort of similar to InventoryAPIv3 format, except everything is serialized in newline-delimited LLSD notation format.

Right now only AIS and legacy schema formats are supported well, with some conversion functions translating between AIS and the templated UDP stuff. Need better support for the other representations of inventory updates, and to add in-place updating of inventory models to apply them.

SOCKS5 proxying broken with official Windows viewers

Didn't notice this since I was only running the proxy in a Windows VM. As described in https://jira.secondlife.com/browse/BUG-134040 Windows viewers write broken SOCKS 5 commands on the wire and expect broken commands back. Seems to be due to struct padding differences cross-compiler when the viewer assumes structs will be un-padded and copies them directly to the wire.

I don't think this is ever getting patched and we want to support existing broken viewers. Sniffing for trailing nulls past the end of the authentication methods field in the handshake should allow us to a invoke "broken WINSOCKS mode".

Rewrite git history to add pyogp commits with correct committer field

Because I used git am to import the pyogp commits from pyogp mercurial, it left the original authors as the author of the commit but me as the commiter. Github seems to count contributions by committer, so people who'd written code for pyogp aren't showing up in the contributors sidebar.

Leaving it that way seems rude, so let's figure out a way to rebase on top of properly attributed commits. Will cause a conflict for anyone who has existing master checkouts but they can deal.