The current capture behavior is to record both the input and output/return values for an API call in a single packet, which is written to the capture file after the API call has returned. This behavior could be modified to split the input and output values into separate packets, with inputs written before the API call and outputs written after the API call.
This could be useful for capturing crashes where the crash happened in the API call due to invalid inputs. With the current behavior of logging the packet after the API call, the packet would not be written to the file in the case of a crash. Note that this is only true for API calls with return values and/or output parameters. For API calls that only have input parameters, the packet is logged before the API call.
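A minimal sketch of the split on the capture side, to show why the Input packet survives a failing call. Everything here is hypothetical and for illustration only: `capture_call`, the text-based `IN`/`OUT` packet format, and the `stream` argument are assumptions, not the actual capture layer or file format. A Python exception stands in for a crash; in a real crash the Input packet would only survive if it had actually reached the file, which is why the sketch flushes before making the call.

```python
import io

def capture_call(stream, thread_id, name, args, api_fn):
    """Write an Input packet before the call and an Output packet after it.

    The packet layout is a made-up text format, used only for illustration.
    """
    stream.write(f"IN {thread_id} {name} {args!r}\n")
    stream.flush()  # inputs must reach the file before the call can fail
    result = api_fn(*args)  # if this never returns, no Output packet is written
    stream.write(f"OUT {thread_id} {name} {result!r}\n")
    stream.flush()
    return result

# Usage: a call that succeeds produces both packets.
log = io.StringIO()
capture_call(log, 1, "add", (2, 3), lambda a, b: a + b)
```

If `api_fn` fails partway through, the stream still ends with the `IN` line for that call, so the inputs that triggered the failure are available for inspection.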
In theory, this could also be used for basic synchronization with multi-threaded replay. Packets captured in the following order would indicate that threads 1 and 2 executed API calls at roughly the same time, while thread 3 executed its API call after threads 1 and 2 completed their calls:
- Thread 1 Input
- Thread 2 Input
- Thread 2 Output
- Thread 1 Output
- Thread 3 Input
- Thread 3 Output
The replay tool would interpret this to mean that the calls from threads 1 and 2 could be executed in parallel by worker threads, but thread 3's call should not be executed until the other calls complete. Multi-threaded replay would then follow a pattern where it reads an Input packet from the file and dispatches that packet to a worker thread. It would continue reading and dispatching Input packets until it encounters an Output packet. When it encounters an Output packet, the dispatch thread would block until the associated worker thread completes its API call and commits any resulting state changes (e.g., mappings for newly created handles) before continuing to read and process packets from the capture file.
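The dispatch pattern described above could be sketched roughly as follows. This is only an illustration under assumptions: packets are modeled as hypothetical `(kind, thread_id, payload)` tuples, each captured thread has at most one call in flight, and `execute` is a placeholder for whatever replays the API call and commits its state changes. A thread pool stands in for the worker threads.

```python
from concurrent.futures import ThreadPoolExecutor

def replay_multithreaded(packets, execute, max_workers=4):
    """Dispatch Input packets to workers; block on each Output packet.

    packets: iterable of (kind, thread_id, payload), kind is "in" or "out".
    execute: hypothetical callable (thread_id, payload) that replays one call.
    """
    in_flight = {}   # thread_id -> Future for a dispatched, unconfirmed call
    completed = []   # (thread_id, result) in the order Output packets were read
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for kind, thread_id, payload in packets:
            if kind == "in":
                # Keep reading and dispatching while only Input packets arrive.
                in_flight[thread_id] = pool.submit(execute, thread_id, payload)
            else:
                # Output packet: wait for the associated call to finish
                # before reading any further packets.
                completed.append((thread_id, in_flight.pop(thread_id).result()))
    return completed

# The packet order from the example list above.
packets = [("in", 1, "a"), ("in", 2, "b"), ("out", 2, None),
           ("out", 1, None), ("in", 3, "c"), ("out", 3, None)]
events = []  # list.append is atomic in CPython, so workers can share it

def execute(thread_id, payload):
    events.append(("start", thread_id))
    events.append(("end", thread_id))
    return payload

completed = replay_multithreaded(packets, execute)
```

With this packet order, the calls for threads 1 and 2 may overlap, while thread 3's call is not submitted until both earlier calls have finished, because the dispatcher blocks on their Output packets first.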
This approach has some downsides: every API call, even one without return values or output parameters, would need an Output packet, and single-threaded replay becomes more complicated. For single-threaded replay, Input packets would need to be queued until the associated Output packet is read. To maintain the proper API call order, queue entries can only be processed when the item at the front of the queue is complete.
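The queueing rule for single-threaded replay might look something like the sketch below. It uses the same hypothetical `(kind, thread_id, payload)` packet tuples and assumes at most one pending call per captured thread; `execute` is again a placeholder for the code that replays a call once both halves are available.

```python
from collections import deque

def replay_single_threaded(packets, execute):
    """Queue Input packets; process entries only from the front of the queue.

    Returns the thread ids in the order their calls were executed.
    """
    pending = deque()  # entries in the order their Input packets were read
    by_thread = {}     # thread_id -> queued entry still waiting for its Output
    order = []
    for kind, thread_id, payload in packets:
        if kind == "in":
            entry = {"thread": thread_id, "input": payload,
                     "output": None, "complete": False}
            pending.append(entry)
            by_thread[thread_id] = entry
        else:
            entry = by_thread.pop(thread_id)
            entry["output"] = payload
            entry["complete"] = True
            # Drain only while the front entry is complete, so calls run
            # in the order their Input packets appeared.
            while pending and pending[0]["complete"]:
                done = pending.popleft()
                execute(done["thread"], done["input"], done["output"])
                order.append(done["thread"])
    return order

packets = [("in", 1, "a"), ("in", 2, "b"), ("out", 2, "rb"),
           ("out", 1, "ra"), ("in", 3, "c"), ("out", 3, "rc")]
calls = []
order = replay_single_threaded(
    packets, lambda tid, inp, out: calls.append((tid, inp, out)))
```

In this run, thread 2's entry is complete first but stays queued behind thread 1's entry; both are processed once thread 1's Output packet is read, which preserves the Input order.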
Looks like Dustin did some work on this: dev...dustin-lunarg:gfxreconstruct:dustin_separate_in_out_packets