pdblister's Introduction

Summary

This is a tiny unofficial project meant to be a quick alternative to symchk for miscellaneous tasks, such as generating manifests and downloading symbols. It mimics the symchk invocation symchk /om manifest /r <path>, but only looks for MZ/PE files.

symchk does some odd things that often cause it to crash or get stuck in infinite loops, so this tool is a stricter (and much faster) alternative.

The output manifest is compatible with symchk. If you want to use symchk in lieu of this tool, use symchk /im manifest /s <symbol path>

⚠️ Note: This tool is unstable! The CLI interface may change at any point, without warning. If you need programmatic stability (e.g. for automation), please pin your install to a specific revision.

Quick Start

# On your target
> cargo run --release -- manifest C:\Windows\System32

# On an online machine
> cargo run --release -- download SRV*C:\Symbols*https://msdl.microsoft.com/download/symbols

Downloading a single PDB file

> cargo run --release -- download_single SRV*C:\Symbols*https://msdl.microsoft.com/download/symbols C:\Windows\System32\notepad.exe

Future

Randomizing the order of the files in the manifest would make downloads more consistent by removing any filesystem locality bias from the file order.

Deduping the files in the manifests could also help, but this isn't a big deal.

We could potentially offer a symchk-compatible subcommand: #5

A "server mode" could be implemented so that other tools written in different languages could take advantage of our functionality: #7
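
The first two ideas in this list are small enough to sketch. The following is a minimal illustration, not pdblister's actual code: it treats each manifest entry as an opaque line, and the function names `dedup_manifest` and `shuffle_manifest` are hypothetical. The shuffle uses a tiny xorshift generator only to keep the example free of dependencies; a real implementation would more likely use the `rand` crate.

```rust
use std::collections::HashSet;

/// Drop blank and repeated manifest lines, keeping the first occurrence
/// of each entry in its original position.
fn dedup_manifest(lines: &[&str]) -> Vec<String> {
    let mut seen = HashSet::new();
    let mut unique = Vec::new();
    for line in lines {
        let entry = line.trim();
        // `insert` returns false when the entry was already present.
        if !entry.is_empty() && seen.insert(entry) {
            unique.push(entry.to_string());
        }
    }
    unique
}

/// Shuffle entries in place with a Fisher-Yates pass driven by a
/// xorshift64 generator (a hypothetical stand-in for a proper RNG).
fn shuffle_manifest(entries: &mut [String], seed: u64) {
    // xorshift must not start from zero, so force the state to be odd.
    let mut state = seed | 1;
    for i in (1..entries.len()).rev() {
        state ^= state << 13;
        state ^= state >> 7;
        state ^= state << 17;
        // Pick a partner index in 0..=i (the slight modulo bias is acceptable here).
        let j = (state % (i as u64 + 1)) as usize;
        entries.swap(i, j);
    }
}
```

Deduplicating before shuffling keeps the shuffle from spending time on entries that would be skipped anyway.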

Performance

This tool tries to do everything in memory if it can. It lists all files first and then does all the parsing. Parsing performs random accesses to files without memory-mapping them, which could be improved, but it doesn't really seem to be an issue: the random access only occurs once a file has a valid MZ and PE header.
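
The MZ and PE check mentioned above can be sketched as follows. This illustrates the standard PE layout (the `MZ` magic at offset 0, the 32-bit little-endian `e_lfanew` field at offset 0x3C, and the `PE\0\0` signature it points to) rather than pdblister's actual parser; the function name `looks_like_pe` is hypothetical.

```rust
/// Return true if `data` starts with a DOS "MZ" header whose `e_lfanew`
/// field points at a "PE\0\0" signature that lies inside the buffer.
fn looks_like_pe(data: &[u8]) -> bool {
    // The DOS header is 64 bytes long; `e_lfanew` occupies its last 4 bytes.
    if data.len() < 0x40 || !data.starts_with(b"MZ") {
        return false;
    }
    let e_lfanew =
        u32::from_le_bytes([data[0x3C], data[0x3D], data[0x3E], data[0x3F]]) as usize;
    // `get` returns None when the range runs past the end of the buffer.
    match e_lfanew.checked_add(4) {
        Some(end) => data.get(e_lfanew..end) == Some(&b"PE\0\0"[..]),
        None => false,
    }
}
```

A cheap check like this lets a scanner skip non-executable files after reading only the header bytes.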

It also generates the manifest in memory and writes it out in one go. Writing the manifest is one large bottleneck in the original symchk.

For downloads, it then chomps through a manifest file asynchronously, fetching up to 16 files at the same time! The original symchk only peaks at about 3-4 Mbps of network usage, but this tool saturates my internet connection at 400 Mbps.
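
The bounded fan-out described above (at most 16 downloads in flight) can be illustrated without any async runtime. The sketch below swaps in scoped OS threads for the tool's asynchronous tasks, purely to keep the example dependency-free; `run_bounded` and its parameters are hypothetical names, and the closure passed as `work` stands in for a real download.

```rust
use std::sync::Mutex;
use std::thread;

/// Apply `work` to every item using at most `limit` worker threads,
/// returning the results in the same order as the input.
fn run_bounded<T, R, F>(items: Vec<T>, limit: usize, work: F) -> Vec<R>
where
    T: Send,
    R: Send,
    F: Fn(T) -> R + Sync,
{
    let total = items.len();
    // Workers pull (index, item) pairs from a shared queue.
    let queue = Mutex::new(items.into_iter().enumerate());
    let results = Mutex::new(Vec::with_capacity(total));
    // Never start more workers than there are items (but at least one).
    let workers = limit.max(1).min(total.max(1));

    thread::scope(|scope| {
        for _ in 0..workers {
            scope.spawn(|| loop {
                // Hold the queue lock only long enough to take one item.
                let next = queue.lock().unwrap().next();
                match next {
                    Some((index, item)) => {
                        let result = work(item);
                        results.lock().unwrap().push((index, result));
                    }
                    None => break,
                }
            });
        }
    });

    // Restore input order, since workers finish in arbitrary order.
    let mut results = results.into_inner().unwrap();
    results.sort_by_key(|(index, _)| *index);
    results.into_iter().map(|(_, result)| result).collect()
}
```

For example, `run_bounded(entries, 16, download_one)` would keep at most 16 calls to a hypothetical `download_one` function running at once.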

pdblister's Issues

Add a `pdbstore` command

This command should closely echo the filestore command - given a directory, search it for all PDBs and organize them into a symsrv-compatible format.

> pdblister pdbstore C:\Dev\Build C:\Symbols
pdbstore <filepath> <storepath>
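
As a rough sketch of the target layout, the destination of each PDB in a symsrv-style store has the shape `<store>/<pdb name>/<identifier>/<pdb name>`, as in the `C:\Symbols\abc.pdb\ABC123456789\abc.pdb` path shown elsewhere on this page. The helper below is hypothetical and takes the identifier as an opaque string; extracting it from the PDB itself is out of scope here.

```rust
use std::path::{Path, PathBuf};

/// Build `<store>/<pdb name>/<identifier>/<pdb name>`.
/// `identifier` is treated as an opaque string supplied by the caller.
fn store_destination(store: &Path, pdb_name: &str, identifier: &str) -> PathBuf {
    store.join(pdb_name).join(identifier).join(pdb_name)
}
```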

Update the README

The README currently does not reflect all the functionality offered by pdblister.
Either update the documentation or remove it (pointing to the --help flag).

Default local cache path (short-form server string)

Symchk supports specifying just a symbol server URL, with the expectation that it will download symbols from that URL and cache them to a pre-determined cache path.

For example, the "server string" https://msdl.microsoft.com/download/symbols will get implicitly expanded to SRV**https://msdl.microsoft.com/download/symbols, which is expanded to SRV*C:\ProgramData\dbg\sym*https://msdl.microsoft.com/download/symbols.
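
A minimal sketch of that expansion, assuming only the three shapes named above (a bare URL, `SRV**<url>`, and `SRV*<cache>*<url>`). The function name `expand_server_string` is hypothetical, the default cache path is the one quoted in this issue, and no validation of the URL itself is attempted.

```rust
/// Default cache directory quoted in the issue text.
const DEFAULT_CACHE: &str = r"C:\ProgramData\dbg\sym";

/// Expand a short-form server string into `SRV*<cache>*<url>`.
/// Returns None for shapes this sketch does not recognize.
fn expand_server_string(input: &str) -> Option<String> {
    let parts: Vec<&str> = input.split('*').collect();
    let (cache, url) = match parts.as_slice() {
        // Bare URL with no `SRV*` prefix at all.
        [url] => ("", *url),
        // `SRV*<cache>*<url>`, where <cache> may be empty (`SRV**<url>`).
        [prefix, cache, url] if prefix.eq_ignore_ascii_case("SRV") => (*cache, *url),
        _ => return None,
    };
    // An empty cache component falls back to the default cache path.
    let cache = if cache.is_empty() { DEFAULT_CACHE } else { cache };
    Some(format!("SRV*{}*{}", cache, url))
}
```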

Generate a manifest from a VHD/VHDX file

Add a new command that should:

  1. Mount a VHD
  2. Generate a manifest from its contents on some partition, usually the Windows partition. Maybe add path filter options to keep the manifest reasonable.

Add JSON output for use in automation

Add an --output-format json option that formats and prints our output as JSON for use in automation.

For example,

pdblister download_single SRV*...*... my.exe --output-format json
{"status": "success", "path": "C:\\Path\\to\\sym\\my.pdb\\ABC123\\my.pdb"}
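
One detail worth getting right is escaping: Windows paths contain backslashes, which must be doubled inside JSON strings. The sketch below builds the line shown above by hand to stay dependency-free; a real implementation would more likely use a JSON library, and the names `json_escape` and `success_line` are hypothetical.

```rust
/// Escape a string for embedding inside a JSON string literal.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len() + 2);
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters use the \uXXXX form.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Format the success object from the example above for a given path.
fn success_line(path: &str) -> String {
    format!(
        "{{\"status\": \"success\", \"path\": \"{}\"}}",
        json_escape(path)
    )
}
```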

Add a server mode for pdblister

We should investigate the usefulness of adding a server mode to pdblister, such that we will listen for JSON requests on stdin and respond to them on stdout.

For example, someone could request that we download a PDB (or return an existing one) via the following flow (-->: stdin, <--: stdout):

--> {"id": 0, "req": "download", "type": "pdb", "name": "abc.pdb", "hash": "ABC123456789", "age": 0}
<-- {"id": 0, "req": "progress", "percent": 50}
<-- {"id": 0, "req": "progress", "percent": 90}
<-- {"id": 0, "req": "download", "status": "success", "path": "C:\\Symbols\\abc.pdb\\ABC123456789\\abc.pdb"}

This would make pdblister a useful tool for non-Rust languages as well, since any language can launch a program and speak JSON.
This will also allow me to lazily sidestep having to do a non-async implementation for Rust libraries that wish to maintain a minimal build profile.

Support `file.ptr` redirection files

As per the documentation, PDBs may not be stored directly inside the symbol server.

We should adjust the search algorithm to search for the following (with an example target being ntkrnlmp.exe):

  • ntkrnlmp.pd_ (compressed)
  • ntkrnlmp.pdb (original)
  • file.ptr (redirect to URL in contents)
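
The candidate list above can be sketched as a small helper. This is an illustration only: the function name `candidate_file_names` is hypothetical, and the order simply follows the list in this issue.

```rust
/// Return the file names to probe for a given PDB, in the order listed
/// above: compressed name, original name, then the redirection file.
fn candidate_file_names(pdb_name: &str) -> Vec<String> {
    let mut compressed = pdb_name.to_string();
    // The compressed variant replaces the final character with '_'.
    if compressed.pop().is_some() {
        compressed.push('_');
    }
    vec![compressed, pdb_name.to_string(), "file.ptr".to_string()]
}
```

Each name would be requested under the same per-PDB directory on the server, and the first one that exists would be used.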

Support symchk `/ip` option (generate manifest from running PID)

This mode is pretty convenient when you want to debug/analyze a single process.
Rather than downloading symbols for an entire directory (usually C:\Windows\System32), I'd rather simply synchronize symbols for the process and be done.

This would probably require us to change the parameters for the download command and break whatever backwards compatibility there may be (or implicitly default to /r mode if no flag is specified).

Reference: https://docs.microsoft.com/en-us/windows-hardware/drivers/debugger/symchk-command-line-options

Verify downloaded PDBs

See if ADO or classic symbol servers return any sort of file hash for PDBs.
If they do, use it to verify downloaded PDBs and re-download if local PDB hash does not match remote's hash.

Refactor command-line options to provide some level of compatibility with symchk

This will make it easier to use this tool as a drop-in replacement for symchk down the line.

Input options:

  • /if <Filename>
  • /im <Manifest>

PDB options:

  • /pa Public and private symbols
  • /pf Verify that PDBs contain full source information
  • /ps Verify that PDBs are stripped
  • /pt Verify that PDBs are stripped but contain type information

Output options:

  • /om <Manifest>

Investigate/support Azure DevOps MI/SP auth

ADO is migrating from PAT authentication to managed identity / service principal auth.

In order to support this, we would need to have a way to capture a bearer token from a caller and pass it on to any request we make to Azure DevOps.
See if there's a way to do it by encoding the bearer token into the URL. If not, find a standardized way to allow users to pass in the bearer token to us.

Add an option to ensure downloaded PDBs are stripped or non-stripped

Analogous to symchk's PDB options:

  • /pa Public and private symbols
  • /pf Verify that PDBs contain full source information
  • /ps Verify that PDBs are stripped
  • /pt Verify that PDBs are stripped but contain type information

Notably, symchk does not check PDBs that are already cached. We should diverge and check those PDBs too, and re-download them if they do not comply with the options specified.

Waiting on: getsentry/pdb#143

Unclear error message presented if server URL is empty

Running `target\debug\pdblister.exe download srv*D:\Symbols*`
Original manifest has 1 PDBs
Deduped manifest has 1 PDBs
Failed: failed to connect to server SRV*D:\Symbols*: relative URL without a base

This occurs sometimes in our automation if the server URL resolves to an empty string due to an unset environment variable.
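
One way to surface a clearer message is to validate the server string before any URL parsing happens. The sketch below is a hypothetical illustration, not the tool's current code: the error type, its messages, and the function name `split_server_string` are all assumptions.

```rust
use std::fmt;

/// Reasons a server string can be rejected before any network access.
#[derive(Debug, PartialEq)]
enum ServerStringError {
    NotSrvForm,
    EmptyUrl,
}

impl fmt::Display for ServerStringError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::NotSrvForm => {
                write!(f, "expected a server string of the form SRV*<cache>*<url>")
            }
            Self::EmptyUrl => {
                write!(f, "server URL is empty; check that the value it comes from is set")
            }
        }
    }
}

/// Split `SRV*<cache>*<url>` into (cache, url), rejecting an empty URL.
fn split_server_string(input: &str) -> Result<(&str, &str), ServerStringError> {
    let mut parts = input.splitn(3, '*');
    let prefix = parts.next().unwrap_or("");
    if !prefix.eq_ignore_ascii_case("SRV") {
        return Err(ServerStringError::NotSrvForm);
    }
    let cache = parts.next().ok_or(ServerStringError::NotSrvForm)?;
    let url = parts.next().ok_or(ServerStringError::NotSrvForm)?;
    if url.trim().is_empty() {
        return Err(ServerStringError::EmptyUrl);
    }
    Ok((cache, url))
}
```

With a check like this, the `srv*D:\Symbols*` input from the report would fail with the empty-URL message instead of a URL parser error.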
