protoscope's Introduction

Protoscope

Protobuf + Rotoscope

Protoscope is a simple, human-editable language for representing and emitting the Protobuf wire format. It is inspired by, and is significantly based on, DER ASCII, a similar tool for working with DER and BER, wire formats of ASN.1.

Unlike most Protobuf tools, it is normally ignorant of schemata specified in .proto files; it has just enough knowledge of the wire format to provide primitives for constructing messages (such as field tags, varints, and length prefixes). A disassembler is included that uses heuristics to try to convert encoded Protobuf into Protoscope, although the heuristics are necessarily imperfect.

We provide the Go package github.com/protocolbuffers/protoscope, as well as the protoscope tool, which can be installed with the Go tool via

go install github.com/protocolbuffers/protoscope/cmd/protoscope...@latest

go install will place the binary in the GOBIN directory, which is ~/go/bin by default. See the docs for go install for more details.

For the language specification and basic examples, see language.txt. Example disassembly can be found under ./testdata.

Cookbook

Protoscope can be used in a number of different ways to inspect or create binary Protobuf data. This isn't the full breadth of use cases, but these are the ones Protoscope (and its ancestor, DER ASCII) were designed for.

Exploring Binary Dumps

Sometimes, while working on a library that emits wire format, it may be necessary to debug the precise output of a test failure. If your test prints out a hex string, you can use the xxd command to turn it into raw binary data and pipe it into protoscope.

Consider the following example of a message with a google.protobuf.Any field:

$ cat hexdata.txt
0a400a26747970652e676f6f676c65617069732e636f6d2f70726f746f332e546573744d65737361676512161005420e65787065637465645f76616c756500000000
$ xxd -r -ps hexdata.txt | protoscope
1: {
  1: {"type.googleapis.com/proto3.TestMessage"}
  2: {`1005420e65787065637465645f76616c756500000000`}
}
$ xxd -r -ps <<< "1005420e65787065637465645f76616c756500000000" | protoscope
2: 5
8: {"expected_value"}
`00000000`

This reveals that four zero bytes sneaked into the output!
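
For readers who want to see why the trailing bytes are reported separately, here is a minimal Go sketch that walks the same inner payload by hand. It is not how protoscope is implemented; the walk function is hypothetical and handles only varint and length-prefixed fields. It stops at the first tag whose field number is 0, which is not a valid field number.

```go
package main

import (
	"encoding/binary"
	"encoding/hex"
	"fmt"
)

// walk decodes top-level varint (wire type 0) and length-prefixed (wire type 2)
// fields until it reaches something it cannot parse. It returns a description
// of each decoded field and the unparsed remainder.
func walk(b []byte) (fields []string, rest []byte) {
	for len(b) > 0 {
		tag, n := binary.Uvarint(b)
		if n <= 0 || tag>>3 == 0 {
			return fields, b // Malformed tag, or field number 0.
		}
		body := b[n:]
		switch tag & 7 {
		case 0: // Varint.
			v, m := binary.Uvarint(body)
			if m <= 0 {
				return fields, b
			}
			fields = append(fields, fmt.Sprintf("%d: %d", tag>>3, v))
			b = body[m:]
		case 2: // Length-prefixed.
			l, m := binary.Uvarint(body)
			if m <= 0 || l > uint64(len(body)-m) {
				return fields, b
			}
			fields = append(fields, fmt.Sprintf("%d: {%q}", tag>>3, body[m:m+int(l)]))
			b = body[m+int(l):]
		default:
			return fields, b
		}
	}
	return fields, nil
}

func main() {
	data, err := hex.DecodeString("1005420e65787065637465645f76616c756500000000")
	if err != nil {
		panic(err)
	}
	fields, rest := walk(data)
	for _, f := range fields {
		fmt.Println(f)
	}
	fmt.Printf("`%x`\n", rest)
}
```

Run as written, this prints the two fields followed by the four leftover zero bytes, mirroring the disassembly above.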

If your test failure output is made up of C-style escapes and text, the printf command can be used instead of xxd:

$ printf '\x10\x05B\x0eexpected_value\x00\x00\x00\x00' | protoscope
2: 5
8: {"expected_value"}
`00000000`
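
If printf is not available, the same conversion can be approximated in Go. The sketch below assumes the escaped text uses only escapes that Go string literals also understand (such as \xNN, \n, and \t) and contains no unescaped double quotes; the unescape helper is hypothetical.

```go
package main

import (
	"fmt"
	"strconv"
)

// unescape interprets C-style escapes by wrapping the text in double quotes
// and parsing it as a Go string literal. Go's \x escape takes exactly two hex
// digits, so `\x05B` is the byte 0x05 followed by the letter B.
func unescape(s string) ([]byte, error) {
	out, err := strconv.Unquote(`"` + s + `"`)
	return []byte(out), err
}

func main() {
	raw, err := unescape(`\x10\x05B\x0eexpected_value\x00\x00\x00\x00`)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%x\n", raw) // 1005420e65787065637465645f76616c756500000000
}
```

The resulting bytes can be written to a file or to standard output and piped into protoscope as before.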

The protoscope command has many flags for refining the heuristic used to decode the binary.

If an encoded FileDescriptorSet proto is available that contains your message's type, you can use it to get schema-aware decoding:

$ cat hexdata.txt
086510661867206828d20130d4013d6b000000416c000000000000004d6d000000516e000000000000005d0000de42610000000000005c40680172033131357a0331313683018801758401
$ xxd -r -ps hexdata.txt | protoscope \
  -descriptor-set path/to/fds.pb -message-type unittest.TestAllTypes \
  -print-field-names
1: 101        # optional_int32
2: 102        # optional_int64
3: 103        # optional_uint32
4: 104        # optional_uint64
5: 105z       # optional_sint32
6: 106z       # optional_sint64
7: 107i32     # optional_fixed32
8: 108i64     # optional_fixed64
9: 109i32     # optional_sfixed32
10: 110i64    # optional_sfixed64
11: 111.0i32  # optional_float, 0x42de0000i32
12: 112.0     # optional_double, 0x405c000000000000i64
13: true      # optional_bool
14: {"115"}   # optional_string
15: {"116"}   # optional_bytes
16: !{        # optionalgroup
  17: 117     # a
}
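
The z suffix and the hex comments on the float fields in the listing above correspond to standard Protobuf encodings. As a rough illustration (not Protoscope's code), the following Go sketch decodes the bytes for fields 5, 11, and 12 of the dump by hand; zigzagDecode is a hypothetical helper name.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math"
)

// zigzagDecode maps the unsigned varint used by sint32/sint64 fields back to
// a signed value: 0 -> 0, 1 -> -1, 2 -> 1, 3 -> -2, and so on.
func zigzagDecode(u uint64) int64 {
	return int64(u>>1) ^ -int64(u&1)
}

func main() {
	// Field 5 appears in the dump as `28 d2 01`: tag 0x28, then varint d2 01.
	raw, _ := binary.Uvarint([]byte{0xd2, 0x01})
	fmt.Println(raw, zigzagDecode(raw)) // 210 105

	// Field 11 appears as `5d 00 00 de 42`: tag 0x5d, then a little-endian
	// 32-bit value that is reinterpreted as an IEEE 754 float.
	bits32 := binary.LittleEndian.Uint32([]byte{0x00, 0x00, 0xde, 0x42})
	fmt.Println(math.Float32frombits(bits32)) // 111

	// Field 12 appears as `61 00 00 00 00 00 00 5c 40`: tag 0x61, then a
	// little-endian 64-bit value reinterpreted as a double.
	bits64 := binary.LittleEndian.Uint64([]byte{0, 0, 0, 0, 0, 0, 0x5c, 0x40})
	fmt.Println(math.Float64frombits(bits64)) // 112
}
```

Without a schema, a disassembler cannot know whether the varint 210 was meant as 210 or as the zigzag value 105, which is why the descriptor set makes the output more precise.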

You can get an encoded FileDescriptorSet by invoking

protoc -Ipath/to/imported/protos -o my_fds.pb my_proto.proto

Modifying Existing Files

Suppose that we have a proto file foo.bin of unknown schema:

$ protoscope foo.bin
1: 42
2: {
  42: {"my awesome proto"}
}

Modifying the embedded string with a hex editor is very painful, because it's possible that the length prefix needs to be updated, which can lead to the length prefix on outer messages needing to be changed as well. This is made worse by length prefixes being varints, which may grow or shrink and feed into further outer length prefix updates.

But protoscope makes this into a simple disassemble, edit, reassemble loop:

$ xxd foo.bin
00000000: 082a 1213 d202 106d 7920 6177 6573 6f6d  .*.....my awesom
00000010: 6520 7072 6f74 6f                        e proto

$ protoscope foo.bin > foo.txt  # Disassemble.
$ cat foo.txt
1: 42
2: {
  42: {"my awesome proto"}
}

$ vim foo.txt  # Make some edits.
$ cat foo.txt
1: 43
2: {
  42: {"my even more awesome awesome proto"}
}

$ protoscope -s foo.txt > foo.bin  # Reassemble.
$ xxd foo.bin
00000000: 082b 1225 d202 226d 7920 6576 656e 206d  .+.%.."my even m
00000010: 6f72 6520 6177 6573 6f6d 6520 6177 6573  ore awesome awes
00000020: 6f6d 6520 7072 6f74 6f                   ome proto
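
To make the cascading length prefixes concrete, here is a small Go sketch that assembles the same two files by hand. It is only an illustration of the bookkeeping that the reassembly step handles for you; the helpers lenField and build are hypothetical.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// lenField encodes a length-prefixed field: tag, varint length, payload.
func lenField(field uint64, payload []byte) []byte {
	b := binary.AppendUvarint(nil, field<<3|2)
	b = binary.AppendUvarint(b, uint64(len(payload)))
	return append(b, payload...)
}

// build assembles the message `1: n  2: { 42: {s} }`.
func build(n uint64, s string) []byte {
	b := binary.AppendUvarint([]byte{0x08}, n) // 0x08 is the tag for field 1, varint.
	inner := lenField(42, []byte(s))
	return append(b, lenField(2, inner)...)
}

func main() {
	// Only the first seven bytes differ in interesting ways: the value of
	// field 1, the outer length prefix, and the inner length prefix.
	fmt.Printf("% x\n", build(42, "my awesome proto")[:7])
	// 08 2a 12 13 d2 02 10
	fmt.Printf("% x\n", build(43, "my even more awesome awesome proto")[:7])
	// 08 2b 12 25 d2 02 22
}
```

Lengthening the string changed both the inner prefix (0x10 to 0x22) and the outer one (0x13 to 0x25). Once a payload reaches 128 bytes, its prefix grows to two bytes, which in turn shifts every enclosing length.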

The -message-type option from above can be used when you know the schema to make it easier to find specific fields.

Describing Invalid Binaries

Because Protoscope has a very weak understanding of Protobuf, it can be used to create invalid encodings to verify that some invariant is actually checked by a production parser.

For example, the following Protoscope text can be used to create a test that ensures a too-long length prefix is rejected as invalid.

1: {
  2:LEN 5   # Explicit length prefix.
    "oops"  # One byte too short.
}

This is more convenient than typing out bytes by hand, because Protoscope takes care of tedious details like length prefixes, varint encoding, float encoding, and other things not relevant to the test. It also permits comments, which can be used to specify why the Protoscope snippet produces a broken binary.
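
As a sketch of the kind of check such test data is meant to exercise, the Go code below rejects a length prefix that runs past the end of its input. The byte string is hand-assembled on the assumption that the Protoscope text above encodes to 0a 06 12 05 followed by "oops"; readLen and errTruncated are hypothetical names, not part of any Protobuf library.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

var errTruncated = errors.New("length prefix exceeds remaining input")

// readLen reads one length-prefixed field from b. It returns the payload and
// the bytes after it, or errTruncated if the declared length does not fit.
func readLen(b []byte) (payload, rest []byte, err error) {
	_, n := binary.Uvarint(b) // Skip the tag; this sketch does not inspect it.
	if n <= 0 {
		return nil, nil, errTruncated
	}
	b = b[n:]
	l, n := binary.Uvarint(b)
	if n <= 0 || l > uint64(len(b)-n) {
		return nil, nil, errTruncated
	}
	return b[n : n+int(l)], b[n+int(l):], nil
}

func main() {
	// Assumed encoding of the snippet: outer field 1 with a correct length of
	// 6, wrapping field 2 that claims 5 bytes but carries only 4.
	bad := []byte{0x0a, 0x06, 0x12, 0x05, 'o', 'o', 'p', 's'}

	inner, _, err := readLen(bad)
	fmt.Println(len(inner), err) // 6 <nil>

	_, _, err = readLen(inner)
	fmt.Println(err) // length prefix exceeds remaining input
}
```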

Protoscope itself generates test data using Protoscope, which is then checked in. Other projects can either check in binary data directly, or use the build system to invoke protoscope, such as with a Bazel genrule().

Backwards Compatibility

The Protoscope language itself may be extended over time, but the intention is for extensions to be backwards-compatible. Specifically:

  • The command-line interface to protoscope will remain compatible, though new options may be added in the future.

  • Previously valid Protoscope will remain valid and produce the same output. In particular, checking in test data as Protoscope text should be future-proof.

  • Previously invalid Protoscope may become valid in the future if the language is extended.

  • Disassembly is necessarily a heuristic, so its output may change over time, but it is guaranteed to produce Protoscope output that will reassemble to the original byte string. protoscope | protoscope -s is always equivalent to cat.

Disclaimer

This is not an official Google project.

protoscope's People

Contributors

ericsalo, mcy, protobuf-github-bot


protoscope's Issues

Install instructions don't work

$ go version
go version go1.15.15 linux/amd64
$ go install github.com/protocolbuffers/protoscope/cmd/protoscope...@latest
go: warning: "github.com/protocolbuffers/protoscope/cmd/protoscope...@latest" matched no packages

Make available as static binary

Is it possible to make protoscope available as a binary for the common platforms (at least Linux)? For anyone not using Go as their language, it's rather unfriendly to require installing the whole language toolchain.

Thanks!

Bind

Hello, is there any way to bind the output to a struct?

Don't assume bytes type

using this file:

https://github.com/protocolbuffers/protoscope/files/14941355/hello.dmp

I get this result:

> protoscope hello.dmp
1: {12: 9.155902e11i32}   # 0x53552d6ei32

while with other methods I get this result:

protobuf.Message{
   protobuf.Field{Number:1, Type:2, Value:protobuf.Bytes("en-US")},
   protobuf.Field{Number:1, Type:2, Value:protobuf.Message{
      protobuf.Field{Number:12, Type:5, Value:protobuf.Fixed32(1398091118)},
   }},
}

I think it would be useful if protoscope didn't assume a type and instead returned the bytes as-is, and additionally as an embedded message if they parse as one.

Hello, how can I restore the .proto? I hope you can help me, thanks
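
One way to see why the two tools disagree: the five bytes of "en-US" are valid UTF-8 text and, at the same time, a well-formed message with a single fixed32 field. The Go sketch below demonstrates this with the standard library only; asSingleFixed32 is a hypothetical helper, and nothing here reflects how protoscope actually chooses between the two readings.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"unicode/utf8"
)

// asSingleFixed32 reports whether b is exactly one I32 field (wire type 5):
// a tag with a nonzero field number followed by four bytes.
func asSingleFixed32(b []byte) (field uint64, value uint32, ok bool) {
	tag, n := binary.Uvarint(b)
	if n <= 0 || tag&7 != 5 || tag>>3 == 0 || len(b)-n != 4 {
		return 0, 0, false
	}
	return tag >> 3, binary.LittleEndian.Uint32(b[n:]), true
}

func main() {
	payload := []byte("en-US")

	// Reading 1: a plain string.
	fmt.Println(utf8.Valid(payload)) // true

	// Reading 2: 'e' is 0x65, i.e. field 12 with wire type 5, followed by the
	// little-endian bytes 6e 2d 55 53.
	fmt.Println(asSingleFixed32(payload)) // 12 1398091118 true
}
```

Both readings are consistent with the wire format, so any schema-less disassembler has to pick one by heuristic.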

hexdata.txt
08071093F4FF8E8FEE8DA36F18B84520082A150A0D636F6D70726573735F747970651204677A69702AA7020A0F696D2D696E7465726E616C5F657874129302696E7465726E616C5F7372633A707573687365727665727C6E6578745F637572736F723A742D313636383334363333353134395F722D373136353439323934323933313531383734395F642D315F752D315F682D315F6368642D5F6368752D5F7264632D317C7773735F707573685F726F6F6D5F69643A373136353434353939353034323534303332317C7773735F707573685F6469643A373131373233383836393531393137333136317C7773735F707573685F6C6F675F69643A383031383135373134343831333539393235317C7773735F66657463685F6D733A313636383334363333353036337C7773735F707573685F6D733A313636383334363333353134397C7773735F6D73675F747970653A722A4E0A09696D2D637572736F721241742D313636383334363333353134395F722D373136353439323934323933313531383734395F642D315F752D315F682D315F6368642D5F6368752D5F7264632D312A170A06696D2D6E6F77120D313636383334363333353134392A190A0E696D2D6C6976655F637572736F721207642D315F752D31320270623A036D736742CD0C1F8B08000000000000FFEC996D6C1C4719C7BD7BE7F83238C659DE8EF081C39110059D3DB3EF7B122F7612C7717C8D733E276E041DCDEDCEDDADEF6EF7B2BB77675F52845028980071F9108908439A8AA60A82A68E2AA586D22A6D0444F005099080482DCAE5DC22BE54CD4724743EC779BBA6711C43D5EE7E399D66E679669EF9CF3EBF7D06FC43011FDD4F533A71BD382DA4A813A7AE4B32947B41018B5CEBB6EEEF1D3F72F16FBFFAF5793DFCE4F1C77FF35CEDB9F33A64067EC9818FE5CD32C58E6D1730B53CEA60CF26AEC77DF2108C955CEA3C16B93AF7B32B0B8F1F4231D7734C2BF3D8962EB0716B4A4D2F3D91D940CF1B9B431FE23A41C75675DB0E657030321B38C3FC7933F8D3E650EDE8E5B77FC0761FBB367B32B0A53DD77822CCF06150CD7A5ED18DF5F51511EA35ECD2B469154DBD57B70B7DA4420BB40F413885206CFE8B9232F188D3E7D96E54B7A2A4EC4521441246BC64488452216D286949D2D322510824484784F01A85BD93459AF972DAB10B5F1421829280A0A09E61AA60BCE1BEE15D8E569AB1BA6D126621D3B7DCD2D70802CE38C4A0384FCB348FCB1246B0B76865BEEE15F3E5A89D9AEC350B2443C3DD910864063E15063D9F58FCFE538B6766EABF98A92F9C46B0BE70BA76F468FDE727E7998ED02CD3FDD930B3C0BCCA407082F9DABA4EE61483C1FEEBF6857BB0BF146F5C24D3CB4EF8298C2046622B0F
9D914EC85C606A00FCA51D6C5C7C76AE76F689FA8923DDED61B6E7C57670890DB1DC45167C7535FED3C472F57C29757D7532CE9B99AC67D8150B4B2D2671C3BAB6AED6EF656F566B9D7BE89E47827FB2A1207799058F3EB8E56AAD16FCE8835BF0DDEDAF5D0CADEC735F58C5D8C406106C6BFB127760F997BBA161F8D4C2B91FBFCC7233ED21863BD20E3AEA0BA7CDABF33FEA6E0F333DFF0A82530D6DCFB120B1B6ED68A9B9C4DAB6E0EE36EF2FEC2DD5FBE977E90F161A9A7D9E05C9B506A9A592926B0DD3DDADDE6FA05AAAF233EF3AE20E2D5E571CBCFCFA6B975FB9F674C737D9B67976A3AAF0B2C60BB2F216ABC4C7C4CAC864FF407F7F7FFFAE8989BDA2B22F33B237313436EE68DB54C518DA35F1C823C9B896A866FB0B78F71E7938399631ECB381C09B81FF630E7C3340C1DED53827969EB51DDC881DF6480697F9168EE1D681AE9ECEFAFCCC95F3734D3FB341E654B0C91C97823B9BC17A58D5BD6482227B64FA40BC42C8CEA11D3495A6DE5425572D233B3E319E4DC2DDD0328431A3343831902A5813767A746CB0E059BBB769F9DC1033CA4CCCFE8EFD0E13E07CBEF940F38D9F92D73F25FB10F9002172157CE3A7EEFB4FDD3E1CDE031CFA7CF3DEE49BF0EFD951E687CCBF39F0F1E5E9DD5219328BDC432B75A137BE7162F1A56F470EA158D1A1E9953A11EFD7897C8EF239CAE7289FA37C14F0EB447E9DE883C8513D9F0B75715D37A38ECA5C603E023616A671DACEE7ED0AB7A1F6AD7357CF3DD3C3869867984078E57E0EBCBAE91DAEF79EDC045E0BBEC3F5DEA9E36FCDFFFD8EEBBDA7830FE07AEFF5C09DD8F6C700B81408D59F78F90FDFBD7691E9FEE9D9B75FDAB4A5B376EC4CEDD917AEBC78B236772CC20C57C1D45AF8CD8C421509584E0B342DA78921688A487455E591206B905745CD40446D4D6F0D807A9EE9DED400A8B60B0C00A19E0D20186AE3DA96CE4757356B1A25D332B325626AEA6D87C4AC56656747B9B82F3B8A73C2A83838BCDD8BE71E1E15F68C1CEC9FDCEE0D0D25154293467C687CE990FC0F757A4B8857E4BAA7E8C8B99D0707C64B245920C37B93B9E1EC417597551DD99FEEDF95B2C5C9B2E51DB0B6A752396744DB7730571A3C80C50445037A263A98C1E2CD654D7F5FDF3FFBBAFC39570BAEC7E79CFF5E78DFEB67F5696C250F71635E14C9B22A88B220485016B01355902C891AAF89BC262009A98AA861238A70298A70B6D92A8BB2A4421949A22C4111EB59238AF56C298A1D438FA2F06247E4AF3F7965E6B7F0F3FF61CD86882D92C7AEA3C78A2537EB52A74C9DC3169DF2B05E725CDB89ADC31C0E575C1737DC358F9269C496068992A64950E425110AFC4D9D8CA50E48819A067959B9D190B7338DC12A442A9214248A2A12244D51
C5669F34F5F42C2EB8B15B16706378CB96829BC1DE7491C69CC315C77071AEECC696712161DB8504B172CBC0B01217A84189D74409A908DFD475CC239E7B7B5F81474841488248F8CA878798FF020000FFFF010000FFFFE0A2C71089240000
$ xxd -r -ps hexdata.txt | protoscope
1: 7 2: 8018157144813599251 3: 8888 4: 8 5: { 1: {"compress_type"} 2: {"gzip"} } 5: { 1: {"im-internal_ext"} 2: { "internal_src:pushserver|next_cursor:t-1668346335149_r-7165492942931518749_d-1_u-" "1_h-1_chd-_chu-_rdc-1|wss_push_room_id:7165445995042540321|wss_push_did:71172388" "69519173161|wss_push_log_id:8018157144813599251|wss_fetch_ms:1668346335063|wss_p" "ush_ms:1668346335149|wss_msg_type:r" } } 5: { 1: {13: 1.677723921655375e243} # 0x726f737275632d6di64 2: {"t-1668346335149_r-7165492942931518749_d-1_u-1_h-1_chd-_chu-_rdc-1"} } 5: { 1: {"im-now"} 2: {"1668346335149"} } 5: { 1: { 13: 4.7395440062453295e170 # 0x635f6576696c2d6di64 14: 4.742815e30i32 # 0x726f7372i32 } 2: {"d-1_u-1"} } 6: {14: 98} 7: {"msg"} 8: {1f8b08000000000000ffec996d6c1c4719c7bd7be7f83238c659de8ef081c39110059d3db3ef7b12 2f7612c7717c8d733e276e041dcdedceddadef6ef7b2bb77675f52845028980071f9108908439a8a a60a82a68e2aa586d22a6d0444f005099080482dcae5dc22be54cd4724743ec779bba6711c43d5ee 7e399d66e679669ef9cf3ebf7d06fc43011fdd4f533a71bd382da4a813a7ae4b32947b41018b5ceb b6eeef1d3f72f16fbffaf5793dfce4f1c77ff35cedb9f33a64067ec9818fe5cd32c58e6d1730b53c ea60cf26aec77df2108c955cea3c16b93af7b32b0b8f1f4231d7734c2bf3d8962eb0716b4a4d2f3d 91d940cf1b9b431fe23a41c75675db0e657030321b38c3fc7933f8d3e650ede8e5b77fc0761fbb36 7b32b0a53dd77822ccf06150cd7a5ed18df5f51511ea35ecd2b469154dbd57b70b7da4420bb40f41 3885206cfe8b9232f188d3e7d96e54b7a2a4ec4521441246bc64488452216d286949d2d322510824 484784f01a85bd93459af972dab10b5f1421829280a0a09e61aa60bce1bee15d8e569ab1ba6d1266 21d3b7dcd2d70802ce38c4a0384fcb348fcb1246b0b76865beee15f3e5a89d9aec350b2443c3dd91 0864063e15063d9f58fcfe538b6766eabf98a92f9c46b0be70ba76f468fde727e7998ed02cd3fdd9 30b3c0bcca407082f9daba4ee61483c1feebf6857bb0bf146f5c24d3cb4ef8298c2046622b0f9d91 4ec85c606a00fca51d6c5c7c76ae76f689fa8923dded61b6e7c57670890db1dc45167c7535fed3c4 72f57c29757d7532ce9b99ac67d8150b4b2d2671c3bab6aed6ef656f566b9d7be89e47827fb2a120 
7799058f3eb8e56aad16fce8835bf0ddedaf5d0cadec735f58c5d8c406106c6bfb127760f997bba1 61f8d4c2b91fbfcc7233ed21863bd20e3aea0ba7cdabf33fea6e0f333dff0a82530d6dcfb120b1b6 ed68a9b9c4dab6e0ee36ef2fec2dd5fbe977e90f161a9a7d9e05c9b506a9a592926b0dd3ddadde6f a05aaaf233ef3ae20e2d5e571cbcfcfa6b975fb9f674c737d9b67976a3aaf0b2c60bb2f216abc4c7 c4cac864ff407f7f7fffae8989bda2b22f33b237313436ee68db54c518da35f1c823c9b896a866fb 0b78f71e7938399631ecb381c09b81ff630e7c3340c1ded53827969eb51ddc881df6480697f9168e e1d681ae9ecefafccc95f3734d3fb341e654b0c91c97823b9bc17a58d5bd6482227b64fa40bc42c8 cea11d3495a6de5425572d233b3e319e4dc2ddd0328431a3343831902a5813767a746cb0e059bbb7 69f9dc1033ca4cccfe8efd0e13e07cbef940f38d9f92d73f25fb10f9002172157ce3a7eefb4fdd3e 1cde031cfa7cf3dee49bf0efd951e687ccbf39f0f1e5e9dd5219328bdc432b75a137be7162f1a56f 470ea158d1a1e9953a11efd7897c8ef239cae7289fa37c14f0eb447e9de883c8513d9f0b75715d37 a38eca5c603e023616a671dacee7ed0ab7a1f6ad7357cf3dd3c3869867984078e57e0ebcbae91dae f79edc045e0bbec3f5dea9e36fcdfffd8eebbda7830fe07aeff5c09dd8f6c700b81408d59f78f90f dfbd7691e9fee9d9b75fdab4a5b376ec4cedd917aebc78b236772cc20c57c1d45af8cd8c42150958 4e0b342da78921688a487455e591206b905745cd40446d4d6f0d807a9ee9ded400a8b60b0c00a19e 0d20186ae3da96ce4757356b1a25d332b325626aea6d87c4ac56656747b9b82f3b8a73c2a83838bc dd8be71e1e15f68c1cec9fdcee0d0d25154293467c687ce990fc0f757a4b8857e4baa7e8c8b99d07 07c64b245920c37b93b9e1ec417597551dd99feedf95b2c5c9b2e51db0b6a752396744db7730571a 3c80c50445037a263a98c1e2cd654d7f5fdf3ffbbafc39570baec7e79cff5e78dfeb67f5696c250f 71635e14c9b22a88b220485016b01355902c891aaf89bc262009a98aa861238a70298a70b6d92a8b b2a4421949a22c4111eb59238af56c298a1d438fa2f06247e4af3f7965e6b7f0f3ff61cd86882d92 c7aea3c78a2537eb52a74c9dc3169df2b05e725cdb89adc31c0e575c1737dc358f9269c496068992 a64950e425110afc4d9d8ca50e48819a067959b9d190b7338dc12a442a9214248a2a12244d51c566 9f34f5f42c2eb8b15b16706378cb96829bc1de7491c69cc315c77071aeecc696712161db8504b172 
cbc0b01217a84189d74409a908dfd475cc239e7b7b5f81474841488248f8ca878798ff020000ffff 010000ffffe0a2c71089240000}

Allow dumping varint-delimited protos

Currently, protoscope doesn't allow dumping or representing varint-delimited protobuf messages, which are used fairly frequently in practice. For example, some official protobuf libraries have parseDelimitedFrom and writeDelimitedTo, and Bazel uses the format for query --output=streamed_proto and build --build_event_binary_file=.... It would be helpful to add support for these dumps, behind an option of course.

A simple extension to the protoscope language to allow representing varint-delimited protos could be the --- document separator used by YAML.
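
Until something like this exists, a delimited dump can be split into individual messages before disassembly. The following Go sketch assumes the common framing in which each message is preceded by its byte length as a varint; splitDelimited is a hypothetical helper, not part of Protoscope.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// splitDelimited splits a buffer of varint-length-prefixed messages into the
// individual message payloads. The returned slices alias the input buffer.
func splitDelimited(b []byte) ([][]byte, error) {
	var msgs [][]byte
	for len(b) > 0 {
		size, n := binary.Uvarint(b)
		if n <= 0 {
			return msgs, errors.New("malformed length varint")
		}
		b = b[n:]
		if size > uint64(len(b)) {
			return msgs, errors.New("length prefix exceeds remaining input")
		}
		msgs = append(msgs, b[:size])
		b = b[size:]
	}
	return msgs, nil
}

func main() {
	// Two messages, `1: 42` (08 2a) and `1: 43` (08 2b), each preceded by its
	// length of 2.
	stream := []byte{0x02, 0x08, 0x2a, 0x02, 0x08, 0x2b}
	msgs, err := splitDelimited(stream)
	if err != nil {
		panic(err)
	}
	for i, m := range msgs {
		fmt.Printf("message %d: %x\n", i, m)
	}
}
```

Each payload could then be written to its own file and passed to protoscope separately.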

packed repeated fields

https://protobuf.dev/programming-guides/encoding#packed

using this file:

package main

import (
   "google.golang.org/protobuf/testing/protopack"
   "os"
)

func main() {
   inBytes := protopack.Message{
      protopack.Tag{3, protopack.BytesType}, protopack.LengthPrefix{
         protopack.Float32(1.1),
         protopack.Float32(2.2),
         protopack.Float32(3.3),
      },
   }.Marshal()
   os.WriteFile("packed.bin", inBytes, 0666)
}

https://google.golang.org/protobuf/testing/protopack#Message.Marshal

if I try to scan:

package main

import (
   "github.com/protocolbuffers/protoscope"
   "os"
)

func main() {
   inBytes, err := os.ReadFile("packed.bin")
   if err != nil {
      panic(err)
   }
   scanner := protoscope.NewScanner(string(inBytes))
   if _, err := scanner.Exec(); err != nil {
      panic(err)
   }
}

I get this result:

panic: unrecognized symbol "\x1a\f\xcď?\xcd\xcc\f@33S@"

the tool works:

> protoscope packed.bin
3: {`cdcc8c3fcdcc0c4033335340`}

but neither the module nor the tool seems to have an option for dealing with this. I checked WriterOptions but didn't see anything:

https://pkg.go.dev/github.com/protocolbuffers/protoscope#WriterOptions
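
The panic is consistent with the scanner being handed binary data where it expects Protoscope text, though that reading is an assumption here. Independently of Protoscope, the packed payload shown by the tool can be unpacked with the standard library once the element type is known. A minimal sketch, assuming the payload is a packed repeated float as in the protopack example; unpackFloat32 is a hypothetical helper:

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"math"
)

// unpackFloat32 interprets a packed repeated float payload as a sequence of
// little-endian 32-bit IEEE 754 values.
func unpackFloat32(payload []byte) ([]float32, error) {
	if len(payload)%4 != 0 {
		return nil, errors.New("payload length is not a multiple of 4")
	}
	out := make([]float32, 0, len(payload)/4)
	for i := 0; i < len(payload); i += 4 {
		bits := binary.LittleEndian.Uint32(payload[i:])
		out = append(out, math.Float32frombits(bits))
	}
	return out, nil
}

func main() {
	// The bytes inside `3: {...}` from the protoscope output above.
	payload := []byte{
		0xcd, 0xcc, 0x8c, 0x3f,
		0xcd, 0xcc, 0x0c, 0x40,
		0x33, 0x33, 0x53, 0x40,
	}
	vals, err := unpackFloat32(payload)
	if err != nil {
		panic(err)
	}
	fmt.Println(vals) // [1.1 2.2 3.3]
}
```

Without a schema, nothing in the bytes says whether the payload is packed floats, packed fixed32 values, or an opaque blob, which is presumably why the tool falls back to a hex literal.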
