go-winmd's Introduction

Go winmd parser

A Windows Metadata (a.k.a. winmd) parser written in Go and based on the ECMA-335 6th edition standard.

Development References

These resources are useful as reference while working on the go-winmd module:

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.

Trademarks

This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos is subject to those third parties' policies.

go-winmd's Issues

Improve `//sys` generation return values, names, and `failretval`

Generation is fairly basic right now: the return value is named `r`, and if the method sets `lasterr`, we also generate `err error`.

  • Figure out a better return value name. The winmd doesn't have names, so there's no clearly correct approach. Can we add a way to include human guidance for certain methods and/or return types?
  • Generation doesn't yet specify annotations such as `[failretval==-1]`. These might be tricky to generate automatically and need more investigation.

Improve performance by finding each special TypeRef index only once

A few places in frequently executed code read a TypeRef/TypeDef/Module row and then check its name and namespace against well-known values, like System.Enum.

I believe we could instead find the System.Enum TypeRef only once (in NewContext), store its index, and compare against that later.
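A minimal sketch of that idea, assuming simplified stand-ins for the real types: `TypeRef`, `Context`, the `enumTypeRef` field, and `IsEnumBase` are hypothetical names for illustration, not the module's actual API. The table is scanned once when the context is built, and later checks compare integer indexes instead of strings.

```go
package main

import "fmt"

// TypeRef is a simplified, hypothetical stand-in for a TypeRef table row.
type TypeRef struct {
	Namespace string
	Name      string
}

// Context is a hypothetical stand-in for the parser context.
type Context struct {
	typeRefs    []TypeRef
	enumTypeRef int // index of System.Enum in typeRefs, or -1 if absent
}

// NewContext scans the TypeRef table once and records where System.Enum lives.
func NewContext(typeRefs []TypeRef) *Context {
	c := &Context{typeRefs: typeRefs, enumTypeRef: -1}
	for i, r := range typeRefs {
		if r.Namespace == "System" && r.Name == "Enum" {
			c.enumTypeRef = i
			break
		}
	}
	return c
}

// IsEnumBase reports whether the given TypeRef index is System.Enum.
// It compares indexes only; no row is read and no strings are compared.
func (c *Context) IsEnumBase(index int) bool {
	return c.enumTypeRef >= 0 && index == c.enumTypeRef
}

func main() {
	c := NewContext([]TypeRef{
		{"System", "ValueType"},
		{"System", "Enum"},
		{"System", "Guid"},
	})
	fmt.Println(c.IsEnumBase(1), c.IsEnumBase(2))
}
```

The same pattern would extend to other well-known types by adding one index field per type.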

Improve unresolvable TypeRef handling (System.Guid)

Some types aren't in the current module, like System.Guid. Currently the tool logs them so the dev can implement them in another Go file in the package, but we could do better:

  • Put the list of unresolvable types at the top of the generated file (for more visibility).
  • Write code that explicitly checks that the type is implemented.

Resolving the types (across multiple winmd files?) is best, but I think some basic types will always need manual definitions.
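One way both bullets could look, as a rough sketch: the generator emits a comment block listing the unresolved types, followed by blank `var` declarations, so the generated file fails to compile until each type is defined by hand. The function name `unresolvedHeader` and the Go type names `Guid` and `Attribute` are assumptions for illustration only.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// unresolvedHeader renders text for the top of a generated file: a visible
// list of hand-written types, then one compile-time check per type.
func unresolvedHeader(names []string) string {
	sorted := append([]string(nil), names...) // copy so the caller's slice is untouched
	sort.Strings(sorted)

	var b strings.Builder
	b.WriteString("// Types that must be defined by hand in this package:\n")
	for _, n := range sorted {
		fmt.Fprintf(&b, "//   %s\n", n)
	}
	b.WriteString("\n")
	for _, n := range sorted {
		// "var _ T" does not compile unless T is declared somewhere in the package.
		fmt.Fprintf(&b, "var _ %s\n", n)
	}
	return b.String()
}

func main() {
	fmt.Print(unresolvedHeader([]string{"Guid", "Attribute"}))
}
```

This turns a log message that is easy to miss into a build error that names the missing type.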

Add configurability to match usage

As of writing, the prototype accepts a text/template file to fill in and write to disk, but all of the generated code is given to it as a single piece of data. It might make sense to improve on that, or maybe scrap the template for some other approach. With program arguments for things like package name, I think it wouldn't be hard to have the generator simply create the full file.

The way methods are filtered could also be improved. In the prototype, you pass a single regex (alternation with `|` works). My concern is that it could get unwieldy. A few ways forward:

  • Use multiple go generate directives per distinct set of APIs.
    • Performance problem: the prototype scans some full tables to index them in memory, which would be repeated.
  • Scan a Go file for //winmdsigs ... comments that tell the generator what to do. (Like //sys comments.)

Support union types

Currently when the generator finds a union type (struct with multiple FieldOffset=0), it takes the first option. This seems to roughly line up with how some of the syscalls in x/sys/windows have done it. Go doesn't have anything that can represent union types all that nicely, but there are a few things I think we can do:

  • Add methods on the generated structs that do the necessary pointer work to assign other unioned types to the underlying values.
  • Check if we can pick the "default" union option in a smarter way than simply using the first option.

Support data in `.param`

The code currently returns the error `unsupported method: expected param row with sequence 0 to be empty` if there's data in `.param`, because we haven't seen any in winmd files yet.

I don't know of any reason to support this yet, but if we (or anyone) end up finding one, we can discuss it here.

When parsing params, handle gaps in `Sequence` values

§II.22.33 says:

  1. Successive rows of the Param table that are owned by the same method shall be
    ordered by increasing Sequence value - although gaps in the sequence are allowed
    [WARNING]

We haven't seen a gap yet, so I'm not sure what's necessary to handle it. Filing this issue for reference: we know we're currently assuming that there aren't gaps.
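For reference, a gap-tolerant approach might place each Param row by its Sequence value instead of by its position in the table. This is only a sketch: `ParamRow`, `paramNames`, and the `pN` placeholder naming are assumptions, and it relies on the spec's rule that Sequence 0 refers to the return value while parameters are numbered from 1.

```go
package main

import "fmt"

// ParamRow is a simplified, hypothetical view of a Param table row.
type ParamRow struct {
	Sequence uint16
	Name     string
}

// paramNames returns one name per signature parameter. Rows are matched by
// Sequence, so a missing row leaves a placeholder instead of shifting names.
func paramNames(rows []ParamRow, paramCount int) []string {
	names := make([]string, paramCount)
	for i := range names {
		names[i] = fmt.Sprintf("p%d", i+1) // placeholder for a parameter with no row
	}
	for _, r := range rows {
		// Sequence 0 describes the return value; anything past paramCount is ignored.
		if r.Sequence >= 1 && int(r.Sequence) <= paramCount {
			names[r.Sequence-1] = r.Name
		}
	}
	return names
}

func main() {
	rows := []ParamRow{{1, "hWnd"}, {3, "flags"}} // no row for Sequence 2
	fmt.Println(paramNames(rows, 3))
}
```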

Add Property signature parsing

Add PropertySignature alongside FieldSignature and MethodDefSignature. It doesn't seem necessary for syscall generation, but for more general use the library will probably need this.
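As a starting point, the header of a property signature could be decoded as in the sketch below. It follows the PropertySig layout in §II.23.2.5 (a leading byte with the PROPERTY bit 0x08 and an optional HASTHIS bit 0x20, followed by a compressed parameter count). The struct fields and function names are hypothetical, and the property type and parameter types that follow the header are deliberately left unparsed.

```go
package main

import (
	"errors"
	"fmt"
)

// PropertySignature is a hypothetical shape for the parsed result. Only the
// header is decoded in this sketch.
type PropertySignature struct {
	HasThis    bool
	ParamCount uint32
}

const (
	sigProperty = 0x08 // PROPERTY kind in the low nibble of the first byte
	sigHasThis  = 0x20 // HASTHIS flag
)

// readCompressedUint decodes a compressed unsigned integer (§II.23.2) and
// returns the value and the number of bytes consumed.
func readCompressedUint(b []byte) (uint32, int, error) {
	switch {
	case len(b) >= 1 && b[0]&0x80 == 0:
		return uint32(b[0]), 1, nil
	case len(b) >= 2 && b[0]&0xC0 == 0x80:
		return uint32(b[0]&0x3F)<<8 | uint32(b[1]), 2, nil
	case len(b) >= 4 && b[0]&0xE0 == 0xC0:
		return uint32(b[0]&0x1F)<<24 | uint32(b[1])<<16 | uint32(b[2])<<8 | uint32(b[3]), 4, nil
	}
	return 0, 0, errors.New("invalid or truncated compressed integer")
}

// parsePropertyHeader validates the leading byte and reads the parameter count.
func parsePropertyHeader(blob []byte) (PropertySignature, error) {
	if len(blob) == 0 || blob[0]&0x0F != sigProperty {
		return PropertySignature{}, errors.New("not a property signature")
	}
	count, _, err := readCompressedUint(blob[1:])
	if err != nil {
		return PropertySignature{}, err
	}
	return PropertySignature{HasThis: blob[0]&sigHasThis != 0, ParamCount: count}, nil
}

func main() {
	// 0x28 = PROPERTY|HASTHIS, 0x00 = no parameters, 0x08 = ELEMENT_TYPE_I4.
	sig, err := parsePropertyHeader([]byte{0x28, 0x00, 0x08})
	fmt.Println(sig, err)
}
```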

Improve enum naming: reduce excessive length

For each enum, if any enum entry doesn't use the enum name as a prefix, then the generator adds the full enum name as a prefix for every entry. This results in some pretty lengthy names with repeated words:

type CRYPT_XML_TRANSFORM_FLAGS uint32

const (
	CRYPT_XML_TRANSFORM_FLAGS_CRYPT_XML_TRANSFORM_ON_STREAM        CRYPT_XML_TRANSFORM_FLAGS = 0x1
	CRYPT_XML_TRANSFORM_FLAGS_CRYPT_XML_TRANSFORM_ON_NODESET       CRYPT_XML_TRANSFORM_FLAGS = 0x2
	CRYPT_XML_TRANSFORM_FLAGS_CRYPT_XML_TRANSFORM_URI_QUERY_STRING CRYPT_XML_TRANSFORM_FLAGS = 0x3
)

type CERT_QUERY_CONTENT_TYPE uint32

const (
	CERT_QUERY_CONTENT_TYPE_CERT_QUERY_CONTENT_CERT               CERT_QUERY_CONTENT_TYPE = 0x1
	CERT_QUERY_CONTENT_TYPE_CERT_QUERY_CONTENT_CTL                CERT_QUERY_CONTENT_TYPE = 0x2
...
	CERT_QUERY_CONTENT_TYPE_CERT_QUERY_CONTENT_CERT_PAIR          CERT_QUERY_CONTENT_TYPE = 0xd
	CERT_QUERY_CONTENT_TYPE_CERT_QUERY_CONTENT_PFX_AND_LOAD       CERT_QUERY_CONTENT_TYPE = 0xe
)

Hopefully we can find some good compromise. Maybe look for common substrings between the enum name and enum members, rather than just the full prefix being there?
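One possible compromise, sketched under the assumption that names are always `_`-separated words: drop the leading words a member shares with the enum name before adding the enum prefix. `memberName` is a hypothetical helper, not the generator's current logic.

```go
package main

import (
	"fmt"
	"strings"
)

// memberName prefixes member with enum, first removing the leading words the
// two names have in common. At least one word of the member is always kept.
func memberName(enum, member string) string {
	e := strings.Split(enum, "_")
	m := strings.Split(member, "_")
	shared := 0
	for shared < len(e) && shared < len(m)-1 && e[shared] == m[shared] {
		shared++
	}
	return enum + "_" + strings.Join(m[shared:], "_")
}

func main() {
	fmt.Println(memberName("CRYPT_XML_TRANSFORM_FLAGS", "CRYPT_XML_TRANSFORM_ON_STREAM"))
	fmt.Println(memberName("CERT_QUERY_CONTENT_TYPE", "CERT_QUERY_CONTENT_CERT_PAIR"))
}
```

With this rule the first example would become CRYPT_XML_TRANSFORM_FLAGS_ON_STREAM and the second CERT_QUERY_CONTENT_TYPE_CERT_PAIR. Whether the shortened names still read well across the whole metadata set would need checking.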

Improve nested type `resolveTypeRef` performance with lookups

When resolveTypeRef looks up a nested type, it traverses up then down the tree of nested types to find the resolvedDef. This could be improved in a few cases:

  • When calling resolveTypeRef on multiple refs in the same nested type tree, you end up repeating the same traversal for any nodes in the tree that the refs share. We could cache TypeRef index -> resolvedDef to reduce repeated operations.
  • When calling resolveTypeRef on a deep nested tree, the upward traversal is fine because there is only one child -> parent link. The downward traversal is 1 parent -> n children, and currently it loops through the children to find the Name match. We could use a map of name -> child to avoid the loop.

Nested TypeRef chains and the number of children seem pretty small in the cases I've seen so far, so I wonder if the overhead of these lookups would outweigh the gains.
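The caching idea from the first bullet could be prototyped as a thin wrapper around the existing traversal, which would also make it easy to benchmark against the uncached version. In this sketch `resolvedDef` is reduced to a placeholder struct and `cachingResolver` and its `lookup` field are hypothetical names.

```go
package main

import "fmt"

// resolvedDef is a hypothetical stand-in for the real resolved TypeDef.
type resolvedDef struct{ Name string }

// cachingResolver wraps an expensive lookup with a TypeRef index -> resolvedDef cache.
type cachingResolver struct {
	cache  map[uint32]*resolvedDef
	lookup func(typeRef uint32) (*resolvedDef, error) // the existing up-then-down traversal
}

func newCachingResolver(lookup func(uint32) (*resolvedDef, error)) *cachingResolver {
	return &cachingResolver{cache: make(map[uint32]*resolvedDef), lookup: lookup}
}

// resolveTypeRef returns the cached result when available and otherwise
// runs the traversal once and remembers its result.
func (r *cachingResolver) resolveTypeRef(typeRef uint32) (*resolvedDef, error) {
	if def, ok := r.cache[typeRef]; ok {
		return def, nil
	}
	def, err := r.lookup(typeRef)
	if err != nil {
		return nil, err // failures are not cached
	}
	r.cache[typeRef] = def
	return def, nil
}

func main() {
	traversals := 0
	r := newCachingResolver(func(i uint32) (*resolvedDef, error) {
		traversals++
		return &resolvedDef{Name: fmt.Sprintf("Type%d", i)}, nil
	})
	r.resolveTypeRef(7)
	r.resolveTypeRef(7)
	fmt.Println("traversals:", traversals)
}
```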

Generate members for interface types

Currently, interface types are generated as empty structs with the comment `// Interface type is likely missing members. Not yet implemented in go-winmd.` We need to figure out how syscalls expect these to be defined and generate them.

Use param flags to improve code generation (`[In]`, `[Out]`, etc.)

Currently the code doesn't check the param flags. They might be useful information for parameter naming, how errors are interpreted, some heuristic, or maybe to improve the API in some unknown way.

There's no specific goal for this issue, just a reminder to consider the param flags if we want/need to improve the code generation.

Make it easier to understand genwinmdsigs decisions

In the first iteration of `//sys` generation, I included `/* */` comments inside the `//sys` line for some info that I wasn't using yet: the flags on params like In and Out. I removed them because comments aren't valid in this context, and it keeps things simpler to not include them. However, this kind of information can be useful for diagnosis, so we should find some way to fit it back in.

A verbose logging mode might be enough. If something seems wrong, the dev would re-run genwinmdsigs with -verbose and a stricter -filter (to avoid log spam). If there's a problem that doesn't repro without generating the full set of methods, it might be tough to sift through the data, but I'm not sure generating Go comments is necessarily any better.

Support architecture-specific methods and types

Some functions and types have the SupportedArchitecture attribute, which specifies the architectures for which they are defined. Two methods can even share a name but have different signatures when defined for different architectures. See for example RtlLookupFunctionEntry or RtlCaptureContext.

Increase mkwinsyscall arg count limit

Currently mkwinsyscall only accepts 15 args, calling Syscall15. Go has had Syscall18 for ~4 years, and SyscallN since 1.18. When I generate all methods in Windows.Win32.winmd with some temporary logging added, these exceed mkwinsyscall's built-in limit:

CreateFontPackage: parameter 15
CreateFontPackage: parameter 16
ICDrawBegin: parameter 15
DRMGetUsagePolicy: parameter 15
ScriptShapeOpenType: parameter 15
ScriptPlaceOpenType: parameter 15
ScriptPlaceOpenType: parameter 16
ScriptPlaceOpenType: parameter 17
D2D1GetGradientMeshInteriorPointsFromCoonsPatch: parameter 15
AccessCheckByTypeAndAuditAlarmW: parameter 15
AccessCheckByTypeResultListAndAuditAlarmW: parameter 15
AccessCheckByTypeResultListAndAuditAlarmByHandleW: parameter 15
AccessCheckByTypeResultListAndAuditAlarmByHandleW: parameter 16
AccessCheckByTypeAndAuditAlarmA: parameter 15
AccessCheckByTypeResultListAndAuditAlarmA: parameter 15
AccessCheckByTypeResultListAndAuditAlarmByHandleA: parameter 15
AccessCheckByTypeResultListAndAuditAlarmByHandleA: parameter 16

I don't know if any of those methods are particularly important, but I don't know of a reason we shouldn't support them.
