fabianoliver / vbprojectparser

.NET library to extract, decompress, and parse the VBProject.bin from binary Excel files. It can, for example, read VBA code from Excel files without any dependency on Excel itself.

C# 100.00%

vbprojectparser's Introduction

VbProjectParser

C# library to read VBA code and VBA project information from VBProject.bin or regular Excel files. It does not require MS Excel to be installed on the machine.

This project demonstrates how to read basic information from VBProject.bin binary files, e.g. to extract VBA Code & module information as plain text.

It contains an integration project with OpenXML, so the same can be done for regular Excel files without having to extract the VBProject files manually. As a result, it is possible, for example, to read VBA code from Excel files without opening them in Excel (or even having Excel installed on the machine).

All code is pure C#. This project is a work in progress; please report issues on the GitHub issue tracker.
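
For readers who want to see what "reading from a regular Excel file" involves at the container level, here is a minimal sketch. It does not use this library or the OpenXML SDK; it swaps in plain System.IO.Compression, relying on the fact that .xlsm files are ZIP archives. The class name, method name, and the default entry path "xl/vbaProject.bin" are assumptions for illustration (the actual part location is defined by the package relationships and may differ).

```csharp
using System.IO;
using System.IO.Compression;

static class VbaProjectExtractSketch
{
    // Returns the raw bytes of the VBA project part, or null if the entry is absent.
    // entryName is an assumed default; real packages can store the part elsewhere.
    public static byte[] ReadVbaProjectBin(string xlsmPath, string entryName = "xl/vbaProject.bin")
    {
        using (ZipArchive archive = ZipFile.OpenRead(xlsmPath))
        {
            ZipArchiveEntry entry = archive.GetEntry(entryName);
            if (entry == null)
                return null;

            using (Stream source = entry.Open())
            using (var buffer = new MemoryStream())
            {
                source.CopyTo(buffer);
                return buffer.ToArray();
            }
        }
    }
}
```

The returned bytes are the still-compressed compound file; decompressing and parsing them is the part this library handles.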

vbprojectparser's People

Contributors

fabianoliver, rogergrambihler

vbprojectparser's Issues

Unable to read VBA if the project contains a PROJECTCOMPATVERSION record

Newer versions of the VBProject format include a PROJECTCOMPATVERSION record that the current ProjectInformation code doesn't check for, so it fails to read the VBA. See: https://learn.microsoft.com/en-us/openspecs/office_file_formats/ms-ovba/ed5d7ede-5d7d-4645-bba3-ddfd9bdc76ed

The minimal fix is to check for the existence of the record before reading PROJECTLCID. Another fix would be to stop relying on the records appearing in a specific order: loop over the records, check each record type, and call the appropriate code for that type. Unknown record types could then be skipped.
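
A minimal sketch of the first approach, assuming the data is read through a BinaryReader over a seekable stream. It is not the library's code; the class and method names are hypothetical. The record ID 0x004A and the fixed size of 4 come from the MS-OVBA specification linked above.

```csharp
using System;
using System.IO;

static class ProjectInformationSketch
{
    // PROJECTCOMPATVERSION record ID as given in MS-OVBA.
    const ushort ProjectCompatVersionId = 0x004A;

    // Peeks at the next record ID. If it is PROJECTCOMPATVERSION, consumes the
    // whole record (Id, Size, CompatVersion) and returns CompatVersion.
    // Otherwise rewinds so the caller can read PROJECTLCID as before, and returns null.
    public static uint? ReadOptionalCompatVersion(BinaryReader reader)
    {
        long start = reader.BaseStream.Position;
        ushort id = reader.ReadUInt16();
        if (id != ProjectCompatVersionId)
        {
            reader.BaseStream.Position = start;
            return null;
        }

        uint size = reader.ReadUInt32();
        if (size != 4)
            throw new FormatException("PROJECTCOMPATVERSION size must be 4, got " + size);

        return reader.ReadUInt32();
    }
}
```

Called between PROJECTSYSKIND and PROJECTLCID, this leaves the stream positioned at the PROJECTLCID record whether or not the optional record is present.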

I'd also recommend upgrading the code to .NET 4.8, since the .NET version it is built against is out of support.
@fabianoliver - I can do separate pull requests for handling the PROJECTCOMPATVERSION record and upgrading to .NET 4.8, depending on your feedback on the best way to approach this.

REFERENCEPROJECT validating version against current Project causes exception

REFERENCEPROJECT validates that the version of the referenced project matches the current project's version, using the following code:

REFERENCEPROJECT.cs

    [ValueMustEqualMember("ProjectInformation.VersionRecord.VersionMajor")]
    public readonly UInt32 MajorVersion;

    [ValueMustEqualMember("ProjectInformation.VersionRecord.VersionMinor")]
    public readonly UInt16 MinorVersion;

The referenced project is not required to have the same MajorVersion and MinorVersion, so this check results in a validation error.

The fix could be to simply remove the ValueMustEqualMember attributes.

Reflection has already validated that the values have the declared types (UInt32 and UInt16), and I cannot think of any additional validation that could be performed here.

Crash if CompressionContainer contains more than one CompressedChunk

When a file with multiple CompressedChunks is read, an exception occurs:
System.FormatException: 'Signature byte expected 0x03,

This is caused by an incorrect size for the data read from each CompressedChunk in the CompressionContainer.
From CompressedChunkData.cs, line 47:
var size = Math.Min(header.CompressedChunkSize, Data.Length - Data.i);

header.CompressedChunkSize is used as the number of bytes remaining to be read from the stream. Since the first two bytes of the chunk (the header) have already been read, the proper calculation is:
var size = Math.Min(header.CompressedChunkSize - 2, Data.Length - Data.i);

MSFT documents how to read the chunk at:
https://learn.microsoft.com/en-us/openspecs/office_file_formats/ms-ovba/3d5ea4df-e8a5-4079-a454-9595b840525f
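
To make the arithmetic concrete, here is a minimal standalone sketch based on the MS-OVBA header layout, not on the library's classes. Per the specification, the low 12 bits of the 16-bit header store the chunk size minus 3, and bits 12 to 14 hold the signature 0b011. The class and method names are hypothetical, and the sketch assumes that the library's header.CompressedChunkSize already holds the full chunk size (raw field plus 3), which is what the "- 2" fix above implies.

```csharp
using System;

static class CompressedChunkSketch
{
    // Decodes the little-endian CompressedChunkHeader at headerOffset and returns
    // how many data bytes follow the 2-byte header, capped at what the buffer holds.
    public static int DataLengthAfterHeader(byte[] data, int headerOffset)
    {
        if (headerOffset < 0 || headerOffset + 2 > data.Length)
            throw new ArgumentOutOfRangeException(nameof(headerOffset));

        int header = data[headerOffset] | (data[headerOffset + 1] << 8);

        int signature = (header >> 12) & 0x07;
        if (signature != 0x03)
            throw new FormatException("CompressedChunkSignature must be 0b011");

        int chunkSize = (header & 0x0FFF) + 3;  // full chunk, including the header
        int dataLength = chunkSize - 2;         // the header bytes are already consumed
        int remaining = data.Length - (headerOffset + 2);
        return Math.Min(dataLength, remaining);
    }
}
```

With the header bytes 0x19 0xB0, the raw size field is 0x019 (25), so the full chunk is 28 bytes and 26 data bytes follow the header. Reading 28 bytes instead would consume the next chunk's header, which is consistent with the signature failure reported above.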

Feature Request: Be able to open a VBAProject in a Word file

I'd like to be able to open a Word document's VBA stream. Currently the code only supports Excel in the VbProject class.

I see two possible options for this.

  1. Add a VbProjectWord class that inherits from VbProject and keeps its public signature, but works with a Word file.

  2. Add a static method on the VbProject class, such as VbProject.LoadWordDocumentVbProject(string pathToWordFile), that opens the Word file.

I think option #1 fits better with the current code, but I'm not a fan of opening files, or doing other things that are expected to fail, inside a class constructor. It may also be nice in the future to have app-specific methods that make sense at the VbProject class level.

Option #2 would add a static method, but that is confusing alongside the current class, which has constructors for both a WorkbookPart and a fileFullPath that it assumes points to an Excel file.

I'm leaning toward coding up option #1, and then we can review what it looks like.

Thoughts?

Exception Parsing ProjectStream if it contains more than one entry under [Host Extender Info]

An InvalidOperationException occurs when parsing a ProjectStream that contains more than one entry under [Host Extender Info], or if an entry's libname does not start with VBE.

Example of a [Host Extender Info] section that reproduces both issues:

[Host Extender Info]
&H00000001={3832D640-CF90-11CF-8E43-00A0C911005A};VBE;&H00000000
&H00000002={00020818-0000-0000-C000-000000000046};Excel8.0;&H00000000
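
A parser that tolerates both cases could look roughly like the sketch below. It is not the library's code: the type names, member names, and regular expression are hypothetical, and it assumes each entry has the shape shown above (index, GUID, lib name, and creation flags) with no restriction on the lib name beyond not containing a semicolon.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text.RegularExpressions;

sealed class HostExtenderRefSketch
{
    public uint Index;
    public Guid ExtenderGuid;
    public string LibName;
    public uint CreationFlags;
}

static class HostExtenderInfoSketch
{
    // &H<8 hex digits>={GUID};<lib name>;&H<8 hex digits>
    static readonly Regex EntryPattern = new Regex(
        @"^&H(?<index>[0-9A-Fa-f]{8})=(?<guid>\{[0-9A-Fa-f\-]{36}\});(?<lib>[^;]*);&H(?<flags>[0-9A-Fa-f]{8})$");

    // Parses every non-empty line after the section header; accepts any number of entries.
    public static List<HostExtenderRefSketch> Parse(IEnumerable<string> lines)
    {
        var result = new List<HostExtenderRefSketch>();
        foreach (string raw in lines)
        {
            string line = raw.Trim();
            if (line.Length == 0)
                continue;

            Match m = EntryPattern.Match(line);
            if (!m.Success)
                throw new FormatException("Unexpected host extender entry: " + line);

            result.Add(new HostExtenderRefSketch
            {
                Index = uint.Parse(m.Groups["index"].Value, NumberStyles.HexNumber, CultureInfo.InvariantCulture),
                ExtenderGuid = Guid.Parse(m.Groups["guid"].Value),
                LibName = m.Groups["lib"].Value,
                CreationFlags = uint.Parse(m.Groups["flags"].Value, NumberStyles.HexNumber, CultureInfo.InvariantCulture)
            });
        }
        return result;
    }
}
```

Passing the two entry lines from the example yields two results, the second with the lib name Excel8.0.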
