pcodedmp's Introduction

pcodedmp.py - A VBA p-code disassembler

Introduction

It is not widely known, but macros written in VBA (Visual Basic for Applications; the macro programming language used in Microsoft Office) exist in three different executable forms, each of which can be what is actually executed at run time, depending on the circumstances. These forms are:

  • Source code. The original source code of the macro module is compressed and stored at the end of the module stream. This makes it relatively easy to locate and extract, and most free DFIR tools for macro analysis (such as oledump and olevba), and even many professional anti-virus tools, look only at this form. However, most of the time the source code is completely ignored by Office. In fact, it is possible to remove the source code (and therefore make all these tools think that there are no macros present), yet the macros will still execute without any problems. I have created a proof of concept illustrating this. Most tools will not see any macros in the documents in this archive, but if a document is opened with the corresponding Word version (the one that matches the document name), it will display a message and launch calc.exe. It is surprising that malware authors are not using this trick more widely.

  • P-code. As each VBA line is entered into the VBA editor, it is immediately compiled into p-code (a pseudo code for a stack machine) and stored in a different place in the module stream. The p-code is precisely what is executed most of the time. In fact, even when you open the source of a macro module in the VBA editor, what is displayed is not the decompressed source code but the p-code decompiled into source. The stored compressed source code is re-compiled into p-code (and that p-code is then executed) only when the document is opened under a version of Office that uses a different VBA version from the one used to create the document. This makes it possible to open a VBA-containing document on any version of Office that supports VBA and have the macros inside remain executable, despite the fact that the different versions of VBA use different (incompatible) p-code instructions.

  • Execodes. When the p-code has been executed at least once, a further tokenized form of it is stored elsewhere in the document (in streams whose names begin with __SRP_, followed by a number). From there it can be executed much faster. However, the format of the execodes is extremely complex and is specific to the particular Office version (not VBA version) in which they were created. This makes them extremely non-portable. In addition, their presence is not necessary: they can be removed and the macros will run just fine (from the p-code).

Since most of the time it is the p-code that determines what exactly a macro would do (even if neither source code, nor execodes are present), it would make sense to have a tool that can display it. This is what prompted us to create this VBA p-code disassembler.
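To make the layout described above more concrete, here is a minimal sketch of how a module stream divides into a compiled region and the compressed source. It assumes the offset of the compressed source is already known (in pcodedmp's default output it appears as the MOD_TEXTOFFSET record of the dir stream). The function name split_module_stream and the filler bytes are hypothetical; this is an illustration, not the script's own code.

```python
import struct


def split_module_stream(module_data, text_offset):
    """Split a module stream into (compiled_part, compressed_source).

    Everything before text_offset is the compiled region (headers plus
    p-code); the compressed source code runs from text_offset to the end.
    """
    if not 0 <= text_offset <= len(module_data):
        raise ValueError("text offset lies outside the module stream")
    return module_data[:text_offset], module_data[text_offset:]


# A MOD_TEXTOFFSET record holds a 32-bit little-endian integer.
text_offset = struct.unpack("<L", b"\x17\x03\x00\x00")[0]

# Hypothetical 855-byte stream: filler bytes stand in for real content.
fake_stream = b"P" * text_offset + b"S" * (855 - text_offset)
compiled, source = split_module_stream(fake_stream, text_offset)
print(text_offset, len(compiled), len(source))
```

The compressed part would still have to be decompressed before it is readable; that step is not shown here.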

Installation

The script works under both Python 2.6+ and Python 3.x. The simplest way to install it is from PyPI with pip:

pip install pcodedmp -U

The above command will install the latest version of pcodedmp (upgrading an older one if it already exists), while also installing all the necessary dependencies (currently only oletools and win_unicode_console but there might be additional ones in the future).

If you would rather install it from the GitHub repository, you can do it like this:

git clone https://github.com/bontchev/pcodedmp.git
cd pcodedmp
pip install .

Usage

The script takes as a command-line argument a list of one or more names of files or directories. If the name is an OLE2 document, it will be inspected for VBA code and the p-code of each code module will be disassembled. If the name is a directory, all the files in this directory and its subdirectories will be similarly processed. In addition to the disassembled p-code, by default the script also displays the parsed records of the dir stream, as well as the identifiers (variable and function names) used in the VBA modules and stored in the _VBA_PROJECT stream.

The script supports VBA5 (Office 97, MacOffice 98), VBA6 (Office 2000 to Office 2007) and VBA7 (Office 2010 and higher).

The script also accepts the following command-line options:

-h, --help Displays a short explanation of how to use the script and what the command-line options are.

-v, --version Displays the version of the script.

-n, --norecurse If a name specified on the command line is a directory, process only the files in this directory; do not process the files in its subdirectories.

-d, --disasmonly Only the p-code will be disassembled, without the parsed contents of the dir stream or the identifiers in the _VBA_PROJECT stream.

-b, --verbose The contents of the dir and _VBA_PROJECT streams are dumped in hex and ASCII form. In addition, the raw bytes of each VBA line compiled into p-code are also dumped in hex and ASCII.

-o OUTFILE, --output OUTFILE Saves the results to the specified output file instead of sending them to the standard output.
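The options above, together with the file-or-directory behavior, can be sketched with argparse and os.walk. This is a hypothetical reconstruction for illustration only: the names build_parser and collect_files, the help strings and the placeholder version string are assumptions, and the real script may be organized quite differently.

```python
import argparse
import os


def build_parser():
    """Declare the documented options (sketch, not the script's own parser)."""
    parser = argparse.ArgumentParser(description="p-code disassembler options (sketch)")
    parser.add_argument("-v", "--version", action="version", version="sketch 0.0")
    parser.add_argument("-n", "--norecurse", action="store_true",
                        help="do not descend into subdirectories")
    parser.add_argument("-d", "--disasmonly", action="store_true",
                        help="show only the disassembled p-code")
    parser.add_argument("-b", "--verbose", action="store_true",
                        help="also dump raw bytes in hex and ASCII")
    parser.add_argument("-o", "--output", dest="outfile", metavar="OUTFILE",
                        help="write the results to OUTFILE")
    parser.add_argument("names", nargs="+", help="files or directories")
    return parser


def collect_files(names, recurse=True):
    """Expand directories into file paths; plain file names pass through."""
    files = []
    for name in names:
        if os.path.isdir(name):
            for root, _dirs, filenames in os.walk(name):
                files.extend(os.path.join(root, f) for f in sorted(filenames))
                if not recurse:
                    break  # os.walk yields the top directory first
        else:
            files.append(name)
    return files
```

With this sketch, build_parser().parse_args(["-n", "samples"]) followed by collect_files(args.names, recurse=not args.norecurse) would list only the files directly inside the (hypothetical) samples directory.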

For instance, using the script on one of the documents in the proof of concept mentioned above produces the following results:

python pcodedmp.py -d Word2013.doc

Processing file: Word2013.doc
===============================================================================
Module streams:
Macros/VBA/ThisDocument - 1517 bytes
Line #0:
        FuncDefn (Private Sub Document_Open())
Line #1:
        LitStr 0x001D "This could have been a virus!"
        Ld vbOKOnly
        Ld vbInformation
        Add
        LitStr 0x0006 "Virus!"
        ArgsCall MsgBox 0x0003
Line #2:
        LitStr 0x0008 "calc.exe"
        Paren
        ArgsCall Shell 0x0001
Line #3:
        EndSub

For reference, it is the result of compiling the following VBA code:

Private Sub Document_Open()
    MsgBox "This could have been a virus!", vbOKOnly + vbInformation, "Virus!"
    Shell("calc.exe")
End Sub
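Because p-code targets a stack machine, the listing above can be replayed on a stack to recover the two statements in the body of the subroutine. The following is a minimal sketch that handles only the five mnemonics appearing in this listing; their semantics are inferred from the listing and the source code shown here, not taken from the VBA engine, and the name rebuild_calls is hypothetical.

```python
import re

LISTING = '''\
Line #1:
        LitStr 0x001D "This could have been a virus!"
        Ld vbOKOnly
        Ld vbInformation
        Add
        LitStr 0x0006 "Virus!"
        ArgsCall MsgBox 0x0003
Line #2:
        LitStr 0x0008 "calc.exe"
        Paren
        ArgsCall Shell 0x0001
'''

LITSTR = re.compile(r'LitStr\s+0x[0-9A-Fa-f]+\s+"(.*)"$')


def rebuild_calls(listing):
    """Replay a p-code listing on a stack and return the call statements."""
    stack, statements = [], []
    for raw in listing.splitlines():
        line = raw.strip()
        if not line or line.startswith("Line #"):
            continue
        match = LITSTR.match(line)
        if match:                      # push a string literal
            stack.append('"%s"' % match.group(1))
            continue
        fields = line.split()
        mnemonic = fields[0]
        if mnemonic == "Ld":           # push an identifier
            stack.append(fields[1])
        elif mnemonic == "Add":        # pop two operands, push their sum
            right, left = stack.pop(), stack.pop()
            stack.append("%s + %s" % (left, right))
        elif mnemonic == "Paren":      # wrap the top of the stack in parentheses
            stack.append("(%s)" % stack.pop())
        elif mnemonic == "ArgsCall":   # pop N arguments and emit a call
            count = int(fields[2], 16)
            if count > len(stack):
                raise ValueError("not enough operands on the stack")
            args = stack[len(stack) - count:]
            del stack[len(stack) - count:]
            statements.append(("%s %s" % (fields[1], ", ".join(args))).rstrip())
        else:
            raise ValueError("mnemonic not handled by this sketch: " + mnemonic)
    return statements


for statement in rebuild_calls(LISTING):
    print(statement)
```

The Paren instruction is what preserves the parentheses around "calc.exe" in the second statement.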

Known problems

  • Office 2016 64-bit only: When disassembling variables declared as being of custom type (e.g., Dim SomeVar As SomeType), the type (As SomeType) is not disassembled.

  • Office 2016 64-bit only: The Private property of Sub, Function and Property declarations is not disassembled.

  • Office 2016 64-bit only: The Declare part of external function declarations (e.g., Private Declare PtrSafe Function SomeFunc Lib "SomeLib" Alias "SomeName" () As Long) is not disassembled.

  • Office 2000 and higher: The type of a subroutine or function argument of type ParamArray is not disassembled correctly. For instance, Sub Foo (ParamArray arg()) will be disassembled as Sub Foo (arg).

  • All versions of Office: The Alias "SomeName" part of external function declarations (e.g., Private Declare PtrSafe Function SomeFunc Lib "SomeLib" Alias "SomeName" () As Long) is not disassembled.

  • All versions of Office: The Public property of custom type definitions (e.g., Public Type SomeType) is not disassembled.

  • All versions of Office: The custom type of a subroutine or function argument is not disassembled correctly and CustomType is used instead. For instance, Sub Foo (arg As Bar) will be disassembled as Sub Foo (arg As CustomType).

  • If the output of the program is sent to a file, instead of to the console (either by using the -o option or by redirecting stdout), any non-ASCII strings (like module names, texts used in the macros, etc.) might not be properly encoded.

I do not have access to 64-bit Office 2016, and the few sample documents generated by this version of Office that I have were insufficient for me to figure out where the corresponding information resides. I know where it resides in the other versions of Office, but it has been moved elsewhere in 64-bit Office 2016 and the old algorithms no longer work.
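Regarding the last item in the list above (non-ASCII strings in file output), one possible workaround is to open the output file with an explicit encoding rather than relying on the platform default. The sketch below is an assumption about what might help, not a description of what the script does; the helper name open_output and the sample module name are hypothetical, and whether this covers every case behind the known problem is untested.

```python
import io
import os
import tempfile


def open_output(path):
    """Open a report file for writing text with an explicit UTF-8 encoding."""
    return io.open(path, "w", encoding="utf-8")


# Hypothetical module name containing a non-ASCII character.
module_name = u"Mod\u00fcl1"

report_path = os.path.join(tempfile.mkdtemp(), "report.txt")
with open_output(report_path) as out:
    out.write(u"Module: %s\n" % module_name)

# Read the file back with the same encoding to confirm the round trip.
with io.open(report_path, encoding="utf-8") as check:
    round_trip = check.read()
print(round_trip == u"Module: Mod\u00fcl1\n")
```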

To do

  • Implement support of VBA3 (Excel95).

  • While the script should support documents created by MacOffice, this has not been tested (and you know how well untested code usually works). This should be tested and any bugs related to it should be fixed.

  • I am not an experienced Python programmer and the code is ugly. Somebody more familiar with Python than me should probably rewrite the script and make it look better.

Change log

Version 1.2.6:

  • Changed it not to require the win_unicode_console module when it is not available - e.g., when not running on a Windows machine or when running under the PyPy implementation of Python, thanks to Philippe Lagadec.

Version 1.2.5:

  • Added a sanity check to avoid errors when parsing object declarations.
  • The functions that produce output now have the output file (default is stdout) as a parameter, for better integration with other tools, thanks to Philippe Lagadec.

Version 1.2.4:

  • Implemented support for module names with non-ASCII characters in their names. Thanks to Philippe Lagadec for helping me with that.
  • Fixed a parsing error when disassembling object declarations.
  • Removed some unused variables.
  • Improved the documentation a bit.

Version 1.2.3:

  • Fixed a few crashes and better documented some disassembly failures.
  • Converted the script into a package that can be installed with pip. Use the command pip install pcodedmp.

Version 1.2.2:

  • Implemented handling of documents saved in Open XML format (which is the default format of Office 2007 and higher) - .docm, .xlsm, .pptm.

Version 1.2.1:

  • Now runs under Python 3.x too.
  • Improved support of 64-bit Office documents.
  • Implemented support of some VBA7-specific features (Friend, PtrSafe, LongPtr).
  • Improved the disassembling of Dim declarations.

Version 1.2.0:

  • Disassembling the various declarations (New, Type, Dim, ReDim, Sub, Function, Property).

Version 1.1.0:

  • Storing the opcodes in a more efficient manner.
  • Implemented VBA7 support.
  • Implemented support for documents created by the 64-bit version of Office.

Version 1.0.0:

  • Initial version.

pcodedmp's People

Contributors

bontchev, decalage2

pcodedmp's Issues

license question

Hello bontchev

Trying to use your library as part of my work.
Have to say, Nice Work and thanks for making it open source!

My question is whether you would agree to change the license
to MIT or LGPL.
That way I would be able to use it.

Website installation instructions are incomplete

The Installation section of the current web site mentions installing the prerequisite oletools but does not mention actually installing pcodedmp. This could lead to inexperienced (or even experienced but time-challenged) users failing to install it.

Please consider adding pcodedmp to the two pip lines given or having a separate line of: pip install -U pcodedmp

Wrong object names

The code is using the wrong object names. And more specifically, it's choosing the name of the previous object as the current one, so all the names of the objects / functions in the macro are wrong.
Maybe the (empty macro) is making the algorithm confused.

-------------------------------------------------------------------------------
VBA MACRO Baouaxfdk.cls 
in file: ./1/82606821.doc - OLE stream: 'Macros/VBA/Baouaxfdk'
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
(empty macro)
-------------------------------------------------------------------------------
VBA MACRO Geweminsodx.bas 
in file: ./1/82606821.doc - OLE stream: 'Macros/VBA/Geweminsodx'
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Function Zmflvlba()
On Error Resume Next
   Dim bYDMLFjw, OXvnd
Dim bHGnFxjFH()
ReDim bHGnFxjFH(1)
bHGnFxjFH(0) = "Sergio"

Dim JzLZO()
ReDim JzLZO(2)
JzLZO(0) = "Credit Card Account"
JzLZO(1) = "Adamczak - Chmielewski"

Dim PGcpCJVwA()
Line #0:
	FuncDefn (Function Geweminsodx())
Line #1:
	OnError (Resume Next) 
Line #2:
	Dim 
	VarDefn Zmflvlba
	VarDefn bYDMLFjw
Line #3:
	Dim 
	VarDefn OXvnd
Line #4:
	OptionBase 
	LitDI2 0x0001 
	Redim OXvnd 0x0001 (As Variant)
Line #5:
	LitStr 0x0006 "Sergio"
	LitDI2 0x0000 
	ArgsSt OXvnd 0x0001 
Line #6:
Line #7:
	Dim 
	VarDefn bHGnFxjFH
Line #8:
	OptionBase 
	LitDI2 0x0002 
	Redim bHGnFxjFH 0x0001 (As Variant)
Line #9:
	LitStr 0x0013 "Credit Card Account"
	LitDI2 0x0000 
	ArgsSt bHGnFxjFH 0x0001 
Line #10:
	LitStr 0x0016 "Adamczak - Chmielewski"
	LitDI2 0x0001 
	ArgsSt bHGnFxjFH 0x0001 
Line #11:

Error: unpack_from requires a buffer of at least 4 bytes

The attached sample triggers an error "unpack_from requires a buffer of at least 4 bytes" when parsing it with pcodedmp 1.2.6.

ab161b4bb9af46dd5f283288c0a5ca796fbb4cacf963db2bf1e3619aab2a3b12.zip
password: infected

Full output:

pcodedmp ab161b4bb9af46dd5f283288c0a5ca796fbb4cacf963db2bf1e3619aab2a3b12
Processing file: ab161b4bb9af46dd5f283288c0a5ca796fbb4cacf963db2bf1e3619aab2a3b12
===============================================================================
dir stream: _VBA_PROJECT_CUR/VBA/dir
-------------------------------------------------------------------------------
dir stream after decompression:
1080 bytes
dir stream parsed:
00000000:  PROJ_SYSKIND:
00000000   03 00 00 00                                        ....

0000000A:  PROJ_LCID:
00000000   09 04 00 00                                        ....

00000014:  PROJ_LCIDINVOKE:
00000000   09 04 00 00                                        ....

0000001E:  PROJ_CODEPAGE:
00000000   E4 04                                              ..

00000026:  PROJ_NAME:
00000000   56 42 41 50 72 6F 6A 65 63 74                      VBAProject

00000036:  PROJ_DOCSTRING
0000003C:  PROJ_UNICODE_DOCSTRING
00000042:  PROJ_HELPFILE
00000048:  PROJ_UNICODE_HELPFILE
0000004E:  PROJ_HELPCONTEXT:
00000000   00 00 00 00                                        ....

00000058:  PROJ_LIBFLAGS:
00000000   00 00 00 00                                        ....

00000062:  PROJ_VERSION:
00000000   98 EC 15 60 01 00                                  ...`..

0000006E:  PROJ_CONSTANTS
00000074:  PROJ_UNICODE_CONSTANTS
0000007A:  PROJ_REFNAME_PROJ:
00000000   73 74 64 6F 6C 65                                  stdole

00000086:  PROJ_UNICODE_REFNAME_PROJ:
00000000   73 00 74 00 64 00 6F 00 6C 00 65 00                s.t.d.o.l.e.

00000098:  PROJ_LIBID_REGISTERED:
00000000   5E 00 00 00 2A 5C 47 7B 30 30 30 32 30 34 33 30    ^...*\G{00020430
00000010   2D 30 30 30 30 2D 30 30 30 30 2D 43 30 30 30 2D    -0000-0000-C000-
00000020   30 30 30 30 30 30 30 30 30 30 34 36 7D 23 32 2E    000000000046}#2.
00000030   30 23 30 23 43 3A 5C 57 69 6E 64 6F 77 73 5C 53    0#0#C:\Windows\S
00000040   79 73 74 65 6D 33 32 5C 73 74 64 6F 6C 65 32 2E    ystem32\stdole2.
00000050   74 6C 62 23 4F 4C 45 20 41 75 74 6F 6D 61 74 69    tlb#OLE Automati
00000060   6F 6E 00 00 00 00 00 00                            on......

00000106:  PROJ_REFNAME_PROJ:
00000000   4F 66 66 69 63 65                                  Office

00000112:  PROJ_UNICODE_REFNAME_PROJ:
00000000   4F 00 66 00 66 00 69 00 63 00 65 00                O.f.f.i.c.e.

00000124:  PROJ_LIBID_REGISTERED:
00000000   94 00 00 00 2A 5C 47 7B 32 44 46 38 44 30 34 43    ....*\G{2DF8D04C
00000010   2D 35 42 46 41 2D 31 30 31 42 2D 42 44 45 35 2D    -5BFA-101B-BDE5-
00000020   30 30 41 41 30 30 34 34 44 45 35 32 7D 23 32 2E    00AA0044DE52}#2.
00000030   30 23 30 23 43 3A 5C 50 72 6F 67 72 61 6D 20 46    0#0#C:\Program F
00000040   69 6C 65 73 5C 43 6F 6D 6D 6F 6E 20 46 69 6C 65    iles\Common File
00000050   73 5C 4D 69 63 72 6F 73 6F 66 74 20 53 68 61 72    s\Microsoft Shar
00000060   65 64 5C 4F 46 46 49 43 45 31 36 5C 4D 53 4F 2E    ed\OFFICE16\MSO.
00000070   44 4C 4C 23 4D 69 63 72 6F 73 6F 66 74 20 4F 66    DLL#Microsoft Of
00000080   66 69 63 65 20 31 36 2E 30 20 4F 62 6A 65 63 74    fice 16.0 Object
00000090   20 4C 69 62 72 61 72 79 00 00 00 00 00 00           Library......

000001C8:  PROJ_MODULECOUNT:
00000000   05 00                                              ..

000001D0:  PROJ_COOKIE:
00000000   40 61                                              @a

000001D8:  MOD_NAME:
00000000   4D 6F 64 75 6C 65 31                               Module1

000001E5:  MOD_UNICODE_NAME:
00000000   4D 00 6F 00 64 00 75 00 6C 00 65 00 31 00          M.o.d.u.l.e.1.

000001F9:  MOD_STREAM:
00000000   4D 6F 64 75 6C 65 31                               Module1

00000206:  MOD_UNICODESTREAM:
00000000   4D 00 6F 00 64 00 75 00 6C 00 65 00 31 00          M.o.d.u.l.e.1.

0000021A:  MOD_DOCSTRING
00000220:  MOD_UNICODE_DOCSTRING
00000226:  MOD_TEXTOFFSET:
00000000   17 03 00 00                                        ....

00000230:  MOD_HELPCONTEXT:
00000000   00 00 00 00                                        ....

0000023A:  MOD_COOKIETYPE:
00000000   C4 62                                              .b

00000242:  MOD_FBASMOD_StdMods
00000248:  MOD_END
0000024E:  MOD_NAME:
00000000   54 68 69 73 57 6F 72 6B 62 6F 6F 6B                ThisWorkbook

00000260:  MOD_UNICODE_NAME:
00000000   54 00 68 00 69 00 73 00 57 00 6F 00 72 00 6B 00    T.h.i.s.W.o.r.k.
00000010   62 00 6F 00 6F 00 6B 00                            b.o.o.k.

0000027E:  MOD_STREAM:
00000000   54 68 69 73 57 6F 72 6B 62 6F 6F 6B                ThisWorkbook

00000290:  MOD_UNICODESTREAM:
00000000   54 00 68 00 69 00 73 00 57 00 6F 00 72 00 6B 00    T.h.i.s.W.o.r.k.
00000010   62 00 6F 00 6F 00 6B 00                            b.o.o.k.

000002AE:  MOD_DOCSTRING
000002B4:  MOD_UNICODE_DOCSTRING
000002BA:  MOD_TEXTOFFSET:
00000000   78 10 00 00                                        x...

000002C4:  MOD_HELPCONTEXT:
00000000   00 00 00 00                                        ....

000002CE:  MOD_COOKIETYPE:
00000000   33 80                                              3.

000002D6:  MOD_FBASMOD_Classes
000002DC:  MOD_END
000002E2:  MOD_NAME:
00000000   53 68 65 65 74 31                                  Sheet1

000002EE:  MOD_UNICODE_NAME:
00000000   53 00 68 00 65 00 65 00 74 00 31 00                S.h.e.e.t.1.

00000300:  MOD_STREAM:
00000000   53 68 65 65 74 31                                  Sheet1

0000030C:  MOD_UNICODESTREAM:
00000000   53 00 68 00 65 00 65 00 74 00 31 00                S.h.e.e.t.1.

0000031E:  MOD_DOCSTRING
00000324:  MOD_UNICODE_DOCSTRING
0000032A:  MOD_TEXTOFFSET:
00000000   33 03 00 00                                        3...

00000334:  MOD_HELPCONTEXT:
00000000   00 00 00 00                                        ....

0000033E:  MOD_COOKIETYPE:
00000000   A9 ED                                              ..

00000346:  MOD_FBASMOD_Classes
0000034C:  MOD_END
00000352:  MOD_NAME:
00000000   53 68 65 65 74 32                                  Sheet2

0000035E:  MOD_UNICODE_NAME:
00000000   53 00 68 00 65 00 65 00 74 00 32 00                S.h.e.e.t.2.

00000370:  MOD_STREAM:
00000000   53 68 65 65 74 32                                  Sheet2

0000037C:  MOD_UNICODESTREAM:
00000000   53 00 68 00 65 00 65 00 74 00 32 00                S.h.e.e.t.2.

0000038E:  MOD_DOCSTRING
00000394:  MOD_UNICODE_DOCSTRING
0000039A:  MOD_TEXTOFFSET:
00000000   33 03 00 00                                        3...

000003A4:  MOD_HELPCONTEXT:
00000000   00 00 00 00                                        ....

000003AE:  MOD_COOKIETYPE:
00000000   C4 CE                                              ..

000003B6:  MOD_FBASMOD_Classes
000003BC:  MOD_END
000003C2:  MOD_NAME:
00000000   53 68 65 65 74 33                                  Sheet3

000003CE:  MOD_UNICODE_NAME:
00000000   53 00 68 00 65 00 65 00 74 00 33 00                S.h.e.e.t.3.

000003E0:  MOD_STREAM:
00000000   53 68 65 65 74 33                                  Sheet3

000003EC:  MOD_UNICODESTREAM:
00000000   53 00 68 00 65 00 65 00 74 00 33 00                S.h.e.e.t.3.

000003FE:  MOD_DOCSTRING
00000404:  MOD_UNICODE_DOCSTRING
0000040A:  MOD_TEXTOFFSET:
00000000   33 03 00 00                                        3...

00000414:  MOD_HELPCONTEXT:
00000000   00 00 00 00                                        ....

0000041E:  MOD_COOKIETYPE:
00000000   43 05                                              C.

00000426:  MOD_FBASMOD_Classes
0000042C:  MOD_END
00000432:  PROJ_EOF
-------------------------------------------------------------------------------
_VBA_PROJECT stream:
5156 bytes
Identifiers:

0000: Excel
0001: VBA
0002: Win16
0003: Win32
0004: Win64
0005: Mac
0006: VBA6
0007: VBA7
0008: VBAProject
0009: stdole
000A: Office
000B: Module1
000C: _Evaluate
000D: book
000E: ThisWorkbook
000F: Sheet1
0010: Sheet2
0011: Sheet3
0012: Workbook
0013: Workbook_Open
0014: LSOHXYJXZHMWDWPDOOCSCUWJWCYIHSDPLJPXFQKOBXMNENQUPFJKJLWBETZZHJKGTBWPGHRGIPNFLXXLQWKUKVFRFKHQITLQTRXGNRSWIGUVMPYDXNLLKNFJMUIIPKMIUJYXOISOJQVMMGGTXESCRE
0015: Chr
0016: NSNMJRQCTYCZGOOZZFQHVEUXHLGVTTSUNKNDQJRSUQDRGYWQBVRYEUUOHVGMTKTMOBVTQZYKUHKIOPWIINSPEMVGPMHWCCBDVSVLRRYBDYLSOHXYJXZHMWDWPDOOCSCUWJWCYIHSDPLJPXFQKOBXMNENQUPFJKJLWBETZ
0017: ZHJKGTBWPGHRGIPNFLXXLQWKUKVFRFKHQITLQTRXGNRSWIGUVMPYDXNLLKNFJMUIIPKMIUJYXOISOJQVMMGGTXESCRENSNMJRQCTYCZGOOZZFQHVEUXHLGVTTSUNKNDQJRSUQDRGYWQBVRYEUUOHVGMTKTMOBVT
0018: CreateObject
0019: SpecialFolders
001A: ITLQTRXGNRSWIGUVMPYDXNLLKNFJMUIIPKMIUJYXOISOJQVMMGGTXESCRENSNMJRQCTYCZGOOZZFQHVEUXHLGVTTSUNKNDQJRSUQDRGYWQBVRYEUUOHVGMTKTMOBVTQZYKUHKIOPWIINSPEMVGPMHWCCBDVSVLRRYBDYLSOHXYJXZHMWDW
001B: JXZHMWDWPDOOCSCUWJWCYIHSDPLJPXFQKOBXMNENQUPFJKJLWBETZZHJKGTBWPGHRGIPNFLXXLQWKUKVFRFKHQITLQTRXGNRSWIGUVMPYDXNLLKNFJMUIIPKMIUJYXOISOJQVMMGGTXESCRENSNMJRQCTYCZGOOZZFQHVEUXHLGVTTSUNKNDQJR
001C: XHLGVTTSUNKNDQJRSUQDRGYWQBVRYEUUOHVGMTKTMOBVTQZYKUHKIOPWIINSPEMVGPMHWCCBDVSVLRRYBDYLSOHXYJXZHMWDWPDOOCSCUWJWCYIHSDPLJPXFQKOBXMNENQUPFJKJLWBETZZHJKGTBWPGHRGIPNFLXXLQWKUKVFRFK
001D: DOOCSCUWJWCYIHSDPLJPXFQKOBXMNENQUPFJKJLWBETZZHJKGTBWPGHRGIPNFLXXLQWKUKVFRFKHQITLQTRXGNRSWIGUVMPYDXNLLKNFJMUIIPKMIUJYXOISOJQVMMGGTXESCRENSNMJRQCTYCZGOOZZFQHVEUXHLGVTTSUNKNDQJR
001E: CSCUWJWCYIHSDPLJPXFQKOBXMNENQUPFJKJLWBETZZHJKGTBWPGHRGIPNFLXXLQWKUKVFRFKHQITLQTRXGNRSWIGUVMPYDXNLLKNFJMUIIPKMIUJYXOISOJQVMMGGTXESCRENSNMJRQCTYCZGOOZZFQHVEUXHLGVTTSUNKNDQJRSUQDRGYWQBVRYEUUOHVG
001F: HQITLQTRXGNRSWIGUVMPYDXNLLKNFJMUIIPKMIUJYXOISOJQVMMGGTXESCRENSNMJRQCTYCZGOOZZFQHVEUXHLGVTTSUNKNDQJRSUQDRGYWQBVRYEUUOHVGMTKTMOBVTQZYKUHKIOPWIINSPEMVGPMHWCCBDVSVLRRYBDY
0020: GOOZZFQHVEUXHLGVTTSUNKNDQJRSUQDRGYWQBVRYEUUOHVGMTKTMOBVTQZYKUHKIOPWIINSPEMVGPMHWCCBDVSVLRRYBDYLSOHXYJXZHMWDWPDOOCSCUWJWCYIHSDPLJPXFQKOBXMNENQUPFJKJLWBETZZHJKGTBWPGHR
0021: PDOOCSCUWJWCYIHSDPLJPXFQKOBXMNENQUPFJKJLWBETZZHJKGTBWPGHRGIPNFLXXLQWKUKVFRFKHQITLQTRXGNRSWIGUVMPYDXNLLKNFJMUIIPKMIUJYXOISOJQVMMGGTXESCRENSNMJRQCTYCZGOOZZFQHVEU
0022: CleanEncryptSTR
0023: send
0024: responseBody
0025: Status
0026: SaveToFile
0027: MsgBox
0028: MyString
0029: BDFHBSDFGDRFGVASDVASDRGEARGERGERG
002A: BGDFBDFDFGBVADRFGVDFGVDFBHEATRFHNGSRFHBTFBHEARBHEARBVEDRGEARHEARHEARGERFBERGERG
002B: i
002C: ASCToAdd
002D: ThisChar
002E: ThisASC
002F: NewASC
0030: BNGFSNFJNTRHEATRHETRGHTRGHERGHERHEATRHEATHERHETRHETSRHETRHH
0031: AllowedChars
0032: Asc

_VBA_PROJECT parsing done.
-------------------------------------------------------------------------------
Module streams:
_VBA_PROJECT_CUR/VBA/Module1 - 855 bytes
Line #0:
        FuncDefn (Sub book())
Line #1:
        QuoteRem 0x0000 0x0000 ""
Line #2:
        EndSub
_VBA_PROJECT_CUR/VBA/ThisWorkbook - 4396 bytes
Error: unpack_from requires a buffer of at least 4 bytes.
_VBA_PROJECT_CUR/VBA/Sheet1 - 991 bytes
_VBA_PROJECT_CUR/VBA/Sheet2 - 991 bytes
_VBA_PROJECT_CUR/VBA/Sheet3 - 991 bytes
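For context, the error text above is the message that Python's struct.unpack_from raises when fewer bytes remain in the buffer than the format needs. Where in pcodedmp the failing read happens for this sample is not determined here; the sketch below only shows the general pattern of checking bounds before unpacking. The helper name read_u32 is hypothetical.

```python
import struct


def read_u32(buffer, offset):
    """Return the little-endian 32-bit value at offset, or None if truncated."""
    if offset < 0 or offset + 4 > len(buffer):
        return None
    return struct.unpack_from("<L", buffer, offset)[0]


print(read_u32(b"\x17\x03\x00\x00", 0))  # four bytes available
print(read_u32(b"\x17\x03", 0))          # only two bytes: reported as None

# Without the bounds check, the same read raises struct.error.
try:
    struct.unpack_from("<L", b"\x17\x03", 0)
    message = ""
except struct.error as exc:
    message = str(exc)
print(message)
```

Returning None lets a caller report a truncated module and continue with the next one instead of aborting; whether that is the right policy for a disassembler is a design choice.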

pcodedmp can't be installed on either Ubuntu or Windows 10

On Windows 10, using Python 2.7 to run setup.py, I encounter this error:
warning: pypandoc module not found, could not convert Markdown to RST
Traceback (most recent call last):
File "f:\pcode\setup.py", line 100, in
long_description=read_md('README.md'),
File "f:\pcode\setup.py", line 21, in read_md
def read_md(f): return open(f, 'r').read()
IOError: [Errno 2] No such file or directory: 'README.md'

On Linux I get the same error when I use pip install . (after unzipping your files). Please see the attachment:
install_error.docx

Remove cycle between oletools and pcodedmp

Affected tool:
bazel

Describe the bug
There's a cycle between oletools and pcodedmp, as the maintainers probably already know, and this causes an issue when bazel tries to pull these packages using pip_parse. The difference between bazel and pip is that pip does not require the dependency graph to be acyclic, whereas bazel can only build a DAG.

File/Malware sample to reproduce the bug

ERROR: /private/var/tmp/_bazel_youngmokcho/994b3e899f1f2de61f63ee481ccf26ec/external/python39_deps_oletools/BUILD.bazel:22:11: in py_library rule @python39_deps_oletools//:pkg: cycle in dependency graph:
   ...
   ...
    @python39_deps_extract_msg//:pkg (a67e7319e1c7c12c19874dc7398a81096687d91bef4f7e6484f8c2d3ac4fea7f)
    @python39_deps_rtfde//:pkg (a67e7319e1c7c12c19874dc7398a81096687d91bef4f7e6484f8c2d3ac4fea7f)
.-> @python39_deps_oletools//:pkg (a67e7319e1c7c12c19874dc7398a81096687d91bef4f7e6484f8c2d3ac4fea7f)
|   @python39_deps_pcodedmp//:pkg (a67e7319e1c7c12c19874dc7398a81096687d91bef4f7e6484f8c2d3ac4fea7f)
`-- @python39_deps_oletools//:pkg (a67e7319e1c7c12c19874dc7398a81096687d91bef4f7e6484f8c2d3ac4fea7f)
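The kind of check that rejects such a graph can be sketched as a depth-first search that reports the first cycle it meets. This is a generic illustration, not Bazel's implementation; the function name find_cycle is hypothetical, and the edges in the sample graph are an assumption read off the order of the packages in the trace above.

```python
def find_cycle(graph):
    """Return one dependency cycle as a list of nodes, or None if acyclic."""
    WHITE, GREY, BLACK = 0, 1, 2      # unvisited, on the current path, finished
    state = dict.fromkeys(graph, WHITE)
    path = []

    def visit(node):
        state[node] = GREY
        path.append(node)
        for dep in graph.get(node, ()):
            if state.get(dep, WHITE) == GREY:
                # dep is already on the current path: a cycle is closed
                return path[path.index(dep):] + [dep]
            if state.get(dep, WHITE) == WHITE:
                cycle = visit(dep)
                if cycle:
                    return cycle
        path.pop()
        state[node] = BLACK
        return None

    for node in sorted(graph):
        if state[node] == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


# Assumed edges, following the order of the packages in the trace.
deps = {
    "extract_msg": ["rtfde"],
    "rtfde": ["oletools"],
    "oletools": ["pcodedmp"],
    "pcodedmp": ["oletools"],
}
print(find_cycle(deps))
```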

How To Reproduce the bug
You can create a bazel workspace that pulls oletools==0.60.1 using pip_parse rule from rules_python.

Expected behavior
The expected behaviour is that there's no cycle in transitive dependencies of oletools including itself.

Console output / Screenshots
n/a

Version information:

  • OS: Mac x86_64 (using Rosetta2)
  • OS version: 64 bits
  • Python version: 3.9.15
  • oletools version: 0.60.1

Additional context
n/a

Cross comparison with Open Office

@bontchev I saw this last night and was amazed by the work you have done already. I noticed you still had some "known issues". I wondered are these issues also known to be issues with Open Office's VBA implementation?

Perhaps Open Office have a better version and their source code:
https://github.com/apache/openoffice/tree/273865e5126901b006a2c544dc73456b0510afee/main/oox/source/ole

could help determine the source of these issues and/or could be ported to Python...

Just a thought :)

Redundant parens in README sample code

Btw, this line in VBA

        Shell ("calc.exe")

is similar to this line in C/C++

        printf(("Hello %s"), ("World"));

with redundant parens around the string literals.

Unfortunately it's already compiled in a (redundant) Paren p-code here

        LitStr 0x0008 "calc.exe"
        Paren
        ArgsCall Shell 0x0001

Also notice that VBA IDE always puts a space between Shell and the argument so the README is faking it by manually removing the space in this line

        Shell("calc.exe")

which is obviously not a verbatim copy/paste from the VBA IDE.

Code Review

From the README file, TODO section:

I am not an experienced Python programmer and the code is ugly. Somebody more familiar with Python than me should probably rewrite the script and make it look better.

I'm no Python expert either (not even rookie), but consider posting it (in digestible, related chunks, I guess) on Code Review Stack Exchange.

Taking working code and making more efficient, more concise, more readable, more maintainable, ...well overall better - is precisely what this community does.

Awesome project BTW!

Mat
Moderator on Code Review Stack Exchange and admin of the Rubberduck project

Python 2.7 support broken

It seems like the Python 2.7 support is broken, happening on RHEL/CentOS 7:

/usr/bin/python2 setup.py build '--executable=/usr/bin/python2 -s'
/usr/lib64/python2.7/distutils/dist.py:267: UserWarning: Unknown distribution option: 'long_description_content_type'
  warnings.warn(msg)
error in pcodedmp setup command: 'install_requires' must be a string or list of strings containing valid project/version requirement specifiers

In case this is intended, please remove mentioning of Python 2.6+ support in README.md.

Error: unpack_from requires a buffer of at least 2 bytes.

Running pcodedmp on VirusTotal sample db52f43dde8a8fff678640539011bff2882ab11d94537d84c6855c5ff1897f71
gives the following error. I can email you the sample if you don't have access to it.

Error: unpack_from requires a buffer of at least 2 bytes.
VarDefn VBA/Sheet2 - 1158 bytes

Skip win_unicode_console import for linux

I am a heavy user of oletools which now requires this module. I was wondering if you would object to wrapping all the win_unicode_console imports and requirements into a test whether the current platform actually is windows. I can create a pull request for this if you approve.

Specifically, in setup.py I would write

import platform

INSTALL_REQUIRES = ['oletools>=0.54']
# Require win_unicode_console only when installing on Windows.
if platform.system().lower() == "windows":
    INSTALL_REQUIRES.append('win_unicode_console')

and similarly in pcodedmp.py I would write:

import platform

if platform.system().lower() == "windows":
    import win_unicode_console

This would make the life of many oletools-linux-users easier
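The change log entry for version 1.2.6 describes a related approach: not requiring the module when it is not available. A minimal sketch of that idea follows. It is not the code that was actually merged; the function name enable_unicode_console is hypothetical, and the extra PyPy check is an assumption based on the change log wording.

```python
import platform

# Treat the module as optional: absence is not an error.
try:
    import win_unicode_console
except ImportError:
    win_unicode_console = None


def enable_unicode_console():
    """Enable win_unicode_console when it is usable; report whether it was."""
    usable = (
        win_unicode_console is not None
        and platform.system() == "Windows"
        and platform.python_implementation() != "PyPy"
    )
    if usable:
        win_unicode_console.enable()
    return usable


print(enable_unicode_console())
```

Compared with testing the platform before the import, the try/except form also covers a Windows machine on which the module simply is not installed.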

pcodedmp can't disassemble VBA code

I tested pcodedmp against a Word document containing VBA malware and found that it does not disassemble the code; I get a result similar to that of olevba.py. Why does pcodedmp fail? I attach the dump. This malware uses VBA to generate a PowerShell script when the Word document is opened. How can I capture the PowerShell cmdlets, or some other meaningful result?
malware_export.txt
