pstconv

A Java command-line tool that converts proprietary Microsoft Outlook OST/PST files to EML or MBOX format, even if the file is password protected. OST/PST content is parsed and extracted with the java-libpst library.

Requirements

  • Java Runtime Environment 8

Usage

usage: java -jar pstconv.jar [OPTIONS]
 -e,--encoding <ENCODING>   Encoding to use for reading character data.
                            Default is ISO-8859-1.
 -f,--format <FORMAT>       Convert input file to one of the following
                            formats: mbox, eml. Default is mbox.
 -h,--help                  Print help and exit.
 -i,--input <FILE>          Path to OST/PST input file. Required option.
 -o,--output <DIRECTORY>    Path to MBOX/EML output directory. If it
                            doesn't exist, the application will attempt to
                            create it. Required option.
 -v,--version               Print version and exit.

For example, the following command will convert File01.pst to MBOX format, saving the results to a directory named 'mailbox':

$ java -jar pstconv.jar -i File01.pst -o mailbox
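To run conversions from a Java program instead of a shell, the same command line can be assembled programmatically. The following is only a minimal sketch: the class name PstconvCommand is hypothetical and not part of pstconv, the jar is assumed to be in the working directory, and only the options listed in the usage text are used.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of pstconv): builds and runs one conversion. */
final class PstconvCommand {

    /**
     * Builds the argument list for a single pstconv run, using only the
     * options from the usage text (-i, -o, -f, -e). Passing null for
     * format or encoding leaves the tool's own default in place.
     */
    static List<String> build(String jarPath, String input, String outputDir,
                              String format, String encoding) {
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        cmd.add("-jar");
        cmd.add(jarPath);
        cmd.add("-i");
        cmd.add(input);
        cmd.add("-o");
        cmd.add(outputDir);
        if (format != null) {
            cmd.add("-f");
            cmd.add(format);
        }
        if (encoding != null) {
            cmd.add("-e");
            cmd.add(encoding);
        }
        return cmd;
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.err.println("usage: PstconvCommand <input.pst> <output-dir> [format]");
            return;
        }
        String format = args.length > 2 ? args[2] : null;
        // Assumption: pstconv.jar sits in the current working directory.
        List<String> cmd = build("pstconv.jar", args[0], args[1], format, null);
        System.out.println(String.join(" ", cmd));
        // inheritIO() forwards the tool's log output to this console.
        int exitCode = new ProcessBuilder(cmd).inheritIO().start().waitFor();
        System.out.println("pstconv exited with code " + exitCode);
    }
}
```

Calling build("pstconv.jar", "File01.pst", "mailbox", null, null) yields the same arguments as the command shown above; passing "eml" as the format selects EML output instead.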

After the conversion finishes, you can use free software such as Mozilla Thunderbird, combined with the ImportExportTools NG add-on, to import the 'mailbox' directory into the e-mail client and view the converted messages. According to the ImportExportTools NG documentation, the import should work with either MBOX or EML format, but we have only tested it with MBOX:

  1. Right-click the 'Local Folders' entry to open the context menu and select the 'Import mbox file' option, as shown in the following image.

ImportExportTools context menu

  2. In the 'MBox Import' dialog, select the option 'Import one or more mbox files, with its/their subdirectory'. Click the 'OK' button; an open-file dialog then lets you select the mbox file. To import all messages from the converted 'mailbox' directory, select the file 'Top of Personal Folders' located at the root of that directory. Click the 'Open' button to start importing the 'mailbox' directory (this may take a while, depending on the number and size of the messages).

ImportExportTools import dialog

  3. After the import finishes, a new folder named 'Top of Personal Folders' is added under the 'Local Folders' entry, containing all messages from the 'mailbox' directory.

Messages added to Local Folders entry

The tool also adds a custom header named 'X-Outlook-Descriptor-Id' to each converted message. It contains the descriptor id of the original PST message, so the two messages can be compared if needed.
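To look up that header in a converted EML file, it is enough to scan the header block, which ends at the first empty line. The following is a minimal sketch using only the Java standard library; the class name DescriptorIdReader is hypothetical, and the sketch assumes the header is written on a single, unfolded line.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

/** Hypothetical helper (not part of pstconv): reads the descriptor id header. */
final class DescriptorIdReader {

    private static final String PREFIX = "X-Outlook-Descriptor-Id:";

    /**
     * Scans the header block (everything before the first empty line) and
     * returns the value of X-Outlook-Descriptor-Id, or null if it is absent.
     * Header names are compared case-insensitively.
     */
    static String read(BufferedReader reader) throws IOException {
        String line;
        while ((line = reader.readLine()) != null && !line.isEmpty()) {
            if (line.regionMatches(true, 0, PREFIX, 0, PREFIX.length())) {
                return line.substring(PREFIX.length()).trim();
            }
        }
        return null; // reached the message body or end of file
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: DescriptorIdReader <message.eml>");
            return;
        }
        // ISO-8859-1 maps every byte to a character, so header lines
        // are read without decoding errors whatever the body encoding is.
        try (BufferedReader reader = Files.newBufferedReader(
                Paths.get(args[0]), StandardCharsets.ISO_8859_1)) {
            System.out.println(read(reader));
        }
    }
}
```

The sketch handles one message per file, which matches EML output. An MBOX file holds many messages, so it would have to be split into messages first.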

How it works

The following flowchart shows the sequence of steps the pstconv tool takes to convert the input PST file.

pstconv flowchart

Performance

We randomly selected 12 PST files from real forensic cases, with sizes ranging from about 180 MB to more than 3 GB. We ran the pstconv tool 20 times on each file: 10 times converting to MBOX and 10 times converting to EML. The average results are shown in the following tables.

MBOX

File Name   Size (MB)  Msg Count  Time (sec)  Msgs/sec
File01.pst        181        743         6.9     107.3
File02.pst        299       1126        11.6      96.8
File03.pst        554        978        19.3      50.7
File04.pst        632       1198        21.2      56.4
File05.pst        770       1388        21.0      66.1
File06.pst       1033       3045        34.8      87.6
File07.pst       1162       4393        36.1     121.6
File08.pst       1365       3122        41.7      74.8
File09.pst       1849       3432        59.7      57.5
File10.pst       1979      10460        68.1     153.6
File11.pst       2771       2745        89.6      30.7
File12.pst       3477       3451       115.0      30.0
Total           16072      36081       525.0      77.8 (avg)

EML

File Name   Size (MB)  Msg Count  Time (sec)  Msgs/sec
File01.pst        181        743        42.5      17.5
File02.pst        299       1126        67.9      16.6
File03.pst        554        978       134.7       7.3
File04.pst        632       1198       139.6       8.6
File05.pst        770       1388       151.5       9.2
File06.pst       1033       3045       224.8      13.6
File07.pst       1162       4393       238.6      18.4
File08.pst       1365       3122       274.9      11.4
File09.pst       1849       3432       392.0       8.8
File10.pst       1979      10460       424.2      24.7
File11.pst       2771       2745       611.0       4.5
File12.pst       3477       3451       762.4       4.5
Total           16072      36081      3464.2      12.1 (avg)

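The performance results show that converting a PST file to MBOX format is about 6.5 times faster than converting it to EML format: 525.0 versus 3464.2 seconds of total conversion time, or 77.8 versus 12.1 messages per second on average.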

Building

To build this project you need:

  • Java Development Kit 8
  • Apache Maven 3.6.x

Assuming all the tools can be found on the PATH, simply go to the project directory and run the following command:

$ mvn -B package

Releasing

Go to the project directory and run the following commands:

$ mvn -B release:prepare
$ mvn -B release:perform -Darguments='-Dmaven.deploy.skip=true' 

Because of the -B (batch mode) flag, Maven automatically assumes the defaults for each required parameter, namely releaseVersion and developmentVersion. To control these values, run the release:prepare command as follows:

$ mvn -B release:prepare -DreleaseVersion={a release version} -DdevelopmentVersion={next version}-SNAPSHOT

pstconv's People

Contributors

cjmach

pstconv's Issues

mail format

Do you know what format Microsoft uses to store messages in OST/PST files? HTML, EML, MESSAGE, etc.?

Converted messages show body twice: in plain text and in HTML

The converter adds both the HTML and the plain-text content to the message body. When the message is viewed in a client (e.g. Thunderbird), the content is shown twice: first the plain text, then the HTML version.

The fix is to add only one of the two; HTML is preferable.
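The selection described in this issue can be sketched as a small helper that prefers the HTML body and falls back to plain text. This is an illustration only, not the fix applied in pstconv: the class BodySelector is hypothetical, and the two body strings are assumed to have been read from the PST message already.

```java
/** Hypothetical helper (not part of pstconv): picks a single message body. */
final class BodySelector {

    /** The chosen MIME content type together with the body text. */
    static final class Body {
        final String contentType;
        final String text;

        Body(String contentType, String text) {
            this.contentType = contentType;
            this.text = text;
        }
    }

    /**
     * Returns the HTML body when it is present and not blank; otherwise
     * falls back to the plain-text body (or an empty string if both are missing).
     */
    static Body select(String html, String plain) {
        if (html != null && !html.trim().isEmpty()) {
            return new Body("text/html", html);
        }
        return new Body("text/plain", plain == null ? "" : plain);
    }

    public static void main(String[] args) {
        Body body = select("<p>Hello</p>", "Hello");
        System.out.println(body.contentType + ": " + body.text);
    }
}
```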

Error trying to export to mailbox when pst subfolders include invalid characters

Using the latest version (v0.9.6) I get this error:

Jul 10, 2023 12:35:18 PM pt.cjmach.pstconv.PstConverter error
SCHWERWIEGEND: Failed to convert PSTFile object for file java.io.RandomAccessFile@159a0f0e. Unable to create folder [C:\temp\mailbox\Oberste Ebene der Outlook-Datendatei.sbd\ABCD \ EFGH]

In the PST archive there is a folder named "ABCD / EFGH".

Maybe characters such as "\" or "/" should be replaced by "_" or something similar before the directories are created.
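The replacement suggested in this issue could look like the sketch below. It is not the fix applied in pstconv: the class FolderNames is hypothetical, and the character set is an assumption based on the characters Windows rejects in file names (\ / : * ? " < > | and control characters), which is the strictest of the common platforms.

```java
import java.util.regex.Pattern;

/** Hypothetical helper (not part of pstconv): makes folder names file-system safe. */
final class FolderNames {

    // Characters that Windows does not allow in file names, plus control characters.
    private static final Pattern INVALID = Pattern.compile("[\\\\/:*?\"<>|\\p{Cntrl}]");

    // Windows also rejects names that end with a dot or a space.
    private static final Pattern TRAILING = Pattern.compile("[. ]+$");

    /** Replaces every invalid character with '_' and trims trailing dots and spaces. */
    static String sanitize(String name) {
        String cleaned = INVALID.matcher(name).replaceAll("_");
        cleaned = TRAILING.matcher(cleaned).replaceAll("");
        // Never return an empty name, since it could not be used as a directory.
        return cleaned.isEmpty() ? "_" : cleaned;
    }

    public static void main(String[] args) {
        System.out.println(sanitize("ABCD / EFGH")); // prints "ABCD _ EFGH"
    }
}
```

Note that two different folder names can map to the same sanitized name (for example "A/B" and "A:B"), so a real implementation would also need to handle such collisions.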
