centic9 / poi-mail-merge

Small application which repeatedly replaces markers in a Microsoft Word document with items taken from a CSV/Microsoft Excel file to provide simple mail-merge functionality

License: BSD 2-Clause "Simplified" License

Shell 0.30% Java 99.70%

poi-mail-merge's Introduction

This is a small application which repeatedly replaces markers in a Microsoft Word document with items taken from a CSV/Microsoft Excel file.

I started this project because I was quite disappointed with the mail-merge functionality that LibreOffice offers. I especially wanted something that is repeatable/automatable, does not produce spurious strange results, and does not need re-configuration each time the mail-merge is (re-)run.

How it works

All you need is a Word document in "docx" format (Word 2007 or later) which acts as the template, and an Excel .xls/.xlsx or CSV file which contains one row for each copy of the template that should be produced.

The word-document can contain template-markers (enclosed in ${...}) for things that should be replaced, e.g. "${first-name} ${last-name}".
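The marker syntax can be illustrated with a minimal sketch on plain strings. This is not the project's actual code, which operates on the DOCX XML; the class name MarkerReplacer, the method replaceMarkers, and the choice to leave unknown markers untouched are assumptions made for illustration only.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of poi-mail-merge.
class MarkerReplacer {
    // Matches ${name}; the name between the braces is captured in group 1.
    private static final Pattern MARKER = Pattern.compile("\\$\\{([^}]+)\\}");

    static String replaceMarkers(String text, Map<String, String> values) {
        Matcher m = MARKER.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = values.get(m.group(1));
            // Assumption of this sketch: markers without a value stay as they are.
            String replacement = value != null ? value : m.group();
            // quoteReplacement keeps '$' and '\' in values from being interpreted.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> row = new HashMap<>();
        row.put("first-name", "Jane");
        row.put("last-name", "Doe");
        // prints "Dear Jane Doe,"
        System.out.println(replaceMarkers("Dear ${first-name} ${last-name},", row));
    }
}
```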

The first row of the first sheet of the Excel/CSV file is read as a header-row which is used to match the template-names used in the Word-template.

Only the first sheet of an Excel file is read.

The result is a single merged Word-document which contains a replaced copy of the template for each line in the Excel file.
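The relationship between the header row and the data rows can be sketched as follows. The sheet is modelled here as a plain list of string lists rather than being read through a spreadsheet library, and the names RowMapper and toRecords, as well as the handling of short rows, are assumptions for illustration. Each resulting map would drive one replaced copy of the template.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of poi-mail-merge.
class RowMapper {
    // Turns a sheet (first row = header) into one name-to-value map per data row.
    static List<Map<String, String>> toRecords(List<List<String>> sheet) {
        List<Map<String, String>> records = new ArrayList<>();
        if (sheet.isEmpty()) {
            return records;
        }
        List<String> header = sheet.get(0);
        for (List<String> row : sheet.subList(1, sheet.size())) {
            Map<String, String> record = new LinkedHashMap<>();
            for (int i = 0; i < header.size(); i++) {
                // Assumption of this sketch: missing trailing cells become empty strings.
                record.put(header.get(i), i < row.size() ? row.get(i) : "");
            }
            records.add(record);
        }
        return records;
    }

    public static void main(String[] args) {
        List<List<String>> sheet = Arrays.asList(
                Arrays.asList("first-name", "last-name"),
                Arrays.asList("Jane", "Doe"),
                Arrays.asList("John"));
        // One record per data row; each would yield one copy of the template.
        for (Map<String, String> record : toRecords(sheet)) {
            System.out.println(record);
        }
    }
}
```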

Use it

Grab and build it

git clone https://github.com/centic9/poi-mail-merge.git
cd poi-mail-merge
./gradlew installDist

Run it

./run.sh <word-template> <excel/csv-file> <output-file>

Sample files

There are some sample files in the directory samples. On Windows you can run them as follows:

./gradlew installDist
build\install\poi-mail-merge\bin\poi-mail-merge.bat samples\Template.docx samples\Lines.xlsx build\Result.docx

On Unix you can use the following steps:

./gradlew installDist
./run.sh samples/Template.docx samples/Lines.xlsx build/Result.docx

Support this project

If you find this tool useful and would like to support work on it, you can sponsor the author.

Tips

Convert to PDF

You can use the tool unoconv from OpenOffice/LibreOffice to further convert the resulting docx, e.g. to PDF:

unoconv -vvv --timeout=60 --doctype=document --output=result.pdf result.docx

Known issues

Only XLS/XLSX and one CSV format supported

For XLS/XLSX files only the first sheet is read and headers are expected to be in the first row with data starting in the second row.

For CSV, currently only files which use a comma as delimiter and double quotes for quoting text are supported. Other formats require code changes, but these should be easy to make by adjusting the CSVFormat definition (this project uses Apache Commons CSV for CSV handling).
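To make the supported dialect concrete, here is a small standard-library sketch that splits one line using a comma delimiter and double-quote quoting. It is not how the project parses CSV (that is done by Apache Commons CSV); the class name CsvLineSketch and the treatment of a doubled quote inside a quoted field as an escaped quote are assumptions of this example.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration of the dialect only; the project itself relies on Apache Commons CSV.
class CsvLineSketch {
    static List<String> parseLine(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        // Assumption: "" inside a quoted field is a literal quote.
                        current.append('"');
                        i++;
                    } else {
                        inQuotes = false; // closing quote
                    }
                } else {
                    current.append(c); // commas inside quotes are plain text
                }
            } else if (c == '"') {
                inQuotes = true; // opening quote
            } else if (c == ',') {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());
        return fields;
    }

    public static void main(String[] args) {
        // prints [Jane, Doe, Jr., 42] (three fields; the second contains a comma)
        System.out.println(parseLine("Jane,\"Doe, Jr.\",42"));
    }
}
```

A file that uses, say, semicolons would not split correctly under these rules, which is the kind of case the CSVFormat adjustment mentioned above is meant to cover.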

Only DOCX template format supported

The older .doc format is not supported as a template document because this project makes heavy use of the internal XML format of DOCX files.

High memory usage for large resulting files

The resulting output file is fully held in memory, so a very large number of merged documents may cause very high memory usage and/or out-of-memory errors.

Streaming output is currently not easy to support, but it should be possible to add a mode of operation which writes a separate file for each merged document to overcome this limitation if necessary. Pull requests are highly welcome!

Word-Formatting can confuse the replacement

If multiple formats are applied to a string that holds a template-pattern (e.g. if you make only half of the template-variable bold), the XML representation of the document may split the string across multiple XML tags, which can prevent the replacement from happening.

A workaround is to use the formatting tool in LibreOffice/OpenOffice to ensure that the replacement tags have only one formatting applied to them.

See #6 for possible improvements.

Change it

Create Eclipse project files

./gradlew eclipse

Build it and run tests

cd poi-mail-merge
./gradlew check jacocoTestReport

Licensing

poi-mail-merge's Issues

Detect cases when replacement tags are split due to different formatting

Currently, if a replacement-tag spans multiple formats, replacement is simply not done, because we look at one run at a time. There are multiple options to handle this better:

  • detect incorrectly formatted tags and fail generation
  • detect an unclosed tag and include text from the next run in this case
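The second option could look roughly like the sketch below. It works on plain strings standing in for the text of consecutive runs, not on the real document objects, and the names RunMerger, mergeSplitMarkers and hasUnclosedMarker are hypothetical. It also ignores the case where the split falls between "$" and "{", and merging text this way would give up the formatting of the later runs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; run texts are modelled as plain strings.
class RunMerger {
    // True if the text opens a ${ marker that is not closed within the same text.
    static boolean hasUnclosedMarker(CharSequence text) {
        String s = text.toString();
        int open = s.lastIndexOf("${");
        return open >= 0 && s.indexOf('}', open) < 0;
    }

    // Joins a run that ends inside a marker with the following runs until the marker closes.
    static List<String> mergeSplitMarkers(List<String> runs) {
        List<String> merged = new ArrayList<>();
        StringBuilder pending = null;
        for (String run : runs) {
            if (pending != null) {
                pending.append(run);
                if (!hasUnclosedMarker(pending)) {
                    merged.add(pending.toString());
                    pending = null;
                }
            } else if (hasUnclosedMarker(run)) {
                pending = new StringBuilder(run);
            } else {
                merged.add(run);
            }
        }
        if (pending != null) {
            // Marker never closed; a caller could fail generation here (first option).
            merged.add(pending.toString());
        }
        return merged;
    }

    public static void main(String[] args) {
        List<String> runs = Arrays.asList("Dear ${first", "-name}", " and ${last-name}!");
        // prints [Dear ${first-name},  and ${last-name}!]
        System.out.println(mergeSplitMarkers(runs));
    }
}
```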

Page size

Hi Dominik,

Is there any way to set the page size for the document? In the generated document, each merged copy is mixed up with the previous one. I have attached a screenshot of this.

Thanks,
Vamshi

dd/mm/YYYY merged to mm/dd/YY in resulting .docx

Hi guys, awesome work - very useful tool.

The one issue I have is that I am merging dates which are in Australian format (i.e. dd/mm/YYYY) from my .xlsx spreadsheet and they are somehow being transformed to US format (i.e. mm/dd/YYYY) in the 'filled' .docx. I could care less about the YYYY vs. YY output, but the inversed day/month characters are a bit of an issue!

Any ideas how I could fix this? I have ensured that both input and output merge fields (in Excel/Word) are formatted to dd/mm/YYYY and that my system (Debian 9) is set to English AU keyboard/time/geo options, but I have run out of ideas there.

Any help greatly appreciated!

Example documents

Could you please provide example documents to work with?

Thanks,
Vamshi
