Plutext

Merging Word documents in Java

This page is about programmatically merging Word documents in Java.

Here, "merge" means concatenate (join or append), not diff or compare: for example, placing a cover letter and a contract into a single docx file without changing the look and feel of either document.

To try our online demo, please see here

MergeDocx is part of Plutext's commercial Docx4j Enterprise Edition, but you can still use it if you are using Apache POI or Aspose.Words.

You can download a trial version of Docx4j Enterprise from our products page (including as an Eclipse project ready to run); you’ll also find pricing and ordering info there.

The Two Challenges

There are two challenges you'll face if you try to append the contents of one Word docx to another using docx4j, POI, or Aspose.

For more on this, please see our 2010 blog post merging-word-documents.

MergeDocx takes care of all the details for you, and works equally well whether you are merging 2 documents, or 2000.
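As an illustration of the kind of detail involved (not necessarily one of the two challenges meant above): in the docx package format, relationship ids such as rId1 are only unique within the part that owns them, so content copied from a second document can end up pointing at the wrong image or hyperlink unless its ids are remapped. The sketch below models that with plain Java maps. The class and method names are hypothetical, and none of this is MergeDocx or docx4j API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model of appending one document's relationships to another's. */
public class RelIdRemapSketch {

    /**
     * Copies every relationship (id -> target) from source into target
     * under a fresh id, and returns the old-id to new-id mapping that the
     * caller would need in order to rewrite references in the copied content.
     */
    public static Map<String, String> appendRelationships(
            Map<String, String> target, Map<String, String> source) {
        Map<String, String> oldToNew = new LinkedHashMap<>();
        int next = target.size() + 1;
        for (Map.Entry<String, String> rel : source.entrySet()) {
            String newId;
            do {
                newId = "rId" + next++;
            } while (target.containsKey(newId)); // skip ids already in use
            target.put(newId, rel.getValue());
            oldToNew.put(rel.getKey(), newId);
        }
        return oldToNew;
    }

    public static void main(String[] args) {
        Map<String, String> letter = new LinkedHashMap<>();
        letter.put("rId1", "media/logo.png");

        Map<String, String> contract = new LinkedHashMap<>();
        contract.put("rId1", "media/signature.png"); // same id, different target

        Map<String, String> mapping = appendRelationships(letter, contract);
        System.out.println(mapping); // {rId1=rId2}
        System.out.println(letter);  // {rId1=media/logo.png, rId2=media/signature.png}
    }
}
```

A real merge also has to rewrite every reference to the old ids inside the copied content and copy the target parts themselves; the sketch covers only the id bookkeeping.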

altChunk

MergeDocx can also convert an altChunk that points to an alternative format input part of type docx into first-class docx content.

Without this, you'd need to open the docx in Word, then save it.
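To find out whether a document contains any altChunks in the first place, you can look for w:altChunk elements in its main document part. Here is a minimal sketch using only the JDK's XML parser. The class and method names are hypothetical, it is not part of MergeDocx, and it assumes you have already read word/document.xml into a string:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class AltChunkCheck {

    /** WordprocessingML main namespace. */
    public static final String W_NS =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    /** Counts w:altChunk elements in the XML of a main document part. */
    public static int countAltChunks(String documentXml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true); // required for getElementsByTagNameNS
            Document doc = dbf.newDocumentBuilder().parse(
                new ByteArrayInputStream(
                    documentXml.getBytes(StandardCharsets.UTF_8)));
            return doc.getElementsByTagNameNS(W_NS, "altChunk").getLength();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse document XML", e);
        }
    }

    public static void main(String[] args) {
        String xml =
            "<w:document xmlns:w=\"" + W_NS + "\" "
          + "xmlns:r=\"http://schemas.openxmlformats.org/officeDocument/2006/relationships\">"
          + "<w:body><w:p/><w:altChunk r:id=\"rId5\"/></w:body></w:document>";
        System.out.println(countAltChunks(xml)); // 1
    }
}
```

A count above zero only tells you that altChunks are present; whether each one points to docx content depends on the part its r:id refers to.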

Java API

Here is a basic example of use:


    DocumentBuilderIncremental dbi = new DocumentBuilderIncremental();

    // MAX is the number of docx files you are merging
    for (int i = 0; i < MAX; i++) {
        BlockRange block = getBlockRange(i);       // Your method - see below
        dbi.addBlockRange(block, i == (MAX - 1));  // 2nd param is whether this is your last docx
    }

    WordprocessingMLPackage output = dbi.finish();  // Get the output docx

That example concatenates several entire docx documents, one after another.

But you could instead:

MergeDocx can also resolve docx altChunks into normal docx content.

What is a BlockRange?

A BlockRange is essentially a WordprocessingMLPackage, or a range of content in a WordprocessingMLPackage, plus config settings.


    BlockRange block = new BlockRange(
            Docx4J.load(new File("yourdocx.docx")));

A WordprocessingMLPackage is docx4j's representation of a docx file.

You can fine-tune the merge process by configuring individual block ranges, or the DocumentBuilder object. For example:

For further details, please see the manual.

You can also use the webapp linked above to generate code corresponding to your chosen configuration.