You may sometimes need to split an MS Word document into multiple documents, for example to create a separate document for each page, section, or range of pages. This article covers how to automate that task by splitting MS Word DOCX files programmatically using Java. The following sections provide step-by-step instructions and code samples for each of these splitting criteria.
- Java API to Split Word Documents
- Split a Word DOCX/DOC using Java
- Use Page Range to Split Word Document
- Split Word Document by Sections
- Get Free API License
Java API to Split Word DOCX
Aspose.Words for Java is a powerful and feature-rich document manipulation API that lets you create and process MS Word documents. In addition to basic and advanced Word automation features, the API allows you to split a Word document into multiple documents. You can either download the API or install it in your Maven-based application using the following configuration.
<repository>
    <id>AsposeJavaAPI</id>
    <name>Aspose Java API</name>
    <url>https://repository.aspose.com/repo/</url>
</repository>
<dependency>
    <groupId>com.aspose</groupId>
    <artifactId>aspose-words</artifactId>
    <version>21.1</version>
    <classifier>jdk17</classifier>
</dependency>
Word Document Splitter - Helper Class
Before you start splitting documents, you need to add a helper class to your project that implements a Java document splitter based on Aspose.Words for Java. This is the PageSplitter class used in the steps below. Once you have added it, you can split documents using the code samples in the following sections.
Split a Word DOCX using Java
First of all, let’s have a look at how to split an MS Word document by page. In this case, each page of the source document will be converted into a separate Word document. The following are the steps to split pages of a Word document.
- Load the Word document using the Document class.
- Create an object of PageSplitter and initialize it with the Document object.
- Loop through the pages in the document.
- Use the PageSplitter.getDocumentOfPage(int pageIndex) method to retrieve each page as a Document object.
- Save the document using the Document.save(String) method.
The following code sample shows how to split a Word document using Java.
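A minimal sketch of the steps above. It assumes that the PageSplitter helper class is already in your project and that its page indexes start at 1. The file names input.docx and page_N.docx are placeholders.

```java
import com.aspose.words.Document;

public class SplitByPage {
    public static void main(String[] args) throws Exception {
        // Load the source document (placeholder path).
        Document doc = new Document("input.docx");

        // PageSplitter is the helper class added to the project earlier;
        // it is not part of the Aspose.Words library itself.
        PageSplitter splitter = new PageSplitter(doc);

        // Assumption: pages are numbered from 1 to getPageCount().
        for (int page = 1; page <= doc.getPageCount(); page++) {
            Document pageDoc = splitter.getDocumentOfPage(page);
            // Save each page as its own DOCX file (placeholder name).
            pageDoc.save("page_" + page + ".docx");
        }
    }
}
```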
Use Page Range to Split Word DOCX in Java
You can also define a page range that you want to split from the source Word document. The following are the steps to perform this operation.
- Load the Word document using the Document class.
- Create an object of PageSplitter and initialize it with the Document object.
- Use the PageSplitter.getDocumentOfPageRange(int, int) method to retrieve the range of pages as a Document object.
- Save the document using the Document.save(String) method.
The following code sample shows how to split a Word document by a page range using Java.
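When a document has to be cut into several consecutive page ranges, the start and end page for each call can be computed up front. The sketch below does this in plain Java, assuming 1-based, inclusive page numbers. PageRanges and chunk are hypothetical names used for illustration and are not part of Aspose.Words.

```java
import java.util.ArrayList;
import java.util.List;

class PageRanges {
    /**
     * Splits pages 1..pageCount into consecutive ranges of at most
     * pagesPerPart pages. Each int[] holds {startPage, endPage}, both inclusive.
     */
    static List<int[]> chunk(int pageCount, int pagesPerPart) {
        if (pageCount < 0 || pagesPerPart < 1) {
            throw new IllegalArgumentException(
                "pageCount must be >= 0 and pagesPerPart must be >= 1");
        }
        List<int[]> ranges = new ArrayList<>();
        for (int start = 1; start <= pageCount; start += pagesPerPart) {
            // The last range may be shorter than pagesPerPart.
            int end = Math.min(start + pagesPerPart - 1, pageCount);
            ranges.add(new int[] {start, end});
        }
        return ranges;
    }

    public static void main(String[] args) {
        // A 7-page document in parts of 3 pages: prints 1-3, 4-6, 7-7.
        for (int[] range : chunk(7, 3)) {
            System.out.println(range[0] + "-" + range[1]);
        }
    }
}
```

Each pair can then be passed to PageSplitter.getDocumentOfPageRange(range[0], range[1]), assuming that method takes 1-based, inclusive start and end pages, and the returned Document saved with Document.save(String).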
Split a Word Document by Sections using Java
Aspose.Words for Java also allows you to split a Word document by section breaks. The following are the steps to perform this operation.
- Load the Word document using the Document class.
- Loop through each section of the document using the Document.getSections() method.
- Clone each section into a Section object using the Document.getSections().get(index).deepClone() method.
- Create a new Document, import the cloned section into it using the Document.importNode(Node, boolean) method, and add it using the Document.getSections().add(Section) method.
- Save the document using the Document.save(String) method.
The following code sample shows how to split a Word document by sections using Java.
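A minimal sketch of the section-based split, with input.docx and section_N.docx as placeholder file names. In addition to cloning and adding each section, the sketch clears the empty section that a new Document starts with and imports the clone through Document.importNode(Node, boolean), since a node created in one document generally has to be imported before it can be added to another.

```java
import com.aspose.words.Document;
import com.aspose.words.Section;

public class SplitBySection {
    public static void main(String[] args) throws Exception {
        // Load the source document (placeholder path).
        Document doc = new Document("input.docx");

        for (int i = 0; i < doc.getSections().getCount(); i++) {
            // Deep-clone the section so the source document is left untouched.
            Section section = doc.getSections().get(i).deepClone();

            // A new Document starts with one empty section; remove it first.
            Document newDoc = new Document();
            newDoc.getSections().clear();

            // Import the clone into the new document, then append it.
            Section imported = (Section) newDoc.importNode(section, true);
            newDoc.getSections().add(imported);

            // Save each section as its own DOCX file (placeholder name).
            newDoc.save("section_" + i + ".docx");
        }
    }
}
```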
Get a Free API License
You can get a free temporary license in order to try the API without evaluation limitations.
Conclusion
In this article, you have learned how to split MS Word DOCX/DOC files using Java. The step-by-step guide and code samples have shown how to split a Word document by sections, pages, or a range of pages. You can explore more about the Java Word API in the documentation.