Convert PDF Files to MS Word Documents (DOC/DOCX) in Java

PDF is one of the most commonly used formats for sharing documents with third parties, largely because a PDF renders consistently across platforms regardless of hardware or software. In some cases, however, you need the content in an editable format, and Word DOC or DOCX is a common target. This article shows how to automate that conversion by converting PDF to Word programmatically in Java.

In this article, you will learn how to:

  • Convert PDF to DOC using Java.
  • Convert PDF to DOCX using Java.
  • Convert PDF to Word (DOC/DOCX) with additional options.

API for PDF to Word Conversion in Java

Aspose.PDF for Java is a PDF manipulation API that provides easy ways to convert PDF files to a variety of other formats, including Word (DOC/DOCX). You can download the API’s JAR file and add it to your project, or reference it using the following Maven configuration:

Repository

<repository>
    <id>AsposeJavaAPI</id>
    <name>Aspose Java API</name>
    <url>https://repository.aspose.com/repo/</url>
</repository>

Dependency

<dependency>
    <groupId>com.aspose</groupId>
    <artifactId>aspose-pdf</artifactId>
    <version>19.12</version>
</dependency>

Convert PDF to DOC using Java

Once you have referenced Aspose.PDF for Java in your application, you can convert any PDF document to DOC format in a couple of lines of code. The conversion takes two steps: load the PDF file into a Document object, then call Document.save() with the output path and the SaveFormat.Doc argument.

The following code sample shows how to convert PDF to DOC in Java.
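A minimal sketch of this conversion, assuming the Document(String) constructor and the Document.save(String, SaveFormat) overload from the com.aspose.pdf package. The file names input.pdf and output.doc are placeholders, and the Aspose.PDF JAR from the Maven configuration above must be on the classpath.

```java
import com.aspose.pdf.Document;
import com.aspose.pdf.SaveFormat;

public class PdfToDoc {
    public static void main(String[] args) {
        // Load the source PDF (placeholder path).
        Document pdf = new Document("input.pdf");

        // Save it in the legacy binary Word format.
        pdf.save("output.doc", SaveFormat.Doc);
    }
}
```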

[Figures: the input PDF document and the output Word document]

Convert PDF to DOCX using Java

DOCX is a well-known format for Word documents. In contrast to the binary DOC format, a DOCX file is a ZIP package of XML files. To convert a PDF to DOCX, pass the SaveFormat.DocX argument to the Document.save() method.
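To make the difference between the two containers concrete, the sketch below inspects the leading bytes of a file: legacy DOC files use the OLE2 compound file signature, while DOCX files start with a ZIP local file header. The class name WordContainerCheck is hypothetical and the check uses only the Java standard library (Java 11 or later). A match identifies the container only, so it is a quick sanity check on a converted file rather than full validation.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

public class WordContainerCheck {

    /** Container type inferred from the leading bytes of a file. */
    public enum Container { OLE2, ZIP, UNKNOWN }

    // Signature of an OLE2 compound file, the binary container behind .doc.
    private static final byte[] OLE2_SIGNATURE = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // Signature of a ZIP local file header ("PK" 0x03 0x04), the container behind .docx.
    private static final byte[] ZIP_SIGNATURE = {0x50, 0x4B, 0x03, 0x04};

    /** Classifies a buffer holding the first bytes of a file. */
    public static Container detect(byte[] header) {
        if (startsWith(header, OLE2_SIGNATURE)) {
            return Container.OLE2;
        }
        if (startsWith(header, ZIP_SIGNATURE)) {
            return Container.ZIP;
        }
        return Container.UNKNOWN;
    }

    /** Reads just enough bytes from the file to classify it. */
    public static Container detect(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            return detect(in.readNBytes(OLE2_SIGNATURE.length));
        }
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        return data.length >= prefix.length
                && Arrays.equals(Arrays.copyOf(data, prefix.length), prefix);
    }

    public static void main(String[] args) throws IOException {
        // Print the detected container for each file path given on the command line.
        for (String name : args) {
            System.out.println(name + ": " + detect(Path.of(name)));
        }
    }
}
```

Running it against a converted file, for example `java WordContainerCheck.java output.docx`, should report ZIP for a DOCX output and OLE2 for a DOC output.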

The following code sample shows how to convert PDF to DOCX in Java.
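A minimal sketch of the DOCX conversion, again assuming the Document(String) constructor and the Document.save(String, SaveFormat) overload from the com.aspose.pdf package. The file names are placeholders, and the Aspose.PDF JAR must be on the classpath.

```java
import com.aspose.pdf.Document;
import com.aspose.pdf.SaveFormat;

public class PdfToDocx {
    public static void main(String[] args) {
        // Load the source PDF (placeholder path).
        Document pdf = new Document("input.pdf");

        // SaveFormat.DocX selects the XML-based Word format instead of binary DOC.
        pdf.save("output.docx", SaveFormat.DocX);
    }
}
```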

Additional Options for PDF to Word Conversion

Aspose.PDF for Java also provides additional options for PDF to Word conversion, such as the output format, image resolution, and the distance between text lines. These options are set through the DocSaveOptions class.

The following code sample shows how to use the DocSaveOptions class in PDF to DOCX conversion using Java.
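A minimal sketch of conversion with options, assuming the DocSaveOptions setters setFormat, setMode, setRelativeHorizontalProximity, and setRecognizeBullets, and the Document.save(String, SaveOptions) overload. The chosen values (Flow mode, a proximity of 2.5) are illustrative rather than recommended settings, and the file names are placeholders.

```java
import com.aspose.pdf.DocSaveOptions;
import com.aspose.pdf.Document;

public class PdfToDocxWithOptions {
    public static void main(String[] args) {
        // Load the source PDF (placeholder path).
        Document pdf = new Document("input.pdf");

        DocSaveOptions options = new DocSaveOptions();
        // Write DOCX rather than the default DOC output.
        options.setFormat(DocSaveOptions.DocFormat.DocX);
        // Flow mode aims for an editable, reflowable document.
        options.setMode(DocSaveOptions.RecognitionMode.Flow);
        // Illustrative value for how close text fragments must be to be grouped.
        options.setRelativeHorizontalProximity(2.5f);
        // Try to recognize bulleted lists in the PDF text.
        options.setRecognizeBullets(true);

        // Save using the configured options.
        pdf.save("output.docx", options);
    }
}
```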
