Working with various document formats is a common requirement in software development, and converting PDF files to Word documents is a task many developers encounter. In this blog post, we will explore how to convert PDF files to Word documents (DOC and DOCX) in a Java application. We will also cover how to customize the PDF to Word conversion with different options.
- Java PDF to Word DOC Converter Library
- Convert PDF to DOC in Java
- Convert PDF to DOCX in Java
- Customize PDF to Word (DOC/DOCX) conversion
Java Library to Convert PDF to Word DOC
Aspose.PDF for Java is a class library that lets developers work with PDF documents programmatically. It provides features for creating, manipulating, and converting PDF documents. You can download the API's JAR file and add it to your project, or reference it using the following Maven configuration:
Repository:
<repository>
    <id>AsposeJavaAPI</id>
    <name>Aspose Java API</name>
    <url>https://repository.aspose.com/repo/</url>
</repository>
Dependency:
<dependency>
    <groupId>com.aspose</groupId>
    <artifactId>aspose-pdf</artifactId>
    <version>23.9</version>
</dependency>
Convert a PDF to Word DOC in Java
Once you have referenced Aspose.PDF for Java in your application, you can convert any PDF document to DOC format in a couple of lines of code. The following are the steps required to perform this conversion.
- Create an instance of the Document class and initialize it with the input PDF file's path.
- Call the Document.save() method with the output DOC file's name and SaveFormat.Doc as arguments.
The following code sample shows how to convert PDF to DOC in Java.
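A minimal sketch of the two steps above, assuming Aspose.PDF for Java is on the classpath. The file names input.pdf and output.doc are placeholders chosen for illustration, not paths from the original post.

```java
import com.aspose.pdf.Document;
import com.aspose.pdf.SaveFormat;

public class PdfToDoc {
    public static void main(String[] args) {
        // Step 1: load the source PDF (placeholder path).
        Document pdf = new Document("input.pdf");
        try {
            // Step 2: save it as a binary Word document.
            pdf.save("output.doc", SaveFormat.Doc);
        } finally {
            // Release the resources held by the loaded document.
            pdf.close();
        }
    }
}
```

Without a license applied, the library runs in evaluation mode, so the output may carry evaluation limitations.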
Convert PDF to DOCX in Java
DOCX is the XML-based format that newer versions of Microsoft Word use by default. In contrast to the DOC format, which is a binary file, a DOCX file is a ZIP package containing XML files. To convert a PDF to DOCX, pass SaveFormat.DocX as the format argument to the Document.save() method.
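To make the difference between the two containers concrete, the sketch below checks the leading bytes of a file: binary DOC files are stored in an OLE2 compound file, while DOCX files are ZIP packages. The class name WordFormatSniffer and its return values are hypothetical, and the check works only at the container level, so any ZIP archive or any OLE2 file (for example an XLS workbook) produces the same result.

```java
import java.util.Arrays;

public class WordFormatSniffer {
    // Signature of an OLE2 compound file, the container used by binary DOC.
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // Signature of a ZIP local file header, the container used by DOCX.
    private static final byte[] ZIP_MAGIC = {0x50, 0x4B, 0x03, 0x04};

    /** Returns "OLE2_BINARY", "ZIP_PACKAGE", or "UNKNOWN" from the leading bytes. */
    public static String sniff(byte[] header) {
        if (startsWith(header, OLE2_MAGIC)) {
            return "OLE2_BINARY";
        }
        if (startsWith(header, ZIP_MAGIC)) {
            return "ZIP_PACKAGE";
        }
        return "UNKNOWN";
    }

    // True when data begins with every byte of prefix, in order.
    private static boolean startsWith(byte[] data, byte[] prefix) {
        return data.length >= prefix.length
            && Arrays.equals(Arrays.copyOf(data, prefix.length), prefix);
    }
}
```

A quick sanity check after a conversion is to read the first eight bytes of the output file and pass them to sniff().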
The following code sample shows how to convert PDF to DOCX in Java.
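A minimal sketch of the DOCX conversion, assuming Aspose.PDF for Java is on the classpath. It differs from the DOC case only in the format argument; the file names input.pdf and output.docx are placeholders chosen for illustration.

```java
import com.aspose.pdf.Document;
import com.aspose.pdf.SaveFormat;

public class PdfToDocx {
    public static void main(String[] args) {
        // Load the source PDF (placeholder path).
        Document pdf = new Document("input.pdf");
        try {
            // SaveFormat.DocX selects the XML-based Word format.
            pdf.save("output.docx", SaveFormat.DocX);
        } finally {
            // Release the resources held by the loaded document.
            pdf.close();
        }
    }
}
```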
Customize PDF to Word Conversion
Aspose.PDF for Java also provides additional options for PDF to Word conversion, such as the output format, image resolution, and the distance between text lines. The DocSaveOptions class is used for this purpose, and it exposes the following options:
- setFormat(int value) - To set the output format (Doc, Docx, etc.).
- setAddReturnToLineEnd(boolean value) - To add paragraph or line breaks at the end of lines.
- setImageResolutionX(int value) - To set the X resolution for the images.
- setImageResolutionY(int value) - To set the Y resolution for the images.
- setMaxDistanceBetweenTextLines(float value) - To set the maximum distance between text lines, which is used to group lines into paragraphs.
- setMode(int value) - To set the recognition mode (for example, Textbox or Flow).
- setRecognizeBullets(boolean value) - To turn the recognition of bullets on or off.
- setRelativeHorizontalProximity(float value) - To set the width of the space between text elements in the input PDF file that is treated as a gap between words.
The following code sample shows how to use the DocSaveOptions class in PDF to DOCX conversion using Java.
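A minimal sketch of a customized conversion, assuming Aspose.PDF for Java is on the classpath. The file names and the proximity value of 2.5 are illustrative choices, not recommendations from the original post; tune them against your own documents.

```java
import com.aspose.pdf.DocSaveOptions;
import com.aspose.pdf.Document;

public class PdfToDocxWithOptions {
    public static void main(String[] args) {
        // Load the source PDF (placeholder path).
        Document pdf = new Document("input.pdf");
        try {
            DocSaveOptions options = new DocSaveOptions();
            // Produce DOCX rather than binary DOC.
            options.setFormat(DocSaveOptions.DocFormat.DocX);
            // Flow mode favors editable, reflowable text in the output.
            options.setMode(DocSaveOptions.RecognitionMode.Flow);
            // Illustrative value: space width treated as a word gap.
            options.setRelativeHorizontalProximity(2.5f);
            // Detect bulleted lists during conversion.
            options.setRecognizeBullets(true);

            // Save using the options instead of a plain SaveFormat value.
            pdf.save("output.docx", options);
        } finally {
            // Release the resources held by the loaded document.
            pdf.close();
        }
    }
}
```

The other setters in the list above, such as setImageResolutionX() and setImageResolutionY(), can be added to the same options object before saving.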
Get a Free License
You can get a free temporary license to convert PDF to Word format without evaluation limitations.
Conclusion
Converting PDF files to Word documents in Java takes only a few lines of code with Aspose.PDF for Java. By following the steps outlined in this blog post, you can integrate the library into your Java projects, convert PDF files to DOC or DOCX, and tune the output with DocSaveOptions. You can learn more about converting PDF to other formats from the documentation.