Images often carry important information in Word DOC files, and placing them alongside text makes the content more appealing. In some cases, you may need to extract the images embedded in a DOC file programmatically. This article covers how to extract images from DOC files in Java.
Java API to Extract Images from DOC Files
Aspose.Words for Java is a feature-rich API for creating, manipulating, and converting MS Word documents, so we will use it to extract images from DOC files. You can download the API's JAR or add it to your Java application using the following Maven configuration.
<repository>
    <id>AsposeJavaAPI</id>
    <name>Aspose Java API</name>
    <url>https://repository.aspose.com/repo/</url>
</repository>
<dependency>
    <groupId>com.aspose</groupId>
    <artifactId>aspose-words</artifactId>
    <version>21.11</version>
    <type>pom</type>
</dependency>
How to Extract Images from DOC in Java
The images in a DOC document are represented as shape objects. Therefore, to retrieve the images, you have to process every shape in the document. The following are the steps to extract images from a DOC file in Java.
- First, load the DOC file using the Document class.
- Then, get all the shapes into a NodeCollection object using the Document.getChildNodes(NodeType.SHAPE, boolean) method.
- Loop through the retrieved shapes.
- In each iteration, check whether the shape has an image using the Shape.hasImage() method.
- Finally, extract the image and save it using the Shape.getImageData().save(String) method.
The following code sample shows how to extract images from a DOC document in Java.
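The sketch below follows the steps listed above. It is a minimal illustration rather than a definitive implementation: the input path `input.doc`, the output folder `extracted-images`, the `Image_<n>` naming scheme, and the `imageFileName` helper are all hypothetical choices made for this example. It also assumes that `FileFormatUtil.imageTypeToExtension` is available in your version of Aspose.Words for Java to pick a file extension matching each image's type.

```java
import com.aspose.words.Document;
import com.aspose.words.FileFormatUtil;
import com.aspose.words.Node;
import com.aspose.words.NodeCollection;
import com.aspose.words.NodeType;
import com.aspose.words.Shape;

import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class ExtractImagesFromDoc {

    // Hypothetical helper: builds a name such as "Image_0.png".
    // The extension is expected to include the leading dot.
    static String imageFileName(int index, String extension) {
        return "Image_" + index + extension;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical default paths; pass your own as arguments
        String inputPath = args.length > 0 ? args[0] : "input.doc";
        Path outputDir = Paths.get(args.length > 1 ? args[1] : "extracted-images");
        Files.createDirectories(outputDir);

        // Load the DOC file
        Document doc = new Document(inputPath);

        // Collect every shape in the document (true = search the whole tree)
        NodeCollection shapes = doc.getChildNodes(NodeType.SHAPE, true);

        int imageIndex = 0;
        for (Node node : shapes.toArray()) {
            Shape shape = (Shape) node;

            // Skip shapes that do not carry an image
            if (!shape.hasImage()) {
                continue;
            }

            // Choose an extension that matches the stored image type
            String extension = FileFormatUtil.imageTypeToExtension(
                    shape.getImageData().getImageType());

            // Save the image data to the output folder
            Path target = outputDir.resolve(imageFileName(imageIndex, extension));
            shape.getImageData().save(target.toString());
            imageIndex++;
        }

        System.out.println("Extracted " + imageIndex + " image(s) to " + outputDir);
    }
}
```

Iterating over `shapes.toArray()` takes a snapshot of the collection, which keeps the loop independent of how the collection is typed in a given release. Without a license, the library runs in evaluation mode, so results may be subject to its evaluation limitations.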
Java DOC Image Extractor - Get a Free License
Get a free temporary license to use Aspose.Words for Java without evaluation limitations.
Conclusion
In this article, you have learned how to extract images from a DOC document in Java and save them to the desired location. Aspose.Words for Java also provides a wide range of other features for document manipulation, which you can explore in the documentation. If you have questions, you can ask them via our forum.
See Also
- Create Word Documents from Scratch in Java
- Generate Word Documents from Templates in Java
- Convert Word Files to PDF in Java
Info: You may be interested in another Java API, Aspose.Slides for Java, which allows you to convert presentations (to PDF, Word documents, etc.) and import images or other documents into presentations.