You may often need to find and replace a particular piece of text in PDF documents. Searching for and updating each occurrence manually costs time and effort, and it is easy to miss one. A programmatic find and replace operation avoids both problems. In this article, you will learn how to find and replace text in PDF documents using Java.
- Java Library to Find and Replace Text in PDF
- Find and Replace Text in PDF using Java
- Search and Replace Text on a Particular Page in PDF
- Java Find and Replace Text in PDF using Regex
Java Library to Find and Replace Text in PDF
To find and replace text in PDF, we will use Aspose.PDF for Java. It is designed for generating and manipulating PDF files from within Java applications. The library provides a wide range of basic as well as advanced PDF manipulation features including finding and replacing text.
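You can either download the library or add it to your project using the following Maven configuration. The repository entry belongs in the repositories section of your pom.xml and the dependency entry in the dependencies section.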
<repository>
<id>AsposeJavaAPI</id>
<name>Aspose Java API</name>
<url>https://repository.aspose.com/repo/</url>
</repository>
<dependency>
<groupId>com.aspose</groupId>
<artifactId>aspose-pdf</artifactId>
<version>22.12</version>
</dependency>
Find and Replace Text in PDF using Java
To replace a piece of text in a PDF, first collect all the text fragments that match the search string, then replace the text of each fragment in turn.
The following are the steps to find and replace text in a PDF file using Java.
- Load the PDF file using the Document class.
- Create an object of the TextFragmentAbsorber class and initialize it with the text you want to find.
- Accept the absorber for all pages in the PDF using the Document.getPages().accept(TextFragmentAbsorber) method.
- Get all the occurrences of the text returned by TextFragmentAbsorber.getTextFragments() into a TextFragmentCollection object.
- Loop through each TextFragment in the TextFragmentCollection object and replace the text using the TextFragment.setText(String) method.
- Save the updated PDF file using the Document.save(String) method.
The following code sample shows how to find and replace text in PDF.
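A minimal sketch of the steps above might look like the following. It assumes Aspose.PDF for Java is on the classpath; the class name FindAndReplaceText, the replaceText helper, and the command-line arguments are hypothetical names chosen for illustration, not part of the library.

```java
import com.aspose.pdf.Document;
import com.aspose.pdf.TextFragment;
import com.aspose.pdf.TextFragmentAbsorber;
import com.aspose.pdf.TextFragmentCollection;

public class FindAndReplaceText {

    // Hypothetical helper: replaces every occurrence of searchText in the
    // whole document and returns the number of fragments that were updated.
    static int replaceText(String inputPath, String outputPath,
                           String searchText, String replacement) {
        // Load the PDF file
        Document document = new Document(inputPath);

        // Create an absorber that collects fragments matching the search string
        TextFragmentAbsorber absorber = new TextFragmentAbsorber(searchText);

        // Accept the absorber for all pages of the document
        document.getPages().accept(absorber);

        // Get the matching fragments
        TextFragmentCollection fragments = absorber.getTextFragments();

        // Replace the text of each fragment, one by one
        int replaced = 0;
        for (TextFragment fragment : (Iterable<TextFragment>) fragments) {
            fragment.setText(replacement);
            replaced++;
        }

        // Save the updated PDF file
        document.save(outputPath);
        return replaced;
    }

    public static void main(String[] args) {
        if (args.length < 4) {
            System.out.println("Usage: FindAndReplaceText <input.pdf> <output.pdf> <search> <replacement>");
            return;
        }
        int count = replaceText(args[0], args[1], args[2], args[3]);
        System.out.println("Replaced " + count + " fragment(s)");
    }
}
```

For example, running the class with the arguments input.pdf output.pdf Hello Hi would write a copy of input.pdf in which each matched "Hello" fragment reads "Hi". Returning the count is a design choice of this sketch; it makes it easy to notice when the search string was not found at all.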
Search and Replace Text on a Particular Page in PDF
Instead of finding and replacing text in the whole PDF, you can restrict the operation to a single page. In this case, you accept the TextFragmentAbsorber for that page only, selected by its page index. Note that page indexes in Aspose.PDF start at 1, not 0.
The following are the steps to search and replace text on a particular page in PDF in Java.
- Load the PDF file using the Document class.
- Create an object of the TextFragmentAbsorber class and initialize it with the text you want to find.
- Accept the absorber for a particular page in the PDF using the Document.getPages().get_Item(int pageIndex).accept(TextFragmentAbsorber) method.
- Get all the occurrences of the text returned by TextFragmentAbsorber.getTextFragments() into a TextFragmentCollection object.
- Loop through each TextFragment in the TextFragmentCollection object and replace the text using the TextFragment.setText(String) method.
- Save the updated PDF file using the Document.save(String) method.
The following code sample shows how to find and replace text on a particular page in PDF using Java.
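As a rough sketch of the page-specific steps above, assuming Aspose.PDF for Java is on the classpath: the class name ReplaceTextOnPage, the replaceTextOnPage helper, and the argument order are hypothetical and exist only for this illustration.

```java
import com.aspose.pdf.Document;
import com.aspose.pdf.TextFragment;
import com.aspose.pdf.TextFragmentAbsorber;
import com.aspose.pdf.TextFragmentCollection;

public class ReplaceTextOnPage {

    // Hypothetical helper: replaces searchText on one page only and returns
    // the number of fragments updated. pageNumber is 1-based.
    static int replaceTextOnPage(String inputPath, String outputPath, int pageNumber,
                                 String searchText, String replacement) {
        // Load the PDF file
        Document document = new Document(inputPath);

        // Create an absorber for the search string
        TextFragmentAbsorber absorber = new TextFragmentAbsorber(searchText);

        // Accept the absorber for the selected page only
        document.getPages().get_Item(pageNumber).accept(absorber);

        // Get the fragments found on that page
        TextFragmentCollection fragments = absorber.getTextFragments();

        // Replace the text of each fragment
        int replaced = 0;
        for (TextFragment fragment : (Iterable<TextFragment>) fragments) {
            fragment.setText(replacement);
            replaced++;
        }

        // Save the updated PDF file; other pages are left untouched
        document.save(outputPath);
        return replaced;
    }

    public static void main(String[] args) {
        if (args.length < 5) {
            System.out.println("Usage: ReplaceTextOnPage <input.pdf> <output.pdf> <page> <search> <replacement>");
            return;
        }
        int count = replaceTextOnPage(args[0], args[1], Integer.parseInt(args[2]), args[3], args[4]);
        System.out.println("Replaced " + count + " fragment(s) on page " + args[2]);
    }
}
```

The only difference from a whole-document replacement is the call to get_Item(pageNumber) before accept, so occurrences on every other page keep their original text.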
Java Find and Replace Text in PDF using Regex
You can also use a regular expression to find text that matches a particular pattern, such as email addresses or Social Security numbers (SSNs). The following are the steps to define and use a regular expression to search and replace text in PDF using Java.
- Load the PDF file using the Document class.
- Create an object of the TextFragmentAbsorber class and initialize it with the regular expression you want to use.
- Create an object of the TextSearchOptions class and initialize it with true to enable regular expression-based search.
- Set the options using the TextFragmentAbsorber.setTextSearchOptions(TextSearchOptions) method.
- Accept the absorber for all pages in the PDF using the Document.getPages().accept(TextFragmentAbsorber) method.
- Get all the occurrences of the text returned by TextFragmentAbsorber.getTextFragments() into a TextFragmentCollection object.
- Loop through each TextFragment in the TextFragmentCollection object and replace the text using the TextFragment.setText(String) method.
- Save the updated PDF file using the Document.save(String) method.
The following code sample shows how to find and replace text in PDF using a regular expression.
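The regular expression steps above could be sketched as follows, again assuming Aspose.PDF for Java is on the classpath. The class name ReplaceTextWithRegex, the replaceByRegex helper, and the sample pattern \d{4}-\d{4} (a year range such as 1999-2000) are assumptions made for illustration.

```java
import com.aspose.pdf.Document;
import com.aspose.pdf.TextFragment;
import com.aspose.pdf.TextFragmentAbsorber;
import com.aspose.pdf.TextFragmentCollection;
import com.aspose.pdf.TextSearchOptions;

public class ReplaceTextWithRegex {

    // Hypothetical helper: replaces every fragment matching the regular
    // expression and returns the number of fragments updated.
    static int replaceByRegex(String inputPath, String outputPath,
                              String regex, String replacement) {
        // Load the PDF file
        Document document = new Document(inputPath);

        // Initialize the absorber with the regular expression
        TextFragmentAbsorber absorber = new TextFragmentAbsorber(regex);

        // Passing true tells the absorber to treat the phrase as a regular expression
        absorber.setTextSearchOptions(new TextSearchOptions(true));

        // Accept the absorber for all pages of the document
        document.getPages().accept(absorber);

        // Get the fragments that matched the pattern
        TextFragmentCollection fragments = absorber.getTextFragments();

        // Replace the text of each matching fragment
        int replaced = 0;
        for (TextFragment fragment : (Iterable<TextFragment>) fragments) {
            fragment.setText(replacement);
            replaced++;
        }

        // Save the updated PDF file
        document.save(outputPath);
        return replaced;
    }

    public static void main(String[] args) {
        if (args.length < 2) {
            System.out.println("Usage: ReplaceTextWithRegex <input.pdf> <output.pdf>");
            return;
        }
        // Sample pattern: four digits, a hyphen, four digits (e.g. 1999-2000)
        int count = replaceByRegex(args[0], args[1], "\\d{4}-\\d{4}", "ABCD-EFGH");
        System.out.println("Replaced " + count + " fragment(s)");
    }
}
```

Remember that backslashes must be doubled inside Java string literals, so the pattern \d{4}-\d{4} is written as "\\d{4}-\\d{4}" in the source code.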
Free Java Library to Replace Text in PDF
You can get a free temporary license to find and replace text in PDF without any limitations.
Explore Java PDF Library
You can explore more features of the Java PDF library in its documentation.
Conclusion
In this article, you have learned how to find and replace text in PDF using Java, either across the whole document or on a particular page. Furthermore, you have seen how to use a regular expression to search and replace text following a particular pattern.