Get Warning Against Font Substitution, EMFDevice Implementation, Escape HTML Tags and Special Characters during Conversion and Default Font Usage with Aspose.Pdf for Java 11.5.0


A new release of Aspose.Pdf for Java has been published with new features that make creating and manipulating PDF files more convenient. We have always striven to provide simple and robust features in our API that help our customers meet complex requirements with simple code snippets. Our APIs offer extensive capabilities for document manipulation and generation: even complex work can take as little as two lines of code, and users can convert between file formats with simple code snippets. The key to our APIs is their simple and workable approach, which helps in creating applications with fewer lines of code. In this release we have provided some highly requested features as well as fixes for issues reported in earlier releases.

Get warnings for font substitution

One of our customers recently needed to show a warning when fonts are substituted. This is useful in tests, because it lets a test fail quickly when a font is missing, rather than leaving you to track down why the result looks incorrect. To meet this requirement, try the following code snippet, in which the Document class receives notifications about font substitutions.

import java.util.HashMap;
import java.util.Map;
import com.aspose.pdf.*;

// Load existing PDF file
Document pdfDoc = new Document(inFile);

// Map of original font name -> substituted font name
final Map<String, String> names = new HashMap<String, String>();
pdfDoc.FontSubstitution.add(new Document.FontSubstitutionHandler() {
    public void invoke(Font font, Font newFont) {
        // add substituted font names into the map
        names.put(font.getFontName(), newFont.getFontName());
        // or print the message to the console
        System.out.println("Warning: Font " + font.getFontName() + " was substituted with another font -> " + newFont.getFontName());
    }
});
// instantiate HtmlSaveOptions to save output in HTML
HtmlSaveOptions htmlSaveOps = new HtmlSaveOptions();
// save resultant file
pdfDoc.save("output.html", htmlSaveOps);
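
The point about failing tests quickly can be sketched in plain Java. The helper below is hypothetical and not part of Aspose.Pdf: it assumes a map like `names` above has been filled by the substitution handler, and it simply raises an exception listing every substitution that was recorded.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical test helper; the class and method names are assumptions for illustration.
final class FontSubstitutionCheck {

    // Builds a readable report with one "original -> substitute" pair per line.
    static String describe(Map<String, String> substitutions) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : substitutions.entrySet()) {
            sb.append(e.getKey()).append(" -> ").append(e.getValue()).append('\n');
        }
        return sb.toString();
    }

    // Throws if any substitution was recorded, so the calling test fails immediately.
    static void assertNoSubstitutions(Map<String, String> substitutions) {
        if (!substitutions.isEmpty()) {
            throw new IllegalStateException("Fonts were substituted:\n" + describe(substitutions));
        }
    }

    public static void main(String[] args) {
        Map<String, String> names = new LinkedHashMap<>();
        assertNoSubstitutions(names);        // passes: nothing recorded yet
        names.put("Helvetica", "Arial");     // sample entry, as the handler would add it
        try {
            assertNoSubstitutions(names);
        } catch (IllegalStateException ex) {
            System.out.println(ex.getMessage());
        }
    }
}
```

In a real test, `assertNoSubstitutions(names)` would be called after the document has been saved, since substitutions are reported during conversion.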

EmfDevice implementation

We have implemented the EmfDevice class to convert PDF pages to EMF format. To do so, try the following code snippet.

// instantiate EmfDevice object
EmfDevice device = new EmfDevice(new Resolution(96));
// load existing PDF file
Document doc = new Document("Input.pdf");
// save first page of PDF file as Emf image
device.process(doc.getPages().get_Item(1), "output.emf");

HTML to PDF – escape HTML tags and special characters

One of our customers recently asked for a built-in feature to escape HTML tags and unescape special characters when writing them to a PDF, while ignoring any kind of text formatting (for example, replacing </li> with "\r\n").

// input HTML
String HTML = "<b>BIG TEXT</b><ol>SOME VALUE</ol><li>item1</li><li>item2 & 3 </li></ol>";
// CSS for input HTML contents (wrapped in a style element so it is applied rather than rendered as text)
String CSS = "<style> * {font-weight : normal !important ; margin :0 !important ; padding:0 !important ; list-style-type:none !important}</style>";
// instantiate Document instance
Document doc = new Document();
// add page to pages collection of Document object
Page page = doc.getPages().add();
// add HtmlFragment to paragraphs collection of PDF page
page.getParagraphs().add(new com.aspose.pdf.HtmlFragment(CSS + HTML));
// save resultant PDF file
doc.save("output.pdf");
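
To make the requirement concrete, here is a minimal plain-Java sketch of the text transformation being asked for: closing list items become line breaks, the remaining tags are dropped, and a few common entities are unescaped. It is not how Aspose.Pdf does this internally; the class name `HtmlToPlainText` and its method are hypothetical, and the regular expressions only handle simple markup.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not the Aspose.Pdf implementation.
final class HtmlToPlainText {

    // Matches a closing list-item tag, tolerating stray spaces such as "< /li >".
    private static final Pattern LIST_ITEM_END = Pattern.compile("(?i)<\\s*/\\s*li\\s*>");
    // Matches any remaining tag.
    private static final Pattern ANY_TAG = Pattern.compile("<[^>]*>");

    static String toPlainText(String html) {
        // 1. Closing list items become line breaks.
        String text = LIST_ITEM_END.matcher(html).replaceAll("\r\n");
        // 2. All other tags are removed, keeping only their text content.
        text = ANY_TAG.matcher(text).replaceAll("");
        // 3. Unescape a few common entities; "&amp;" goes last so that
        //    "&amp;lt;" ends up as the literal text "&lt;" and not "<".
        return text.replace("&lt;", "<")
                   .replace("&gt;", ">")
                   .replace("&quot;", "\"")
                   .replace("&amp;", "&");
    }

    public static void main(String[] args) {
        System.out.print(toPlainText("<b>BIG TEXT</b><li>item1</li><li>item2 &amp; 3</li>"));
    }
}
```

A regex-based approach like this is only suitable for simple, trusted input; an HTML parser would be the safer choice for arbitrary markup.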

Default font when specific font from document is missing

When PDF files are converted to DOC or HTML format, the fonts used inside the PDF file are used in the resultant files so that the formatting of the document is preserved. However, some customers recently reported a "Font not found" issue when converting PDF files to DOC or HTML format in non-Windows environments. As a workaround, the user has to install the respective font on the system, but sometimes the user is not certain which font to use. Customers therefore requested a feature that uses a default font when specific fonts used inside the document are missing (not installed on the system). Instead of an exception being thrown, they would like one of the default fonts (e.g. Arial) to be used during conversion, possibly with a console message stating that the default font was used because font ABC was missing. Note, however, that we do not guarantee that the substituted font will correctly show all the characters, so you should choose a font that is compatible with the missing original font yourself. We have also implemented the ability to get a notification when a font is substituted.

Document pdf = new Document(myDir + "Redis.pdf");

// configure font substitution
CustomSubst1 subst1 = new CustomSubst1();
// register the substitution; this step is assumed, since without it subst1 is never used
FontRepository.getSubstitutions().add(subst1);
// configure notifier to console
pdf.FontSubstitution.add(new Document.FontSubstitutionHandler() {
    public void invoke(Font font, Font newFont) {
        // print substituted font names to the console
        System.out.println("Warning: Font " + font.getFontName() + " was substituted with another font -> " + newFont.getFontName());
    }
});
HtmlSaveOptions htmlSaveOps = new HtmlSaveOptions();
pdf.save("Redis_1150_substitutedWithMSGothic_release.html", htmlSaveOps);

/**
 * The class to implement font substitution
 */
private static class CustomSubst1 extends CustomFontSubstitutionBase {
    public boolean trySubstitute(OriginalFontSpecification originalFontSpecification, /*out*/ com.aspose.pdf.Font[] substitutionFont) {
        // 1. substitute Arial font with TimesNewRoman font
        // if ("Arial".equals(originalFontSpecification.getOriginalFontName())) {
        //     substitutionFont[0] = FontRepository.findFont("TimesNewRoman");
        //     return true;
        // } else {
        //     return super.trySubstitute(originalFontSpecification, /*out*/ substitutionFont);
        // }
        // 2. or substitute all the fonts with the MSGothic font
        substitutionFont[0] = FontRepository.findFont("MSGothic");
        return true;
    }
}
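
The fallback behaviour requested by customers (use a default font and print a message instead of throwing an exception) can be sketched in plain Java, independently of Aspose.Pdf. The class `FontFallback`, its `resolve` method and the sample font names are hypothetical; the set of installed fonts is passed in as a parameter so that the logic stays easy to test.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of "use a default font when the requested one is missing".
final class FontFallback {

    // Returns the requested font when it is installed; otherwise prints a
    // warning and returns the default font instead of failing.
    static String resolve(String requested, Set<String> installed, String defaultFont) {
        if (installed.contains(requested)) {
            return requested;
        }
        System.out.println("Warning: default font " + defaultFont
                + " was used because font " + requested + " was not found");
        return defaultFont;
    }

    public static void main(String[] args) {
        // Sample data; on a real system the names could come from
        // java.awt.GraphicsEnvironment#getAvailableFontFamilyNames().
        Set<String> installed = new HashSet<>(Arrays.asList("Arial", "DejaVu Sans"));
        System.out.println(resolve("Arial", installed, "Arial"));
        System.out.println(resolve("MS Gothic", installed, "Arial"));
    }
}
```

As the note above says, a fallback font may not contain every glyph of the original, so the default should be chosen with the document's character set in mind.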

Miscellaneous fixes

In addition to the enhancements and features discussed above, there have been specific improvements to PDF to HTML, PDF to PDF/A, HTML to PDF and EPUB to PDF conversion, conversion of non-searchable PDF files to searchable PDF files, text extraction from PDF, and image placement inside PDF. Please download and try the latest release of Aspose.Pdf for Java 11.5.0.