Nayyer Shahbaz, December 11, 2014

PDF to DOCX or Single HTML with All Resources Embedded, Optimized TIFF to PDF, PDF to Image Conversion with Efficient Memory Use in Aspose.Pdf for Java 9.7.0


It gives us immense pleasure to announce the release of Aspose.Pdf for Java 9.7.0, which offers some great new features. Document manipulation, conversion from various formats to PDF, and conversion of PDF to other file formats have always been our main development areas. Keeping this tradition alive, this release provides a couple of new features as well as API improvements for handling complex scenarios in image to PDF conversion, PDF to image conversion, text extraction, watermark manipulation and much more.

PDF to DOCX Conversion

Aspose.Pdf for Java already supports rendering PDF files to Microsoft Word (DOC) format through the DocSaveOptions class. This class also provides numerous properties that improve the process of converting PDF files to DOC format. Among these properties, Mode enables you to specify a recognition mode for PDF content. You can set it to any value from the RecognitionMode enumeration; each of these values has specific benefits and limitations. Starting with this release, Aspose.Pdf for Java can also convert PDF files to DOCX format. For further information, please visit Convert PDF to DOC or DOCX Format.

// Load source PDF file
com.aspose.pdf.Document doc = new com.aspose.pdf.Document("c:/source.pdf");
// Instantiate Doc SaveOptions instance
DocSaveOptions saveOptions = new DocSaveOptions();
// Set output file format as DOCX
saveOptions.setFormat(DocSaveOptions.DocFormat.DocX);
// Save output DOCX file
doc.save("c:/resultant.docx", saveOptions);
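
After a conversion like the one above, it can be useful to confirm that the output really is a DOCX package and not a legacy DOC file. A DOCX file is a ZIP archive whose main part is word/document.xml, so a quick check is possible with the Java standard library alone. The following is a minimal sketch, not part of Aspose.Pdf; the class name DocxCheck and the method looksLikeDocx are hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper, not part of the Aspose.Pdf API.
class DocxCheck {

    // Returns true if the stream is a ZIP archive that contains
    // word/document.xml, the main part of a DOCX package.
    // Input that is not a ZIP archive yields no entries, so it returns false.
    static boolean looksLikeDocx(InputStream in) throws IOException {
        try (ZipInputStream zip = new ZipInputStream(in)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if ("word/document.xml".equals(entry.getName())) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) throws IOException {
        // Pass the path of the converted file as the first argument.
        if (args.length == 0) {
            System.out.println("usage: java DocxCheck <file>");
            return;
        }
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            System.out.println(looksLikeDocx(in) ? "DOCX package" : "not a DOCX package");
        }
    }
}
```

This only inspects the container structure; it does not validate the document content itself.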

Convert PDF to HTML with All Resources Embedded

HTML to PDF and PDF to HTML conversion are among the features our customers use most. During PDF to HTML conversion, all resources (fonts, images and CSS) from the PDF file are saved in a separate folder in the same directory as the output HTML. However, we were asked to provide a way to convert a PDF file to a single HTML file with all resources embedded. The current release of Aspose.Pdf for Java offers this feature. For further details, please visit PDF to HTML – Single HTML with all Resources Embedded.

// Load source PDF file
com.aspose.pdf.Document doc = new com.aspose.pdf.Document("c:/input.pdf");
// Instantiate HTML Save options object
HtmlSaveOptions newOptions = new HtmlSaveOptions();

// Enable option to embed all resources inside the HTML
newOptions.PartsEmbeddingMode = HtmlSaveOptions.PartsEmbeddingModes.EmbedAllIntoHtml;

// This is just optimization for IE and can be omitted 
newOptions.LettersPositioningMethod = LettersPositioningMethods.UseEmUnitsAndCompensationOfRoundingErrorsInCss;
newOptions.RasterImagesSavingMode = HtmlSaveOptions.RasterImagesSavingModes.AsEmbeddedPartsOfPngPageBackground;
newOptions.FontSavingMode = HtmlSaveOptions.FontSavingModes.SaveInAllFormats;
// Output file path
String outHtmlFile = "c:/Single_output.html";
// Save the output file
doc.save(outHtmlFile, newOptions);
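
To illustrate what "embedded" means for a single-file HTML document in general: external resources such as CSS, fonts and images are typically inlined as Base64 data URIs instead of being referenced as separate files. The sketch below uses only the Java standard library and makes no claim about how Aspose.Pdf builds its output internally; the class name DataUriSketch and the method toDataUri are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical illustration, not part of the Aspose.Pdf API.
class DataUriSketch {

    // Encodes raw resource bytes as a Base64 data URI (RFC 2397),
    // which can be used anywhere HTML expects a URL.
    static String toDataUri(String mimeType, byte[] content) {
        return "data:" + mimeType + ";base64,"
                + Base64.getEncoder().encodeToString(content);
    }

    public static void main(String[] args) {
        // A tiny stylesheet that would otherwise live in a separate .css file.
        byte[] css = "p{color:red}".getBytes(StandardCharsets.UTF_8);

        // The stylesheet travels inside the HTML itself, so no
        // resource folder is needed next to the output file.
        String link = "<link rel=\"stylesheet\" href=\""
                + toDataUri("text/css", css) + "\">";
        System.out.println(link);
    }
}
```

The trade-off of this approach is size: Base64 encoding makes each embedded resource roughly one third larger than the original file.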

Miscellaneous Fixes

In addition to the enhancements and features discussed above, there have been numerous fixes related to HTML to PDF conversion, PDF to Excel conversion, XPS to PDF conversion, PDF to TIFF conversion, text replacement, text extraction, rendering PDF files to XPS, creating TOCs in PDF files, and printing PDFs with embedded fonts. Please download and try the latest Aspose.Pdf for Java 9.7.0 release.
