Nayyer Shahbaz, August 6, 2015

Convert PDF file to PDF/A-3 (3a and 3b), RGB colorspace to Grayscale conversion and the power to manipulate tables in existing PDF with Aspose.Pdf for .NET 10.6.0

We are pleased to announce the release of Aspose.Pdf for .NET 10.6.0. This version adds new features that broaden what your applications can do when creating and manipulating PDF files, and it produces output files with high fidelity. Ease of use, extensive documentation and free technical support are salient features of our APIs, and we always strive to meet our customers' expectations, because we believe customer satisfaction is our measure of quality. In every release we analyze our customers' requirements closely and attend to even minor details, so that the features we ship produce good output and save you from writing large amounts of code. All of this is available through a single API rather than numerous components, and many tasks take only a couple of lines of code. As always, this release brings both new features and enhancements.

PDF to PDF/A-3 with compliance-level (3a, 3b)

PDF to PDF/A conversion and PDF/A compliance validation have been supported by our API for quite some time, and we periodically enhance both. The following code converts a PDF file to the PDF/A-3b compliant format.

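// Requires: using System.IO; using Aspose.Pdf;
string inFile = "input.pdf";
string outFile = "output.pdf";
Document doc = new Document(inFile);
// Convert to PDF/A-3b; the conversion log is written to the stream and elements that cannot be converted are deleted
doc.Convert(new MemoryStream(), PdfFormat.PDF_A_3B, ConvertErrorAction.Delete);
// Save the converted document
doc.Save(outFile);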
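
The paragraph above also mentions PDF/A compliance validation. The following is a minimal sketch of checking a file against PDF/A-3b. It assumes the Document.Validate(string, PdfFormat) overload, which writes an XML log and returns a bool. The file names are placeholders, not taken from the release.

```csharp
using System;
using Aspose.Pdf;

class ValidateSketch
{
    static void Main()
    {
        // "output.pdf" and "validation-log.xml" are placeholder names
        using (Document doc = new Document("output.pdf"))
        {
            // Validate writes any problems it finds to the log file
            // and returns true when the document is compliant
            bool isCompliant = doc.Validate("validation-log.xml", PdfFormat.PDF_A_3B);
            Console.WriteLine(isCompliant
                ? "PDF/A-3b compliant"
                : "Not compliant, see validation-log.xml");
        }
    }
}
```

For level 3a, the same calls would presumably take PdfFormat.PDF_A_3A instead. That enumeration value is an assumption based on the naming of PDF_A_3B.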

Converting a PDF from RGB colorspace to Grayscale

We received a request to convert a PDF from the RGB colorspace to grayscale so that the files print faster. Converting to grayscale also reduces the size of the document, although its quality may drop. This capability is currently available through the Preflight feature of Adobe Acrobat, but for office automation Aspose.Pdf lets you perform the conversion programmatically. The following code snippet accomplishes this.

// load source PDF file
using (Document document = new Document(@"c:\pdftest\candy.pdf"))
{
    Aspose.Pdf.Engine.Presentation.RgbToDeviceGrayConversionStrategy strategy = new Aspose.Pdf.Engine.Presentation.RgbToDeviceGrayConversionStrategy();
    for (int idxPage = 1; idxPage <= document.Pages.Count; idxPage++)
    {
        // get instance of particular page inside PDF
        Page page = document.Pages[idxPage];
        // convert the RGB colorspace image to GrayScale colorspace
        strategy.Convert(page);
    }
    // save resultant file (the output file name is a placeholder)
    document.Save(@"c:\pdftest\candy_gray.pdf");
}
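
To give a feel for what a grayscale conversion does to each color sample, here is a minimal, library-independent sketch. It is not a description of how RgbToDeviceGrayConversionStrategy works internally, which this post does not cover. The ToGray helper and the Rec. 601 luma weights are assumptions chosen purely for illustration.

```csharp
using System;

static class GrayscaleSketch
{
    // Hypothetical helper: collapse one RGB sample into a single gray value
    // using Rec. 601 luma weights (0.299, 0.587, 0.114) in integer arithmetic.
    public static byte ToGray(byte r, byte g, byte b)
    {
        // Adding 500 rounds to the nearest integer before dividing by 1000
        return (byte)((299 * r + 587 * g + 114 * b + 500) / 1000);
    }

    static void Main()
    {
        // Three channels in, one channel out
        Console.WriteLine(ToGray(255, 0, 0));     // pure red
        Console.WriteLine(ToGray(255, 255, 255)); // white
    }
}
```

Storing one value per sample instead of three is one reason a grayscale file tends to be smaller than its RGB source.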

Manipulate tables in existing PDF document

One of the earliest features supported by Aspose.Pdf for .NET is working with tables: it provides great support for adding tables to PDF files generated from scratch or to existing PDF files. You can also integrate a table with a database (DOM) to create dynamic tables based on database contents. In this release we have implemented a new feature for searching and parsing simple tables that already exist on a page of a PDF document. A new class named Aspose.Pdf.Text.TableAbsorber provides these capabilities. Using TableAbsorber is very similar to using the existing TextFragmentAbsorber class. The following code snippet shows the steps to update the contents of a particular table cell.

// load existing PDF file
Document pdfDocument = new Document(inFile);
// Create TableAbsorber object to find tables
TableAbsorber absorber = new TableAbsorber();

// Visit first page with absorber
absorber.Visit(pdfDocument.Pages[1]);

// Get the first table on the page, its first cell, and the first text fragment in that cell
TextFragment fragment = absorber.TableList[0].RowList[0].CellList[0].TextFragments[1];

// Change the text of that fragment
fragment.Text = "hi world";

// Save the updated document
pdfDocument.Save(outFile);
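
The same properties can be walked in loops to read every cell rather than a single one. The sketch below prints the text of all fragments found in tables on the first page. It uses only the TableList, RowList, CellList and TextFragments members shown above, with var so that no element type names are assumed. The input file name is a placeholder.

```csharp
using System;
using Aspose.Pdf;
using Aspose.Pdf.Text;

class TableDumpSketch
{
    static void Main()
    {
        // "input.pdf" is a placeholder file name
        using (Document pdfDocument = new Document("input.pdf"))
        {
            TableAbsorber absorber = new TableAbsorber();
            // Collect the tables on the first page
            absorber.Visit(pdfDocument.Pages[1]);

            // Walk tables, rows, cells and the text fragments inside each cell
            foreach (var table in absorber.TableList)
                foreach (var row in table.RowList)
                    foreach (var cell in row.CellList)
                        foreach (TextFragment fragment in cell.TextFragments)
                            Console.WriteLine(fragment.Text);
        }
    }
}
```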


Related features that are not yet implemented:

  • One customer asked to fetch data based on the blocks or borders of a table (as given in the diagram), and on colors as well. TableAbsorber cannot currently recognize table cell background color. We expect to make this improvement in the future and have created a separate ticket, PDFNEWNET-38997, in our issue tracking system.
  • Another customer wants to get the contents of a column in a table. TableAbsorber cannot currently recognize tables without borders. Conversion to XLS works well in such cases, but it is only a workaround, so we have created ticket PDFNEWNET-38998 to improve TableAbsorber for such table types.
  • A customer wants to update a table in an existing PDF dynamically, including deleting and inserting rows. This request is difficult to implement and the current implementation of TableAbsorber cannot fulfill it. We have created ticket PDFNEWNET-38999 to investigate the request.
  • If you need to read the text property of Aspose.Pdf.Cells or BaseParagraph types (these types are designed for adding new content to the page), you must cast BaseParagraph to one of its inherited types. For example, the following code should help:
    foreach (Row r in table.Rows)
    {
        // Paragraphs holds BaseParagraph items, so cast to the concrete type
        TextFragment fragment = r.Cells[1].Paragraphs[1] as TextFragment;
        string text;
        if (fragment != null)
            text = fragment.Text;
    }

XML to PDF: Working with individual objects in new generator

The XmlLoadOptions class lets you load an XML document and convert it to PDF format. We recently received a requirement to work with individual objects (as supported in the legacy Aspose.Pdf.Generator) when using this approach.

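// Load the XML source ("input.xml" is a placeholder file name)
Document doc = new Document("input.xml", new XmlLoadOptions());
// Access an individual object: start the first paragraph on page 1 on a new page
doc.Pages[1].Paragraphs[0].IsInNewPage = true;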

Clear() method in PageCollection in new generator

With the legacy Aspose.Pdf.Generator approach, you can create a separate list that holds the pages already created. After creating multiple pages, you typically clear the PDF and then add the pages back from your own list. The following code snippet provides the same capability in the new Document Object Model.

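// Requires: using System.Collections.Generic;
string outFile = "38613.pdf";
Document doc = new Document();
Page page = doc.Pages.Add();
page.Paragraphs.Add(new TextFragment("text"));
page.Paragraphs.Add(new TextFragment("text1"));
// Keep the page in a separate list, then remove all pages from the document
List<Page> savedPages = new List<Page> { page };
doc.Pages.Clear();
Assert.IsTrue(doc.Pages.Count == 0);
// Add the saved page back from the list
doc.Pages.Add(savedPages[0]);
Assert.IsTrue(doc.Pages.Count == 1);
doc.Save(outFile);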

Miscellaneous fixes

In addition to the enhancements and features discussed above, this release includes specific improvements to the PDF to HTML and HTML to PDF conversion features, with better support for HTML5. It also improves PCL to PDF, SVG to PDF, PDF to Excel, PDF to DOC, PDF to TIFF and TIFF to PDF conversion, conversion of PDF to PDF/A compliant documents, text replacement, filling of a signature field with an image, flattening of PDF, rendering of PDF to XPS format, FloatingBox rendering, FootNote and EndNote handling, and rendering of non-English (specifically Arabic) content. Please download and try the latest release of Aspose.Pdf for .NET 10.6.0.
