About C2S


C2S provides document processing solutions for accounts payable and digital mailrooms.

At the core of our solution are two systems: ABBYY OCR and FileDirector Document Management. Both handle image-based documents such as PDF or TIFF very well. However, ABBYY OCR cannot recognize existing text documents such as Excel or Word files, and FileDirector only allows annotation of scanned documents such as TIFF.

Problem

C2S has always struggled to find a consistent document handling solution that converts documents carefully and accurately. We then came across Aspose.Total for .NET.

Solution

Having previously used PDF# and other open-source libraries, we were more often than not left disappointed by the performance. Our clients required bulletproof processing and accuracy.

During the development of our Santiago tool, a specific customer requirement prompted us to search for a better solution. Santiago is a scheduler that monitors email accounts and downloads attachments that meet set rules. The attachments are then forwarded into a workflow. Santiago effectively feeds a digital mailroom or accounts payable solution.

The customer in question received attachments from over 500 branches around London. The attachments were agendas and minutes of meetings and could arrive in any format, including Word or Excel. We needed to forward these into our FileDirector solution, fully indexed into the correct queue, but the customer also asked to be able to annotate the documents (add text, arrows, highlights, etc.). This appeared to be possible only if we converted the documents into TIFF.

Therefore, we have now integrated the Aspose.Total for .NET library into Santiago, with excellent results.

Experience

The Santiago email downloader solution has been designed to be completely modular and therefore extremely flexible. The system downloads emails, extracts information, and saves it as required. It can save not just the attachments of an email, but also the body (both HTML and plain text) and the metadata (sender address, recipient addresses, subject, date, etc.), and it can save these in a range of formats.

The primary requirement that led us to Aspose.Total for .NET was the need to convert documents (mainly MS Word and Excel formats) into TIFF. We experimented with several different libraries but Aspose.Total for .NET was by far the best solution.

Santiago operates by downloading emails as .eml files into a user-designated folder. From this folder, the Data Extraction module picks up each .eml file, extracts the required documents, and saves them to a further designated folder. Finally, the File2Tiff module picks up the extracted documents, performs the conversion, and saves the results to the final location, where they can be collected by the FileDirector system for archiving.


Image 1:- The workflow for this process

The File2Tiff module examines the extension of the file to be converted and selects the required conversion system:

File2Tiff is then able to utilize the Aspose.Words for .NET and Aspose.Cells for .NET libraries to perform the conversion, making use of the Document and Workbook classes respectively, using only a few lines of code:


Image 2:- Code snippet to determine the file extension
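As a rough illustration of extension-based selection (not the actual Santiago code), the following sketch maps a file name to the library that would handle it. The names `ConverterKind`, `File2TiffSketch`, and `SelectConverter`, as well as the list of extensions, are assumptions made for this example.

```csharp
using System;
using System.IO;

// Hypothetical type: which conversion library a file would be routed to.
public enum ConverterKind { Words, Cells, Unsupported }

// Hypothetical helper, named for illustration only.
public static class File2TiffSketch
{
    // Chooses a converter from the file extension, ignoring case.
    public static ConverterKind SelectConverter(string fileName)
    {
        // Path.GetExtension returns the extension with its leading dot,
        // or an empty string when the name has no extension.
        string extension = Path.GetExtension(fileName).ToLowerInvariant();

        switch (extension)
        {
            case ".doc":
            case ".docx":
                return ConverterKind.Words;       // would be handled by Aspose.Words
            case ".xls":
            case ".xlsx":
                return ConverterKind.Cells;       // would be handled by Aspose.Cells
            default:
                return ConverterKind.Unsupported; // left untouched or logged
        }
    }
}
```

Returning an explicit `Unsupported` value, rather than throwing, lets the calling module decide whether to skip, log, or forward files it cannot convert.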

Aspose.Words for .NET


Image 3:- Code snippet for the conversion of MS Word files to other formats
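For readers without access to the image, a minimal sketch of a Word-to-TIFF conversion with the `Document` class might look like the following. It assumes the Aspose.Words for .NET package is referenced; the class name `WordToTiffSketch` and the method parameters are hypothetical.

```csharp
using Aspose.Words;

// Hypothetical wrapper, named for illustration only.
public static class WordToTiffSketch
{
    // Loads a Word document and saves it as a TIFF image.
    public static void Convert(string inputPath, string outputPath)
    {
        // Document detects the input format (DOC, DOCX, ...) when loading.
        var document = new Document(inputPath);

        // SaveFormat.Tiff asks Aspose.Words to render the pages to TIFF.
        document.Save(outputPath, SaveFormat.Tiff);
    }
}
```

A call such as `WordToTiffSketch.Convert("minutes.docx", "minutes.tiff")` would then produce the image file, with both file names here being placeholders.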

Aspose.Cells for .NET


Image 4:- Conversion of Excel files to Image format
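The equivalent step for spreadsheets can be sketched with the `Workbook` class together with the rendering classes in Aspose.Cells for .NET. This is an assumption about one reasonable way to do it in recent versions of the library, not a copy of the code in the image; `ExcelToTiffSketch` is a hypothetical name, and older releases may expose the image format option differently.

```csharp
using Aspose.Cells;
using Aspose.Cells.Drawing;
using Aspose.Cells.Rendering;

// Hypothetical wrapper, named for illustration only.
public static class ExcelToTiffSketch
{
    // Loads a workbook and renders it to a single TIFF file.
    public static void Convert(string inputPath, string outputPath)
    {
        var workbook = new Workbook(inputPath);

        // Request TIFF output from the renderer.
        var options = new ImageOrPrintOptions
        {
            ImageType = ImageType.Tiff
        };

        // WorkbookRender renders the whole workbook rather than one sheet.
        var renderer = new WorkbookRender(workbook, options);
        renderer.ToImage(outputPath);
    }
}
```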

A particularly useful feature of both libraries is the ability to specify image saving options. This allows us to provide the end-user with the flexibility to specify the resolution at which they wish to save the document, selecting between saving in color or greyscale and choosing particular pages of a document to convert:


Image 5:- Setting resolution options while converting DOC to TIFF format
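To make the options described above concrete, here is a hedged sketch using `ImageSaveOptions` from Aspose.Words for .NET. The wrapper name and its parameters are hypothetical, and the `PageSet` property assumes a reasonably recent version of the library (older releases used separate page index and page count properties instead).

```csharp
using Aspose.Words;
using Aspose.Words.Saving;

// Hypothetical wrapper, named for illustration only.
public static class WordToTiffWithOptionsSketch
{
    // Converts a page range of a Word document to TIFF at a chosen resolution.
    // firstPage and lastPage are zero-based page indices.
    public static void Convert(string inputPath, string outputPath,
                               float dpi, bool greyscale,
                               int firstPage, int lastPage)
    {
        var document = new Document(inputPath);

        var options = new ImageSaveOptions(SaveFormat.Tiff)
        {
            // Sets both horizontal and vertical resolution, in dots per inch.
            Resolution = dpi,

            // Grayscale output when requested; otherwise keep the original colors.
            ImageColorMode = greyscale ? ImageColorMode.Grayscale : ImageColorMode.None,

            // Restrict rendering to the requested pages.
            PageSet = new PageSet(new PageRange(firstPage, lastPage))
        };

        document.Save(outputPath, options);
    }
}
```

On the Aspose.Cells side, `ImageOrPrintOptions` offers comparable `HorizontalResolution` and `VerticalResolution` properties, so the same user-facing settings could be passed to either library.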

Summary

The Aspose.Total for .NET package offers significantly more functionality than we have implemented so far. We will continue to enhance the Santiago system to harness the power of this product, and we will revisit the existing modules to ascertain whether they could also benefit from Aspose.