Scanned PDF to Excel OCR

Scanned PDF files store their content as images, so the data in them cannot be edited directly. When a scanned PDF contains numeric information that you need to manipulate in Excel, you can run OCR on the document and save the result as an Excel file. This article covers how to create a scanned PDF to Excel converter with OCR programmatically using C#.

Create Scanned PDF to Excel Converter with OCR – C# API Installation

The OCR features used in this article are offered by the Aspose.OCR for .NET API. To set it up, either download the DLL file from the New Releases section or install it with the NuGet command below:

PM> Install-Package Aspose.OCR

Convert Scanned PDF to Excel Programmatically in C#

You can convert a scanned PDF document to an Excel file with OCR by following the steps below:

  1. Instantiate an AsposeOcr class object.
  2. Create a DocumentRecognitionSettings class object.
  3. Recognize the scanned PDF file with the RecognizePdf method.
  4. Save the output Excel file using the SaveMultipageDocument method.

The following code sample explains how to convert a scanned PDF to Excel using C#:
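A minimal sketch of the four steps above. The class and method names (AsposeOcr, DocumentRecognitionSettings, RecognizePdf, SaveMultipageDocument) come from the steps listed in this article. The file names Sample.pdf and Output.xlsx are placeholders, and the exact argument order and the SaveFormat.Xlsx value are assumptions that may differ between releases, so check them against the API reference for your installed version.

```csharp
using System.Collections.Generic;
using Aspose.OCR;

class ScannedPdfToExcel
{
    static void Main()
    {
        // Step 1: create the OCR engine object.
        AsposeOcr api = new AsposeOcr();

        // Step 2: create document recognition settings with default values.
        DocumentRecognitionSettings settings = new DocumentRecognitionSettings();

        // Step 3: run OCR on the scanned PDF ("Sample.pdf" is a placeholder path).
        List<RecognitionResult> results = api.RecognizePdf("Sample.pdf", settings);

        // Step 4: write the recognized pages to an Excel workbook.
        // Assumption: SaveMultipageDocument is static and accepts SaveFormat.Xlsx.
        AsposeOcr.SaveMultipageDocument("Output.xlsx", SaveFormat.Xlsx, results);
    }
}
```

Default settings are used here to keep the example short. Options on the settings object, such as the page range to process, can be adjusted before calling RecognizePdf if the version you use exposes them.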

Get Free Evaluation License

You can evaluate the feature of converting scanned PDF to Excel in its full capacity by requesting a free temporary license.
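Once you have the license file, it is typically applied once at application startup, before any OCR call. The snippet below is a sketch that assumes the license is stored next to the executable under the placeholder name Aspose.OCR.lic; adjust the path to match your setup.

```csharp
using Aspose.OCR;

class ApplyLicense
{
    static void Main()
    {
        // "Aspose.OCR.lic" is a placeholder path to the temporary license file.
        License license = new License();
        license.SetLicense("Aspose.OCR.lic");
    }
}
```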

Conclusion

In this article, you have learned how to convert a scanned PDF file to Excel by applying OCR to recognize the text. This can be helpful when, for example, a printed CSV file or spreadsheet has been scanned and saved as a PDF file, and you need to turn it back into an editable Excel file programmatically using C#. Furthermore, you can explore other OCR-related features offered by the API by going through the documentation. In case of any queries, please feel free to contact us at the forum.
