PDF files are popular because they support text, images, animations, videos, and many kinds of annotations. However, text is the most important part of most PDF documents. In this article, we will convert a PDF file to a TXT file, and a TXT file to PDF format, using C# .NET. This conversion comes in handy when you are concerned only with the text contents of a PDF document. The article covers the following topics:
- TXT to PDF or PDF to TXT Converter API
- Convert PDF to TEXT File without Formatting using C# or VB.NET
- Convert PDF to TXT File with Formatting Routines using C# or VB.NET
- Convert TXT File to PDF Programmatically using C# or VB.NET
TXT to PDF or PDF to TXT Converter API
PDF to TXT conversion, as well as TXT to PDF conversion, is helpful when your main concern is the text used in the document. You can convert these file formats into one another in a few simple steps using the Aspose.PDF for .NET API. Because it is a .NET-based API, you can use it from C# as well as VB.NET. You can install the API in your .NET applications either by downloading the DLL files from Downloads or via the NuGet gallery.
After installing the API, let us convert a PDF file to a TXT file using two different approaches:
Convert PDF to TEXT File without Formatting using C# or VB.NET
First, we will convert the PDF to text without any formatting routine. The text contents are extracted as-is, so the output text does not preserve the layout of the input PDF file. Follow the steps below to convert PDF to TXT:
- Load input PDF document
- Initialize an instance of StringBuilder Class
- Iterate through each page of the PDF document
- Read Text using TextDevice and Raw mode
- Save output Text as a TXT file
The code snippet below shows how to convert PDF to TXT file using C# or VB in .NET Framework:
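A minimal sketch of these steps might look like the following. It assumes the Aspose.PDF for .NET package is referenced; the class name `PdfToTextRaw`, the method name `Extract`, and both file paths are hypothetical names chosen for illustration. The sketch also assumes that `TextDevice` writes its output in its default Unicode (UTF-16) encoding, which is why the stream is decoded with `Encoding.Unicode`.

```csharp
using System.IO;
using System.Text;
using Aspose.Pdf;
using Aspose.Pdf.Devices;
using Aspose.Pdf.Text;

public static class PdfToTextRaw
{
    // Extracts the text of every page in Raw mode and writes it to a TXT file.
    // pdfPath and txtPath are placeholders supplied by the caller.
    public static void Extract(string pdfPath, string txtPath)
    {
        // Load the input PDF document
        using (Document pdfDocument = new Document(pdfPath))
        {
            // Collect the text of all pages here
            StringBuilder builder = new StringBuilder();

            // Iterate through each page of the PDF document
            foreach (Page pdfPage in pdfDocument.Pages)
            {
                using (MemoryStream textStream = new MemoryStream())
                {
                    // Raw mode: text is emitted without layout reconstruction
                    TextDevice textDevice = new TextDevice();
                    textDevice.ExtractionOptions = new TextExtractionOptions(
                        TextExtractionOptions.TextFormattingMode.Raw);

                    // Render the page text into the memory stream
                    textDevice.Process(pdfPage, textStream);

                    // Assumption: the device uses its default Unicode encoding
                    builder.Append(Encoding.Unicode.GetString(textStream.ToArray()));
                }
            }

            // Save the collected text as a TXT file
            File.WriteAllText(txtPath, builder.ToString());
        }
    }
}
```

With this sketch, a call such as `PdfToTextRaw.Extract("input.pdf", "output.txt")` would process the whole document. In VB.NET the same classes and members apply with VB syntax.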
Convert PDF to TXT File with Formatting Routines using C# or VB.NET
Now, let us consider the use case where you need to convert the PDF to text while keeping some of its formatting, for example paragraph indentation, tabs, styles, or column-wise layout. You can render the text contents of a PDF document to a TXT file with C# by following the steps below:
- Load source PDF File
- Initialize a string variable
- Read through each page with TextFormattingMode.Pure
- Save converted TXT file
The following code snippet shows how to convert PDF to TXT file with formatting using C# or VB.NET:
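As a rough sketch of these steps, the code below extracts each page with `TextFormattingMode.Pure`. It assumes the Aspose.PDF for .NET package is referenced; the class name `PdfToTextPure`, the method name `Extract`, and the file paths are hypothetical, and the stream is decoded as Unicode on the assumption that `TextDevice` uses its default encoding.

```csharp
using System.IO;
using System.Text;
using Aspose.Pdf;
using Aspose.Pdf.Devices;
using Aspose.Pdf.Text;

public static class PdfToTextPure
{
    // Extracts the text of every page in Pure mode, which tries to keep
    // the layout of the page, and writes the result to a TXT file.
    // pdfPath and txtPath are placeholders supplied by the caller.
    public static void Extract(string pdfPath, string txtPath)
    {
        // Load the source PDF file
        using (Document pdfDocument = new Document(pdfPath))
        {
            // String variable that accumulates the extracted text
            string extractedText = "";

            // Read through each page with TextFormattingMode.Pure
            foreach (Page pdfPage in pdfDocument.Pages)
            {
                using (MemoryStream textStream = new MemoryStream())
                {
                    TextDevice textDevice = new TextDevice();
                    textDevice.ExtractionOptions = new TextExtractionOptions(
                        TextExtractionOptions.TextFormattingMode.Pure);

                    // Render the page text into the memory stream
                    textDevice.Process(pdfPage, textStream);

                    // Assumption: the device uses its default Unicode encoding
                    extractedText += Encoding.Unicode.GetString(textStream.ToArray());
                }
            }

            // Save the converted TXT file
            File.WriteAllText(txtPath, extractedText);
        }
    }
}
```

The only functional difference from Raw extraction is the formatting mode passed to `TextExtractionOptions`. For documents with many pages, a `StringBuilder` would avoid repeated string copies; plain concatenation is used here only to mirror the steps listed above.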
Visual Comparison of PURE and RAW Text Conversion
The following screenshot is a visual comparison of the two approaches we have just discussed. Notice that Pure mode (right-most window) shows the text with the same formatting as in the PDF file (left-most window).
This screenshot of the input and output files lets you decide which conversion approach suits you best.
Convert TXT File to PDF Programmatically using C# or VB.NET
TXT files often contain large amounts of text. You can convert a TXT file to a PDF file with the Aspose.PDF for .NET API. Follow the steps below to perform text to PDF conversion:
- Create an instance of TextReader class
- Initialize a PDF document and Add Blank Page
- Instantiate TextBuilder object
- Read each line of text from input TXT file
- Save output PDF file
The code snippet below shows how to programmatically convert a TXT file to a PDF document using C# or VB.NET:
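One way to sketch these steps is shown below. It assumes the Aspose.PDF for .NET package is referenced. The class name `TextToPdf`, the method name `Convert`, the file paths, and all layout constants (margin, starting height, line spacing) are illustrative assumptions; the constants are chosen for a page of roughly A4 size in points and are not values taken from the API.

```csharp
using System.IO;
using Aspose.Pdf;
using Aspose.Pdf.Text;

public static class TextToPdf
{
    // Layout values are assumptions for a page of about 595 x 842 points.
    private const double LeftMargin = 40.0;
    private const double TopY = 800.0;
    private const double BottomY = 40.0;
    private const double LineHeight = 14.0;

    // Writes each line of a TXT file onto PDF pages, top to bottom.
    // txtPath and pdfPath are placeholders supplied by the caller.
    public static void Convert(string txtPath, string pdfPath)
    {
        // Create a TextReader for the input TXT file
        using (TextReader reader = new StreamReader(txtPath))
        // Initialize a PDF document
        using (Document document = new Document())
        {
            // Add a blank page and a TextBuilder bound to it
            Page page = document.Pages.Add();
            TextBuilder textBuilder = new TextBuilder(page);
            double y = TopY;

            // Read each line of text from the input TXT file
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                // Start a new page when the current one is full
                if (y < BottomY)
                {
                    page = document.Pages.Add();
                    textBuilder = new TextBuilder(page);
                    y = TopY;
                }

                // Empty lines only advance the vertical position
                if (line.Length > 0)
                {
                    TextFragment fragment = new TextFragment(line);
                    fragment.Position = new Aspose.Pdf.Text.Position(LeftMargin, y);
                    textBuilder.AppendText(fragment);
                }

                y -= LineHeight;
            }

            // Save the output PDF file
            document.Save(pdfPath);
        }
    }
}
```

A call such as `TextToPdf.Convert("input.txt", "output.pdf")` would then produce the PDF. Because each fragment is placed at a fixed position, long lines are not wrapped in this sketch; input with very long lines would need wrapping logic or a paragraph-based approach.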
In this article, we have explored converting the text in a PDF file to the TXT file format, with and without formatting. We have also converted the text in a TXT file to a PDF document using C# or VB.NET. If you have any concerns or queries, please feel free to reach out to us on the Free Support Forum or explore the Product Documentation. We would love to hear from you!