Convert DOC/DOCX to TXT in Python

As a programmer, you may need to extract the plain text from a batch of Word DOC/DOCX files from within your Python applications. This article shows a simple way to do that: you will learn how to convert a DOCX or DOC file to TXT in Python.

MS Word is a popular word-processing application that allows you to create rich text documents. A wide range of documents are created in MS Word, including invoices, technical documents, and reports. So let’s see how to perform Word to TXT conversion in Python.

Python DOCX to TXT Converter

For Word to TXT conversion, we will use Aspose.Words for Python. It is a document-processing library that can manipulate popular text document formats, including DOC and DOCX, and it makes processing and retrieving text from Word documents straightforward. You can also try the Word to TXT conversion for free.

You can use the following pip command to install Aspose.Words for Python in your environment.

pip install aspose-words

How to Convert DOCX to TXT in Python

Aspose.Words for Python reduces the DOCX to TXT conversion to two steps:

  • Load the DOCX file from disk.
  • Save the document in TXT format to the desired location.

You do not need to parse the whole Word document page by page or line by line to extract the text from it. Let’s now have a look at how to perform these steps in Python to convert a DOCX file to TXT format.
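
The two steps can be sketched in a few lines. This is a minimal sketch under stated assumptions rather than a definitive implementation: it assumes the package is imported as aspose.words under the alias aw, and the helper names default_txt_path and docx_to_txt, as well as the file name in the closing comment, are hypothetical.

```python
from pathlib import Path


def default_txt_path(source):
    """Return the source path with its extension replaced by .txt."""
    return Path(source).with_suffix(".txt")


def docx_to_txt(source, target=None):
    """Load a DOCX file and save its text content as a TXT file."""
    # Imported inside the function so the path helper above stays usable
    # even where aspose-words is not installed.
    import aspose.words as aw

    target = Path(target) if target else default_txt_path(source)

    # Step 1: load the DOCX file from disk.
    doc = aw.Document(str(source))

    # Step 2: save it; the .txt extension selects the plain-text format.
    doc.save(str(target))
    return target


# Example call (hypothetical file name):
# docx_to_txt("invoice.docx")
```

When no target is given, the TXT file is written next to the source document with the same base name.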

Save Word DOC as TXT in Python

The following are the steps to save a DOC or DOCX file as TXT in Python.

  • Load the DOC file using the Document class.
  • Save the document as TXT using the Document.save(filePath) method, passing the output file’s path as the parameter.

The following code sample shows how to convert a DOC to TXT in Python.
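
A minimal sketch of these steps for a DOC file, extended to a list of paths for the case where several documents need converting. It assumes the aw alias for aspose.words and passes aw.SaveFormat.TEXT to state the output format explicitly; the function names and the file names in the closing comment are hypothetical.

```python
from pathlib import Path

WORD_SUFFIXES = {".doc", ".docx"}


def is_word_file(path):
    """Return True when the path has a .doc or .docx extension (any case)."""
    return Path(path).suffix.lower() in WORD_SUFFIXES


def word_to_txt(source):
    """Convert one DOC or DOCX file to a TXT file next to it."""
    # Imported lazily; requires the aspose-words package.
    import aspose.words as aw

    source = Path(source)
    if not is_word_file(source):
        raise ValueError(f"not a DOC/DOCX file: {source}")

    target = source.with_suffix(".txt")
    # Load the file with the Document class.
    doc = aw.Document(str(source))
    # Save it as plain text, naming the format explicitly.
    doc.save(str(target), aw.SaveFormat.TEXT)
    return target


def convert_many(paths):
    """Convert every Word file in paths and return the TXT paths created."""
    return [word_to_txt(p) for p in paths if is_word_file(p)]


# Example calls (hypothetical file names):
# word_to_txt("legacy-report.doc")
# convert_many(["legacy-report.doc", "invoice.docx", "notes.pdf"])
```

DOC and DOCX files go through the same Document class, so the loop does not need to treat them differently; paths with other extensions are skipped.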

Free Python DOC to TXT Converter

You can use a free temporary license to convert DOC files to TXT format without evaluation limitations.
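
A license is typically applied once, before any documents are loaded. The sketch below assumes the library's License class and its set_license method, with a hypothetical helper name and license file name; the library's licensing documentation is the place to confirm the exact setup.

```python
from pathlib import Path


def apply_license(license_path):
    """Apply an Aspose.Words license file; return False if it is missing."""
    license_file = Path(license_path)
    if not license_file.is_file():
        # Nothing to apply; the library then keeps its evaluation limitations.
        return False

    # Imported lazily; requires the aspose-words package.
    import aspose.words as aw

    lic = aw.License()
    lic.set_license(str(license_file))
    return True


# Example call (hypothetical license file name):
# apply_license("Aspose.Words.lic")
```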

Explore Word to TXT Converter

You can visit the documentation of the Python Word library to explore its other features. If you have any questions, feel free to let us know via our forum.

Conclusion

In this article, you have learned how to convert DOC or DOCX files to TXT format in Python. With the help of a code sample, you have seen how to load and save DOCX files as TXT to the desired location in Python.

See Also