This article shows a simple way to extract plain text from Word DOCX or DOC files in your Python applications. After reading it, you will know how to convert a DOCX or DOC file to TXT in Python.

Convert DOC DOCX to TXT in Python

MS Word is a popular word processing application for creating rich text documents. A wide range of documents is created in MS Word, including invoices, technical documents, and reports. DOC and DOCX are the file formats that MS Word uses to store these documents.

As a programmer, you may need to process a batch of Word DOC/DOCX files and extract their plain text from within your Python applications. So let’s see how to perform DOC or DOCX to TXT conversion in Python.

Python DOCX to TXT Converter - Free Download

Aspose.Words for Python is a library with a wide range of features for manipulating popular document formats, including DOC and DOCX. It simplifies processing Word documents and retrieving their text, so we will use it to convert DOC/DOCX files to TXT format.

You can use the following pip command to install Aspose.Words for Python in your application.

pip install aspose-words

How to Convert DOCX to TXT in Python

Aspose.Words for Python reduces DOCX to TXT conversion to a couple of steps:

  • Load the DOCX file from disk.
  • Save it in TXT format to the desired location.

You do not need to parse the whole Word document page by page or line by line to extract the text from it. Let’s now have a look at how to perform these steps in Python to convert a DOCX file to TXT format.

Save DOC as TXT in Python

The following are the steps to save a DOC or DOCX file as TXT in Python.

  • Load the DOC file using the Document class.
  • Save the document as TXT using the Document.save(filePath) method, passing the output file’s path as the parameter.

The following code sample shows how to convert a DOC to TXT in Python.
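A minimal sketch of these two steps, assuming the input file exists on disk. The helper names `txt_path_for` and `convert_to_txt` are hypothetical and not part of the library; only `Document` and its `save` method come from Aspose.Words.

```python
from pathlib import Path


def txt_path_for(source: str) -> Path:
    """Return the source path with its extension swapped to .txt (hypothetical helper)."""
    return Path(source).with_suffix(".txt")


def convert_to_txt(source: str) -> Path:
    """Load a DOC or DOCX file and save it as plain text next to the source."""
    # Imported inside the function so the path helper works even when the
    # library is not installed.
    import aspose.words as aw

    target = txt_path_for(source)
    doc = aw.Document(source)   # step 1: load the Word document
    doc.save(str(target))       # step 2: save it; the .txt extension selects the format
    return target
```

Calling `convert_to_txt("document.docx")` (a hypothetical file name) would write `document.txt` alongside the source. The same call should work for a `.doc` input, since the output format is taken from the target extension rather than from the source.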

Python DOC to TXT Converter - Get a Free License

You can use a free temporary license to convert DOC files to TXT format without evaluation limitations.
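Once you have a license file, it is typically applied before any conversion runs. The sketch below assumes the `License` class and its `set_license` method from `aspose.words`; the `apply_license` helper and the license file name are hypothetical.

```python
from pathlib import Path


def apply_license(license_file: str) -> bool:
    """Apply an Aspose.Words license file if it exists; return True when applied."""
    if not Path(license_file).is_file():
        # No license file found: the library keeps running in evaluation mode.
        return False

    # Imported lazily so the file check above works without the library installed.
    import aspose.words as aw

    lic = aw.License()
    lic.set_license(license_file)
    return True
```

For example, `apply_license("Aspose.Words.lic")` could be called once at application startup, before the first document is loaded.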

Conclusion

In this article, you have learned how to convert DOC or DOCX files to TXT format in Python. With the help of a code sample, you have seen how to load DOCX files and save them as TXT to the desired location. You can also visit the documentation of Aspose.Words for Python to explore more of the library. If you have any questions, feel free to let us know via our forum.
