Extract Text and Images from OneNote in C#

OneNote, Microsoft's note-taking application, lets us collect, organize, and share notes, images, and other content. Extracting that content for further processing or for use in other applications can be difficult without the right tools, particularly when it has to be done programmatically without using MS OneNote. In this article, we will learn how to extract text and images from OneNote documents in C#.

This guide covers the following topics:

  1. C# API to extract text and images from OneNote
  2. Extract all the text from a OneNote document
  3. Extract text from specific pages of a OneNote document
  4. Extract images from a OneNote document
  5. Get a free license

By the end of this tutorial, you will know how to extract text and images from OneNote documents using C# and reuse that content in your own applications.

C# API to Extract Text and Images from OneNote

To extract text and images from OneNote documents, we will use the Aspose.Note for .NET API. It is a feature-rich OneNote document manipulation API that lets you create, read, and convert OneNote documents programmatically. Either download the DLL of the API or install it from NuGet using the following Package Manager Console command.

PM> Install-Package Aspose.Note

Step-by-Step Guide to Extract Text from OneNote Documents in C#

We can easily extract all the text from the OneNote document by following the steps given below:

  1. First, load a OneNote file using the Document class.
  2. Next, call the GetChildNodes method with RichText as the type argument to collect all the text nodes.
  3. Finally, read the Text property of each node and show the extracted text.

The following code sample shows how to extract all the text from a OneNote file using C#.
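
A minimal sketch of these steps might look like the following. It assumes the Aspose.Note package is installed; the input file name `Sample.one` and the class name are hypothetical and only serve as placeholders.

```csharp
using System;
using System.Linq;
using Aspose.Note;

class ExtractAllText
{
    static void Main()
    {
        // Hypothetical input path; replace it with your own .one file.
        Document document = new Document("Sample.one");

        // Collect every RichText node found anywhere in the document.
        var textNodes = document.GetChildNodes<RichText>();

        // Join the text of all nodes, one node per line.
        string text = string.Join(Environment.NewLine, textNodes.Select(node => node.Text));

        Console.WriteLine(text);
    }
}
```

Joining the nodes with a line break keeps each text block on its own line; a different separator may suit your output better.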

Extract All the Text from OneNote Documents.

Extract Text from Specific Pages of OneNote in C#

We can extract text from specific pages of the OneNote document by following the steps given below:

  1. First, load a OneNote file using the Document class.
  2. Next, call the GetChildNodes method with Page as the type argument to get the list of pages.
  3. After that, select the required page and get its text items by calling the page's GetChildNodes method with RichText as the type argument.
  4. Finally, show the extracted text.

The following code sample shows how to extract text from a specific page of a OneNote file using C#.
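
As a rough sketch of these steps, the example below reads the text of a single page selected by its position in the document. The file name `Sample.one`, the class name, and the `pageIndex` variable are hypothetical choices made for illustration.

```csharp
using System;
using System.Linq;
using Aspose.Note;

class ExtractPageText
{
    static void Main()
    {
        // Hypothetical input path; replace it with your own .one file.
        Document document = new Document("Sample.one");

        // Get the list of pages in the document.
        var pages = document.GetChildNodes<Page>();

        // Hypothetical selection: zero-based index of the page to read.
        int pageIndex = 0;
        if (pageIndex >= pages.Count)
        {
            Console.WriteLine("The document has only {0} page(s).", pages.Count);
            return;
        }

        // Collect the RichText nodes that belong to the selected page only.
        Page page = pages[pageIndex];
        var textNodes = page.GetChildNodes<RichText>();

        string text = string.Join(Environment.NewLine, textNodes.Select(node => node.Text));
        Console.WriteLine(text);
    }
}
```

To process several pages, the same call to `page.GetChildNodes<RichText>()` can be placed inside a loop over `pages`.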

Step-by-Step Guide to Extract Images from OneNote Documents using C#

We can also extract images from the OneNote document by following the steps given below:

  1. First, load a OneNote file using the Document class.
  2. Next, get a list of images using the GetChildNodes method with Image as the type argument.
  3. Finally, show the properties of each image and save it to the local disk.

The following code sample shows how to extract images from a OneNote file using C#.
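
One possible sketch of these steps is shown below. It writes the raw image bytes to disk instead of re-encoding them, which avoids a dependency on a drawing library. The file name `Sample.one`, the output folder `ExtractedImages`, and the naming scheme for saved files are assumptions made for this example.

```csharp
using System;
using System.IO;
using Aspose.Note;

class ExtractImages
{
    static void Main()
    {
        // Hypothetical input path; replace it with your own .one file.
        Document document = new Document("Sample.one");

        // Fully qualified type name avoids a clash with System.Drawing.Image.
        var images = document.GetChildNodes<Aspose.Note.Image>();

        // Hypothetical output folder for the extracted files.
        string outputDir = "ExtractedImages";
        Directory.CreateDirectory(outputDir);

        int index = 0;
        foreach (Aspose.Note.Image image in images)
        {
            // Show a few properties of the image.
            Console.WriteLine("File name: {0}", image.FileName);
            Console.WriteLine("Width: {0}, Height: {1}", image.Width, image.Height);

            // Use a placeholder name when the image has no stored file name,
            // and prefix the index so that duplicate names do not overwrite each other.
            string name = string.IsNullOrEmpty(image.FileName)
                ? "image.bin"
                : Path.GetFileName(image.FileName);
            string outputPath = Path.Combine(outputDir, $"{index}_{name}");

            // Save the original image bytes as they are stored in the document.
            File.WriteAllBytes(outputPath, image.Bytes);
            index++;
        }
    }
}
```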

Extract Images from OneNote Documents.

Get a Free License

You can get a free temporary license to try the library without evaluation limitations.
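
Once you have a license file, it is typically applied once at application startup, before any documents are loaded. The sketch below assumes the file is named `Aspose.Note.lic` and sits next to the executable; both the file name and its location are hypothetical.

```csharp
using System;
using Aspose.Note;

class ApplyLicense
{
    static void Main()
    {
        // Hypothetical license file path; point this to your own license file.
        License license = new License();
        license.SetLicense("Aspose.Note.lic");

        Console.WriteLine("License applied.");
    }
}
```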

Conclusion

In this article, we have learned how to extract text from a whole OneNote document or from specific pages of it. We have also seen how to extract images from OneNote documents programmatically. To learn more about the Aspose.Note for .NET API, see the documentation. If you have any questions, please feel free to contact us on the forum.
