Data Extraction From Images in Java

Overview

Optical mark recognition (OMR) is an electronic process for reading and capturing data that people mark on specially designed forms, such as tests or surveys, by filling in bubbles or squares. With OMR in Java, we can process scanned images of survey forms, questionnaires, or test sheets and read the user inputs programmatically. This article shows how to perform OMR and extract data from images using Java.

This article covers the following topics:

  1. Java OMR API to Extract Data from Image
  2. Data Extraction From Images in Java
  3. Perform OMR and Extract Data from Multiple Images
  4. Extract OMR Data with Threshold
  5. Extract OMR Data with Recalculation

Java OMR API to Extract Data from Image

To perform OMR operations and extract data from images in Java, we will use the Aspose.OMR for Java API. It supports designing, creating, and recognizing answer sheets, tests, MCQ papers, quizzes, feedback forms, surveys, and ballots.

The OmrEngine class within the API is responsible for creating templates and processing images. Its getTemplateProcessor(String templatePath) method creates a TemplateProcessor instance for working with templates and images. To recognize an image, call the TemplateProcessor's recognizeImage(String imagePath) method, which returns all OMR elements as a RecognitionResult instance. The getCsv() method of RecognitionResult generates a CSV string containing the recognition results. Additionally, the TemplateProcessor's recalculate(RecognitionResult result, int recognitionThreshold) method updates the recognition results using a different threshold.

Please either download the JAR of the API or add the following pom.xml configuration in a Maven-based Java application.

<repository>
    <id>AsposeJavaAPI</id>
    <name>Aspose Java API</name>
    <url>http://repository.aspose.com/repo/</url>
</repository>
<dependency>
    <groupId>com.aspose</groupId>
    <artifactId>aspose-omr</artifactId>
    <version>19.12</version>
</dependency>

Data Extraction From Images in Java

To carry out an OMR operation, we need the prepared OMR template file (.omr) and an image of the user-filled form or sheet. We can perform OMR on an image and extract the data by following these steps:

  1. Firstly, create an instance of the OmrEngine class.
  2. Next, call the getTemplateProcessor() method and initialize a TemplateProcessor class object, passing the OMR template file path as an argument.
  3. Then, get the RecognitionResult object by calling the recognizeImage() method with the image path as an argument.
  4. After that, obtain the recognition results as a CSV string using the getCsv() method.
  5. Finally, save the CSV result as a CSV file on the local disk.

The following code sample demonstrates how to perform data extraction from images in Java by converting OMR data into CSV format.

import com.aspose.omr.OmrEngine;
import com.aspose.omr.RecognitionResult;
import com.aspose.omr.TemplateProcessor;
import java.io.FileOutputStream;
import java.io.PrintWriter;

// This code example demonstrates how to perform OMR on an image and extract data.
// The statements below belong inside a method that declares "throws IOException".
// OMR Template file path
String templatePath = "C:\\Files\\OMR\\Sheet.omr";
// Image file path
String imagePath = "C:\\Files\\OMR\\Sheet1.png";
// Initialize OMR Engine
OmrEngine engine = new OmrEngine();
// Get template processor
TemplateProcessor templateProcessor = engine.getTemplateProcessor(templatePath);
// Recognize image
RecognitionResult result = templateProcessor.recognizeImage(imagePath);
// Get results in CSV
String csvResult = result.getCsv();
// Save CSV file; try-with-resources closes the writer once the data is written
try (PrintWriter wr = new PrintWriter(new FileOutputStream("C:\\Files\\OMR\\Sheet1.csv"), true)) {
    wr.println(csvResult);
}

Figure: Perform OMR and extract data from an image in Java.

Please download the OMR template used in this blog post.
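
Once the CSV string is available, it is often handy to look up answers by element name. The following is a minimal sketch that assumes each CSV line has the form name,value with an optional header row. That layout is an assumption and is not confirmed by this article, so check the actual getCsv() output before relying on it. OmrCsvParser and its parse() method are hypothetical helper names and are not part of Aspose.OMR.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; NOT part of the Aspose.OMR API.
final class OmrCsvParser {

    // Parses "name,value" lines into a map that keeps the original order.
    // ASSUMPTION: the first comma separates the element name from its value,
    // and a value may be wrapped in double quotes.
    static Map<String, String> parse(String csv, boolean skipHeader) {
        Map<String, String> answers = new LinkedHashMap<>();
        String[] lines = csv.split("\\R");
        for (int i = 0; i < lines.length; i++) {
            String line = lines[i].trim();
            if (line.isEmpty() || (skipHeader && i == 0)) {
                continue;
            }
            int comma = line.indexOf(',');
            if (comma < 0) {
                continue; // not a name,value pair
            }
            String name = line.substring(0, comma).trim();
            String value = line.substring(comma + 1).trim();
            // Drop surrounding double quotes, if present
            if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                value = value.substring(1, value.length() - 1);
            }
            answers.put(name, value);
        }
        return answers;
    }

    public static void main(String[] args) {
        // Sample text made up for illustration; real output may differ
        String sample = "Element Name,Value\nQuestion1,\"A\"\nQuestion2,\"C\"\n";
        System.out.println(parse(sample, true));
    }
}
```

In the example above, the csvResult string returned by getCsv() could be passed to OmrCsvParser.parse() in place of the made-up sample text.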

Perform OMR and Extract Data from Multiple Images

We can perform OMR on multiple images and extract the data into a separate CSV file for each one by following the steps outlined earlier. Steps 3, 4, and 5 must be repeated for each image.

Below is a code sample demonstrating how to extract OMR data from multiple images using Java.

import com.aspose.omr.OmrEngine;
import com.aspose.omr.RecognitionResult;
import com.aspose.omr.TemplateProcessor;
import java.io.FileOutputStream;
import java.io.PrintWriter;

// This code example demonstrates how to perform OMR on multiple images and extract data.
// The statements below belong inside a method that declares "throws IOException".
// Working folder path
String folderPath = "C:\\Files\\OMR\\";
// OMR Template file path
String templatePath = folderPath + "Sheet.omr";
// Image file names
String[] userImages = new String[] { "Sheet1.png", "Sheet2.png" };
// Initialize OMR Engine
OmrEngine engine = new OmrEngine();
// Get template processor
TemplateProcessor templateProcessor = engine.getTemplateProcessor(templatePath);
// Process images one by one in a loop
for (int i = 0; i < userImages.length; i++) {
    String imagePath = folderPath + userImages[i];
    // Recognize image
    RecognitionResult result = templateProcessor.recognizeImage(imagePath);
    // Get results in CSV
    String csvResult = result.getCsv();
    // Save CSV file (Sheet_0.csv, Sheet_1.csv, ...); the writer is closed automatically
    try (PrintWriter wr = new PrintWriter(new FileOutputStream(folderPath + "Sheet_" + i + ".csv"), true)) {
        wr.println(csvResult);
    }
    System.out.println(csvResult);
}
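
The loop above names each output file by its index. If you would rather keep the image name in the CSV file name (for example, Sheet1.png to Sheet1.csv), a small helper built on java.nio.file can derive it. This is only a sketch; CsvNaming and csvPathFor() are hypothetical names introduced for illustration and are not part of Aspose.OMR.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper; NOT part of the Aspose.OMR API.
final class CsvNaming {

    // Returns a path in the same folder as the image, with the
    // image's extension replaced by ".csv".
    static Path csvPathFor(Path imagePath) {
        String fileName = imagePath.getFileName().toString();
        int dot = fileName.lastIndexOf('.');
        // dot > 0 keeps names such as ".hidden" intact
        String base = dot > 0 ? fileName.substring(0, dot) : fileName;
        return imagePath.resolveSibling(base + ".csv");
    }

    public static void main(String[] args) {
        // Folder and file names are made up for illustration
        System.out.println(csvPathFor(Paths.get("omr", "Sheet1.png")));
        System.out.println(csvPathFor(Paths.get("omr", "scan.2020.jpeg")));
    }
}
```

The returned path can be converted with toString() and passed to FileOutputStream in place of the index-based name.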

Extract OMR Data with Threshold in Java

When performing OMR in Java, we can pass a threshold value between 0 and 100, depending on our requirements. The threshold dictates how strict the API is in highlighting answers; a higher value makes recognition stricter. Follow the steps mentioned earlier, but in step 3 call the overloaded recognizeImage(String, int) method, which takes the image file path and the threshold value as parameters.

The following code sample demonstrates how to perform OMR with a threshold value using Java:

import com.aspose.omr.OmrEngine;
import com.aspose.omr.RecognitionResult;
import com.aspose.omr.TemplateProcessor;
import java.io.FileOutputStream;
import java.io.PrintWriter;

// This code example demonstrates how to perform OMR with a threshold and extract data from an image.
// The statements below belong inside a method that declares "throws IOException".
// OMR Template file path
String templatePath = "C:\\Files\\OMR\\Sheet.omr";
// Image file path
String imagePath = "C:\\Files\\OMR\\Sheet1.png";
// Threshold value (0 to 100)
int customThreshold = 40;
// Initialize OMR Engine
OmrEngine engine = new OmrEngine();
// Get template processor
TemplateProcessor templateProcessor = engine.getTemplateProcessor(templatePath);
// Recognize image with the custom threshold
RecognitionResult result = templateProcessor.recognizeImage(imagePath, customThreshold);
// Get results in CSV
String csvResult = result.getCsv();
// Save CSV file; try-with-resources closes the writer once the data is written
try (PrintWriter wr = new PrintWriter(new FileOutputStream("C:\\Files\\OMR\\Sheet1_threshold.csv"), true)) {
    wr.println(csvResult);
}
System.out.println(csvResult);
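
To build intuition for what a stricter threshold means, here is a toy model. It assumes each bubble has a fill percentage and counts as marked when that percentage is at least the threshold. This is purely a conceptual illustration and not Aspose.OMR's actual recognition algorithm; ThresholdDemo, markedOptions(), and the fill values are all made up.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Conceptual toy model; NOT how Aspose.OMR is implemented.
final class ThresholdDemo {

    // Returns the options whose fill percentage reaches the threshold.
    // ASSUMPTION: "marked" means fillPercent >= threshold.
    static List<String> markedOptions(Map<String, Integer> fillPercent, int threshold) {
        if (threshold < 0 || threshold > 100) {
            throw new IllegalArgumentException("threshold must be between 0 and 100");
        }
        List<String> marked = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : fillPercent.entrySet()) {
            if (entry.getValue() >= threshold) {
                marked.add(entry.getKey());
            }
        }
        return marked;
    }

    public static void main(String[] args) {
        // Hypothetical fill levels for one question: A is fully filled,
        // B is a faint or partly erased mark, C is a stray dot.
        Map<String, Integer> fills = new LinkedHashMap<>();
        fills.put("A", 85);
        fills.put("B", 35);
        fills.put("C", 5);

        System.out.println(markedOptions(fills, 30)); // lenient: A and B count
        System.out.println(markedOptions(fills, 40)); // stricter: only A counts
    }
}
```

In this model, raising the threshold from 30 to 40 drops the faint mark on B, which is the kind of effect to expect when tuning the threshold for forms with erasures or light pencil marks.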

Extract OMR Data with Recalculation in Java

When extracting OMR data, we may need to recalculate the results using different threshold values. The API supports this through the TemplateProcessor.recalculate() method, which allows processing an image multiple times while adjusting the threshold until the desired outcome is achieved. To perform OMR with recalculation, follow the steps below:

  1. Firstly, create an instance of the OmrEngine class.
  2. Next, call the getTemplateProcessor() method to initialize a TemplateProcessor object, passing the OMR template file path as an argument.
  3. Then, get the RecognitionResult object by calling the recognizeImage() method with the image path as an argument.
  4. Next, export recognition results as a CSV string using the getCsv() method.
  5. Then, save the CSV result as a CSV file on the local disk.
  6. Next, call the recalculate() method. It takes the RecognitionResult object and the threshold value as arguments.
  7. After that, export recognition results as a CSV string using the getCsv() method.
  8. Finally, save the CSV result as a CSV file on the local disk.

The following code sample demonstrates how to perform OMR with the recalculation method using Java:

import com.aspose.omr.OmrEngine;
import com.aspose.omr.RecognitionResult;
import com.aspose.omr.TemplateProcessor;
import java.io.FileOutputStream;
import java.io.PrintWriter;

// This code example demonstrates how to perform OMR and then recalculate the results.
// The statements below belong inside a method that declares "throws IOException".
// OMR Template file path
String templatePath = "C:\\Files\\OMR\\Sheet.omr";
// Image file path
String imagePath = "C:\\Files\\OMR\\Sheet1.png";
// Threshold value (0 to 100)
int customThreshold = 40;
// Initialize OMR Engine
OmrEngine engine = new OmrEngine();
// Get template processor
TemplateProcessor templateProcessor = engine.getTemplateProcessor(templatePath);
// Recognize image
RecognitionResult result = templateProcessor.recognizeImage(imagePath, customThreshold);
// Get results in CSV
String csvResult = result.getCsv();
// Save CSV file; try-with-resources closes the writer once the data is written
try (PrintWriter wr = new PrintWriter(new FileOutputStream("C:\\Files\\OMR\\Sheet1.csv"), true)) {
    wr.println(csvResult);
}
// Recalculate
// You may apply a new threshold value here
templateProcessor.recalculate(result, customThreshold);
// Get recalculated results in CSV
csvResult = result.getCsv();
// Save recalculated CSV file
try (PrintWriter finalWr = new PrintWriter(new FileOutputStream("C:\\Files\\OMR\\Sheet1_recalculated.csv"), true)) {
    finalWr.println(csvResult);
}
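
The idea of adjusting the threshold until the desired outcome is achieved can be sketched as a simple search loop. The helper below is a hypothetical illustration and is not part of Aspose.OMR. It tries thresholds in ascending order and stops at the first one that a caller-supplied check accepts. In a real application, the check would call templateProcessor.recalculate(result, threshold) and inspect result.getCsv(); here a stand-in check is used so the sketch runs on its own.

```java
import java.util.OptionalInt;
import java.util.function.IntPredicate;

// Hypothetical helper; NOT part of the Aspose.OMR API.
final class ThresholdSearch {

    // Tries start, start + step, ... up to end (inclusive) and returns the
    // first threshold for which the check passes, or empty if none does.
    static OptionalInt firstAccepted(int start, int end, int step, IntPredicate accepted) {
        if (step <= 0) {
            throw new IllegalArgumentException("step must be positive");
        }
        for (int threshold = start; threshold <= end; threshold += step) {
            if (accepted.test(threshold)) {
                return OptionalInt.of(threshold);
            }
        }
        return OptionalInt.empty();
    }

    public static void main(String[] args) {
        // Stand-in check for illustration only. With Aspose.OMR it might be:
        //   t -> { templateProcessor.recalculate(result, t);
        //          return looksComplete(result.getCsv()); }
        // where looksComplete() is your own (hypothetical) validation.
        IntPredicate standInCheck = t -> t >= 45;

        OptionalInt found = firstAccepted(30, 60, 5, standInCheck);
        System.out.println(found.isPresent()
                ? "Accepted threshold: " + found.getAsInt()
                : "No threshold in range was accepted");
    }
}
```

What counts as an acceptable result depends on the form, for example exactly one marked answer per question, so the check is left to the caller.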

Get a Free License

You can get a free temporary license to try the library without evaluation limitations. This is a good way to explore its features and fully evaluate its capabilities. During evaluation, you can verify the following for yourself:

  1. The library works efficiently with large volumes of data.
  2. Integration with existing systems is seamless.
  3. Data extracted is highly accurate and reliable.
  4. Installation steps are straightforward and well-documented.

Conclusion

In this article, we have learned how to:

  • perform OMR operation on images;
  • extract data in CSV format programmatically;
  • apply threshold setting while performing OMR on images;
  • recalculate OMR results in an automated process using Java.

You can learn more about the Aspose.OMR for Java API in the documentation. If you encounter any issues, please feel free to contact us on our free support forum.
