Background

However far technology has progressed, it has not displaced conventional information-sharing media like blackboards and notebooks, which have always been part of exchanging ideas, teaching and discussion. Perhaps, then, our technological efforts should converge on supplementing these tools instead of replacing them.

One way to achieve this is to automatically convert paper notes and blackboard content into PDFs. This can be done using Handwriting Recognition (HWR), a branch of Optical Character Recognition (OCR) and Machine Learning (ML), in which written characters are scanned and compared against a predefined set of characters; once a matching character is found, it is added to the PDF. Further, the technology can also detect and crop diagrams, formulae, mind-maps and other graphical content from the blackboard through Optical Shape Detection, so that the PDF carries the same information as its offline form.

The entire project is developed using National Instruments' LabVIEW software. Through the project's companion app, images captured on any standard smartphone can be automatically converted into PDF documents that can be easily read and shared among a large group of people. It aims to revolutionize the archaic way ideas are shared today while remaining economical and practical.

Steps to create a blackboard note PDF

  1. Capture an image of the blackboard using a scanner application on a smartphone.
  2. Add the scanned image to the LabVIEW processing queue directory (a minimal sketch of this queue follows the list).
  3. Wait for the Conversion Complete alert.
  4. View the exported file on any PDF viewer.
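
The queue itself is just a synchronized folder that the application polls for new images. The Python sketch below illustrates the idea; the folder path, polling interval and convert() hook are hypothetical stand-ins, not part of the actual LabVIEW implementation.

```python
import time
from pathlib import Path

# Hypothetical stand-in for the LabVIEW processing queue: poll the
# Dropbox-synchronized folder and hand each new image to the pipeline.
QUEUE_DIR = Path("~/Dropbox/blackboard-queue").expanduser()

def convert(image: Path) -> None:
    # Stand-in for the real pipeline: HWR + shape detection, then PDF export.
    print(f"Converting {image.name} ...")

seen = set()
while True:
    for image in sorted(QUEUE_DIR.glob("*.jpg")):
        if image not in seen:
            seen.add(image)
            convert(image)
    time.sleep(5)  # check for newly synced scans every few seconds
```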
The LabVIEW front panel of the application

Dependencies

  1. A photo scanner mobile application; for capturing images of the blackboard or notebook. We used the scanner function of the Dropbox Android app.
  2. A mobile-desktop file synchronization application; for adding new images to the conversion queue. We used Dropbox for this purpose.
  3. NI Vision Development Module; for Computer Vision operations on LabVIEW.
  4. Exaprom PDF 2.0 LabVIEW module; for exporting parsed text and images into PDF.

Handwriting Recognition (HWR)

HWR enables computers to parse comprehensible handwritten input from paper documents, photographs, touch screens and other devices. However, to recognize arbitrary handwritten text, the detection system must first be 'trained'. This is done by feeding the system a dataset of handwritten text images and their corresponding characters; the dataset is written by different individuals so as to cover as many varieties of handwriting as possible. For this purpose, we sought the help of our classmates and faculty: they were asked to write multiple instances of each letter, digit and special character (like the period and the comma) on a training sheet we had formulated. This sheet was then scanned and fed in for training.

The IMAQ and Vision Assistant components of the NI Vision Development Module provide an easy interface for training and detecting handwritten text, so we used them in the application.
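
To illustrate the principle (the NI Vision tools themselves are configured graphically, so this is not their API), here is a minimal Python sketch of the compare-with-trained-characters idea, assuming pre-segmented, equally sized grayscale glyph arrays; all names are ours.

```python
import numpy as np

def normalize(glyph: np.ndarray) -> np.ndarray:
    """Binarize a grayscale glyph and flatten it into a feature vector."""
    return (glyph > glyph.mean()).astype(np.float32).ravel()

class TemplateOCR:
    """Toy nearest-neighbour character matcher."""

    def __init__(self):
        self.templates = []  # (feature vector, label) pairs

    def train(self, glyph: np.ndarray, label: str) -> None:
        # One sample = one handwritten instance of a character,
        # e.g. one cell from the training sheet.
        self.templates.append((normalize(glyph), label))

    def classify(self, glyph: np.ndarray) -> str:
        # Return the label of the closest trained template (L2 distance).
        features = normalize(glyph)
        distances = [np.linalg.norm(features - t) for t, _ in self.templates]
        return self.templates[int(np.argmin(distances))][1]
```

Because every writer contributes several samples per character, an unseen glyph only needs to resemble one of them for the match to succeed.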

The In-house Shape Detection Algorithm

Textual content is rarely the only thing written on a blackboard; it is usually accompanied by graphical information like diagrams, plots and illustrations. The exported PDF notes must contain this content along with the text. We achieved this by designing a shape detection algorithm from scratch.

For successful graphic detection by LabVIEW, the user needs to demarcate all the graphical content on the blackboard by drawing a rectangular border around each item. The shape detection algorithm first creates a binary version of the scanned image through thresholding. A fill-hole operation then removes the subtler details from the binary image. Next, the coordinates of all the borders in the image are located. Finally, the coordinates of each border are used to crop the graphical content it encloses from the original image.
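
For reference, the same threshold, fill-hole, locate-borders, crop pipeline can be expressed in a few lines of Python with OpenCV. This is an illustrative re-expression of the in-house LabVIEW algorithm, not its actual implementation; the minimum-area cut-off is an arbitrary guess.

```python
import cv2

def crop_marked_graphics(image_path: str) -> list:
    original = cv2.imread(image_path)
    gray = cv2.cvtColor(original, cv2.COLOR_BGR2GRAY)

    # 1. Threshold to a binary image (strokes become foreground).
    _, binary = cv2.threshold(gray, 0, 255,
                              cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

    # 2. Fill each outer contour so a rectangular border becomes one
    #    solid blob, removing the subtler details inside it.
    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    cv2.drawContours(binary, contours, -1, 255, thickness=cv2.FILLED)

    # 3. Locate each border's bounding box and crop it from the original.
    crops = []
    for contour in contours:
        x, y, w, h = cv2.boundingRect(contour)
        if w * h > 1000:  # skip specks; the cut-off is arbitrary
            crops.append(original[y:y + h, x:x + w])
    return crops
```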

The LabVIEW block diagram of the application

Current State of the Project

The current output of the project is nearly 80% of what we had expected. Each graphical element in a scanned image is cropped and placed correctly in the PDF, and even the handwritten text is detected and parsed with high accuracy.

All-correct example: text and objects are fed in; the text is converted into string format, and the rectangular boxes are cropped and exported as images.

Not-all-correct example: some of the text is decoded incorrectly, but the graphical content is exported as desired.

Some Points to Consider

Currently, the application takes only a few seconds to parse one image, but it is not 100% accurate. Any inaccuracy in detecting handwritten characters and elements may be due to:

Poorly lit image

If the image illumination is not appropriate, character recognition becomes difficult. For example, if a user writes the letter t and its crossbar is not detected due to poor illumination, the software may incorrectly parse the character as l or 1.

Blurry photos

If the captured photo is blurry, the software may not parse it correctly. Additionally, the software uses edge detection to locate the graphic boundaries; if these are not detected accurately, some of the graphical content may be missed.

Variations in handwriting

There is no such thing as unique handwriting; it varies from person to person and can differ even for the same person on separate occasions. Additionally, some writers prefer intricate styles like cursive. These variations make individual characters harder to read.

Insufficiencies of LabVIEW OCR

The OCR module in LabVIEW is intended for detecting industrial barcodes, so it has no facility for parsing spaces in handwritten text. And since LabVIEW is closed-source, there is no way to modify the parsing logic unless NI explicitly supports it.
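
One conceivable post-processing workaround (our own idea, not an NI feature) would be to infer the spaces from the horizontal gaps between the parsed characters' bounding boxes, which the vision step already provides. A rough Python sketch, with made-up geometry, follows.

```python
def insert_spaces(chars, boxes, gap_factor=1.5):
    """chars: parsed characters; boxes: matching (x, width) tuples.

    Inserts a space wherever the gap between neighbouring characters
    is clearly wider than a typical inter-letter gap.
    """
    if not chars:
        return ""
    typical_gap = 0.3 * sum(w for _, w in boxes) / len(boxes)  # heuristic guess
    out = [chars[0]]
    for i in range(1, len(chars)):
        gap = boxes[i][0] - (boxes[i - 1][0] + boxes[i - 1][1])
        if gap > gap_factor * typical_gap:
            out.append(" ")
        out.append(chars[i])
    return "".join(out)

# e.g. the six letters of "theend" with a wide gap after the third:
# insert_spaces(list("theend"),
#               [(0, 10), (12, 10), (24, 10), (60, 10), (72, 10), (84, 10)])
# -> "the end"
```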

Scope for Improvement

Using a predictive spell check

Whenever a word is read, it will pass through a spell checker. First, the checker verifies whether the word exists; if not, it returns the dictionary word closest to it. Additionally, it maps the misread handwritten characters to the correct characters in the knowledge base. For example, suppose the user has written the word 'apple' but, due to some error, it is read as 'abple'. The word is passed to the spell checker and corrected to 'apple'; the handwritten character incorrectly parsed as b would then be stored to be read as p in the future.
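
A minimal sketch of this stage, assuming a stand-in word list and Python's built-in difflib for the closest-match step (a real checker could use any edit-distance method):

```python
from difflib import get_close_matches

DICTIONARY = {"apple", "paper", "people"}  # stand-in for a full word list

def correct(word: str, substitutions: dict) -> str:
    """Correct a parsed word and record single-character confusions."""
    if word in DICTIONARY:
        return word
    matches = get_close_matches(word, DICTIONARY, n=1)
    if not matches:
        return word  # no confident correction; keep as parsed
    fixed = matches[0]
    if len(fixed) == len(word):
        # Remember e.g. that a glyph parsed as 'b' was really 'p'.
        for parsed_ch, true_ch in zip(word, fixed):
            if parsed_ch != true_ch:
                substitutions[parsed_ch] = true_ch
    return fixed

subs = {}
print(correct("abple", subs))  # -> apple
print(subs)                    # -> {'b': 'p'}
```

The recorded substitutions could then bias future character matches, as described above.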

Using an alternate technology stack

The entire application can be reimplemented in Python for further development. Python has become the go-to language for ML, with a wide variety of high-quality libraries like TensorFlow and PyTorch. It also has powerful libraries for implementing the GUIs required by our software's desktop and mobile clients. Additionally, the volume of online resources and community support for Python is significantly higher than for LabVIEW. Being open-source, the Python ecosystem would also give us the freedom to modify dependencies to optimize the application as required.

Additional Resources

  1. Git Repository; Licensed under the GNU General Public License v3.0
  2. Semester Project Report