
Journey of our Project

Inspiration 

Terminal Apps for Visually Impaired People takes its inspiration from several other projects here at IIT Delhi that support visually impaired people. The idea comes from one of the most basic human habits: reading books. Everybody wants to read, but visually impaired readers have to handle bulky books printed in Braille. We set out to fix this problem by giving them an efficient way to read digital text formats on a Refreshable Braille Display.

Our Approach

Our first task was to unpack the various documents into a readable structure. Each format has its own archive type or markup, so to make the content easy to manipulate we converted all of them into one common format, XML.
We studied the file structure of each format before converting it, so that every document could be handled in the same way afterwards.
We chose XML for the common markup because it is easy to access and modify later on.
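As a rough illustration of the idea, the sketch below builds a small common XML tree from a list of already-extracted blocks. The element name `block`, the `type` attribute and the helper `to_common_xml` are assumptions made for this example, not the schema our application actually uses.

```python
import xml.etree.ElementTree as ET

def to_common_xml(blocks):
    """Build a common XML tree from (block_type, text) pairs.

    The <document>/<block type="..."> layout is hypothetical and only
    illustrates the idea of one shared markup for every input format.
    """
    root = ET.Element("document")
    for block_type, text in blocks:
        node = ET.SubElement(root, "block", {"type": block_type})
        node.text = text
    return root

# Blocks as they might come out of a PDF, ePub or DAISY reader.
blocks = [("heading", "Chapter 1"), ("paragraph", "It was a bright day.")]
root = to_common_xml(blocks)
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

Once every format ends up in a tree like this, the later steps only need to understand one structure instead of three.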
At this point we had a structured, common document format, which we then had to manipulate using a consistent set of rules.
Next, we used Python to control the text flow on the terminal (the bash interface of a Linux/Unix operating system), which was a prerequisite for the rest of the project.
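One simple way to picture this step is to break the text into lines that fit a fixed display width and show them one window at a time. The sketch below does that with Python's standard `textwrap` module. The width of 10 characters and the helper names `paginate` and `show` are assumptions for illustration only, not our actual terminal code.

```python
import textwrap

def paginate(text, width=40):
    """Split text into lines that fit a display `width` characters wide."""
    return textwrap.wrap(text, width=width)

def show(lines, start, count=1):
    """Return the window of lines currently presented to the reader."""
    return lines[start:start + count]

# A width of 10 is an arbitrary value chosen to keep the demo short.
lines = paginate("The quick brown fox jumps over the lazy dog", width=10)
for line in show(lines, start=0, count=2):
    print(line)
```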
Once we had raw output on the terminal, we had to find every data block and label it with its block type, such as heading, subheading, table or image.
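A minimal sketch of such labelling is shown below. It assumes the markup uses HTML-like tag names (`h1`, `h2`, `p`, `table`, `img`), as ePub content commonly does. The mapping `TAG_TO_BLOCK` and the function `label_blocks` are hypothetical and only illustrate the approach.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from tag names to the block types we label.
TAG_TO_BLOCK = {
    "h1": "heading",
    "h2": "subheading",
    "h3": "subheading",
    "p": "paragraph",
    "table": "table",
    "img": "image",
}

def label_blocks(root):
    """Walk the tree in document order and label each known block."""
    labelled = []
    for elem in root.iter():
        block_type = TAG_TO_BLOCK.get(elem.tag)
        if block_type is None:
            continue  # inline or unknown tags are not blocks of their own
        text = "".join(elem.itertext()).strip()
        labelled.append((block_type, text))
    return labelled

sample = "<body><h1>Intro</h1><p>Hello <b>world</b>.</p><img src='a.png'/></body>"
labels = label_blocks(ET.fromstring(sample))
print(labels)
```

Note that this simple walk would label a paragraph nested inside a table twice (once as part of the table, once on its own), so a real implementation needs extra care with nested blocks.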
By this stage, the output on our terminal resembled the data in the original documents. After converting the documents into the common markup format, we built a class structure that mirrors the text in the same order and format as the original, so that it can be manipulated at later stages.
To gain full control over the text flow, we designed the class structure to return data on specific method calls, for example getting the next paragraph or navigating to the next heading.
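The sketch below shows what such a class could look like in its simplest form. The class name `Document`, the methods `next_paragraph` and `next_heading`, and the `(block_type, text)` tuples are assumptions for this example and not our actual class structure.

```python
class Document:
    """Hypothetical document model with a cursor over labelled blocks."""

    def __init__(self, blocks):
        self.blocks = list(blocks)  # (block_type, text) pairs in reading order
        self.position = -1          # cursor starts before the first block

    def _seek(self, block_type):
        """Move the cursor to the next block of the given type, if any."""
        for i in range(self.position + 1, len(self.blocks)):
            if self.blocks[i][0] == block_type:
                self.position = i
                return self.blocks[i][1]
        return None  # nothing further of that type; cursor stays put

    def next_paragraph(self):
        return self._seek("paragraph")

    def next_heading(self):
        return self._seek("heading")

doc = Document([
    ("heading", "A"),
    ("paragraph", "p1"),
    ("paragraph", "p2"),
    ("heading", "B"),
    ("paragraph", "p3"),
])
print(doc.next_paragraph())  # first paragraph after the start
print(doc.next_heading())    # skips ahead to the next heading
```

Keeping a single cursor means every navigation call is relative to where the reader currently is, which matches how one reads line by line on a Braille display.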
The output of the program contains all the information from the original document and can be accessed using simple method calls on our document object model.

Problems and Challenges

  • Misaligned text boxes and discontinuous object placement in many PDFs made PDF documents difficult to parse.
  • Many properties of the text were lost.
  • Larger files cannot be parsed whole in one go because of memory constraints (50 MB available).
  • Identifying tables and making their data navigable was difficult, as the parser we used had no provision for it.

Our Solution 

To tackle the memory constraint, we decided to process only the current chunk of data. To manage that, we had to go back to the original file and remember our position in it, which we managed to do successfully for ePub. For PDF and DAISY, we are still limited by the file size, which is one of the limitations of our application.
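The general idea of reading one chunk at a time while remembering the position can be sketched as follows. This is a simplified illustration over a plain byte stream. The class `ChunkReader` and the chunk size are assumptions, and a real ePub reader would work on the entries inside the archive rather than on raw bytes.

```python
import io

class ChunkReader:
    """Hypothetical reader that keeps only one chunk in memory at a time."""

    def __init__(self, stream, chunk_size=4096):
        self.stream = stream
        self.chunk_size = chunk_size
        self.offset = 0  # remembered position in the original file

    def next_chunk(self):
        """Return the next chunk and advance the remembered position."""
        self.stream.seek(self.offset)
        data = self.stream.read(self.chunk_size)
        self.offset += len(data)
        return data

# io.BytesIO stands in for a file opened in binary mode.
reader = ChunkReader(io.BytesIO(b"abcdefghij"), chunk_size=4)
first = reader.next_chunk()
second = reader.next_chunk()
print(first, second, reader.offset)
```

Because only `offset` is kept between calls, memory use depends on the chunk size rather than on the size of the whole file.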
Tables are now identified with the help of layout line tags and a simple check that a polygon is a rectangle. The text streams also carry position coordinates, which we match against the table cells to access the data in the table. Navigation uses the standard up/down/left/right keys.
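The three pieces described above (the rectangle check, matching text coordinates to cells, and arrow-key navigation) can be sketched roughly as below. The function names, the axis-aligned assumption and the cell layout are ours for illustration only, and the parser's real data structures will differ.

```python
def is_rectangle(points):
    """True if four (x, y) tuples are the corners of an axis-aligned rectangle."""
    if len(points) != 4:
        return False
    xs = {x for x, _ in points}
    ys = {y for _, y in points}
    if len(xs) != 2 or len(ys) != 2:
        return False
    # Every combination of the two x values and two y values must be a corner.
    return set(points) == {(x, y) for x in xs for y in ys}

def cell_for(text_pos, cells):
    """Return the (row, col) key of the cell whose box contains text_pos."""
    x, y = text_pos
    for key, (x0, y0, x1, y1) in cells.items():
        if x0 <= x <= x1 and y0 <= y <= y1:
            return key
    return None

# Row and column offsets for the standard navigation keys.
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}

def move(cell, key, n_rows, n_cols):
    """Move one cell in the given direction, staying inside the table."""
    d_row, d_col = MOVES[key]
    row = min(max(cell[0] + d_row, 0), n_rows - 1)
    col = min(max(cell[1] + d_col, 0), n_cols - 1)
    return (row, col)

# Two cells side by side, keyed by (row, col), with (x0, y0, x1, y1) boxes.
cells = {(0, 0): (0, 0, 10, 5), (0, 1): (10, 0, 20, 5)}
print(is_rectangle([(0, 0), (10, 0), (10, 5), (0, 5)]))
print(cell_for((15, 2), cells))
print(move((0, 0), "right", n_rows=1, n_cols=2))
```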
