
Proposed Workflow

General Workflow

Our application takes a file as input, converts it to XML, and then processes that XML to present the data on the terminal interface.







EPub

For ePub, we used the ePub Python library to extract the content from the ePub file, after which our application builds a class structure to handle and manipulate the various aspects of the document.

This gives us information about the various content files present in the ePub document. These files carry their meta-structure as XML and their content as HTML. To post-process the HTML, we used the Beautiful Soup library and converted the text into navigable blocks.
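As a rough illustration of turning HTML content into navigable blocks, here is a minimal sketch. It uses the standard-library `html.parser` module as a stand-in for Beautiful Soup so that it runs without extra packages. The names `BlockExtractor`, `html_to_blocks`, and the set of tags treated as blocks are assumptions made for this example, not the project's actual code.

```python
from html.parser import HTMLParser

# Tags treated as block boundaries in this sketch (an assumption).
BLOCK_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6", "p", "li"}


class BlockExtractor(HTMLParser):
    """Collect (tag, text) pairs, one per block-level element."""

    def __init__(self):
        super().__init__()
        self.blocks = []   # finished (tag, text) pairs
        self._tag = None   # block tag currently open, if any
        self._parts = []   # text fragments of the open block

    def handle_starttag(self, tag, attrs):
        # Open a new block only when we are not already inside one.
        if tag in BLOCK_TAGS and self._tag is None:
            self._tag = tag
            self._parts = []

    def handle_data(self, data):
        if self._tag is not None:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if tag == self._tag:
            # Collapse runs of whitespace and skip empty blocks.
            text = " ".join("".join(self._parts).split())
            if text:
                self.blocks.append((tag, text))
            self._tag = None


def html_to_blocks(html):
    """Return the list of (tag, text) blocks found in an HTML string."""
    parser = BlockExtractor()
    parser.feed(html)
    parser.close()
    return parser.blocks
```

Each block keeps its tag, so a later navigation step can tell headings from paragraphs.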
For navigation, we made another class structure that handles all navigation requests and responds accordingly.
The output then goes straight to the terminal interface.
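A navigation class of this kind could look roughly like the sketch below. The class name `Navigator`, its method names, and the command strings (`next`, `prev`, `goto N`) are hypothetical and chosen only for illustration; the project's real class is not shown in this post.

```python
class Navigator:
    """Keep a cursor over a list of text blocks and answer navigation requests."""

    def __init__(self, blocks):
        if not blocks:
            raise ValueError("document has no blocks")
        self.blocks = list(blocks)
        self.position = 0

    def current(self):
        return self.blocks[self.position]

    def next(self):
        # Stay on the last block instead of running past the end.
        self.position = min(self.position + 1, len(self.blocks) - 1)
        return self.current()

    def previous(self):
        # Stay on the first block instead of going below zero.
        self.position = max(self.position - 1, 0)
        return self.current()

    def goto(self, number):
        """Jump to a 1-based block number."""
        if not 1 <= number <= len(self.blocks):
            raise IndexError(f"block {number} out of range")
        self.position = number - 1
        return self.current()

    def handle(self, request):
        """Dispatch a text command such as 'next', 'prev' or 'goto 3'."""
        command, _, argument = request.strip().partition(" ")
        if command == "next":
            return self.next()
        if command == "prev":
            return self.previous()
        if command == "goto" and argument.strip().isdigit():
            return self.goto(int(argument))
        raise ValueError(f"unknown request: {request!r}")
```

Keeping the cursor inside one object means the terminal front end only has to pass the user's request to `handle` and print whatever comes back.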

PDF

For PDF, we used the PDFMiner library to extract the data from the PDF file and convert it into XML format. We wrote various functions to handle different requests.

For example, the header content search function is called when the user wants to search for specific data within the headers.
Another function builds the list of all the headers in the document, which can be printed on demand.
A general search function finds any particular heading, paragraph, or word in the whole text.
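The functions described above might be sketched as follows over an XML tree. The element names `heading` and `para`, the sample document, and the function names are assumptions made for this example; the XML layout the project actually produces is not shown in this post.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML layout, used only to make the sketch runnable.
SAMPLE_XML = """
<document>
  <heading>Introduction</heading>
  <para>Braille displays render one line at a time.</para>
  <heading>Related Work</heading>
  <para>Earlier readers relied on printed Braille books.</para>
</document>
"""


def list_headers(root):
    """Return the text of every heading, in document order."""
    return [h.text.strip() for h in root.iter("heading") if h.text]


def search_headers(root, query):
    """Return the headings that contain the query (case-insensitive)."""
    query = query.lower()
    return [h for h in list_headers(root) if query in h.lower()]


def search_text(root, query):
    """Return (tag, text) for every heading or paragraph containing the query."""
    query = query.lower()
    hits = []
    for element in root.iter():
        if element.tag in ("heading", "para") and element.text:
            text = element.text.strip()
            if query in text.lower():
                hits.append((element.tag, text))
    return hits


root = ET.fromstring(SAMPLE_XML)
```

With this sample, `search_headers(root, "work")` returns only the "Related Work" heading, while `search_text(root, "braille")` returns both paragraphs.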





Output to the terminal interface is the same as for ePub files. With the help of these functions and libraries, we implement the PDF part.

Daisy books

For Daisy files, we first had to unzip the Daisy resource folder using the Python unzip library. We parsed the whole XML file as an element tree using Python's xml.etree.ElementTree library, and then used the Beautiful Soup library again to get the contents out of the XML content file.
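The unzip-and-parse step can be sketched with the standard library alone. This is a minimal example, assuming the Daisy book arrives as a zip archive; the function names and the stand-in file `book.xml` are made up for illustration, and the Beautiful Soup step is left out.

```python
import tempfile
import xml.etree.ElementTree as ET
import zipfile
from pathlib import Path


def extract_daisy(archive_path, target_dir):
    """Unzip the archive and return the paths of the XML files inside it."""
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target_dir)
    return sorted(Path(target_dir).rglob("*.xml"))


def parse_content(xml_path):
    """Parse one XML file into an element tree and return its root."""
    return ET.parse(xml_path).getroot()


# Build a tiny stand-in archive so the sketch runs without a real Daisy book.
# The file name "book.xml" and its elements are invented for this example.
with tempfile.TemporaryDirectory() as work_dir:
    archive_path = Path(work_dir) / "book.zip"
    with zipfile.ZipFile(archive_path, "w") as archive:
        archive.writestr("book.xml", "<book><h1>Title</h1><p>Hello</p></book>")

    xml_files = extract_daisy(archive_path, Path(work_dir) / "unzipped")
    root = parse_content(xml_files[0])
    tags = [child.tag for child in root]
```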
This allows us to parse large documents quickly and efficiently. We included various navigation functions, such as moving to the previous or next page, paragraph, or header; jumping to a specific page, paragraph, or header; searching and replacing text; and printing the table of contents.
We also parse the content paragraph by paragraph, because the RBD device has only 50-100 MB of storage memory.
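One way to read a document paragraph by paragraph without holding all of its text in memory is incremental parsing. The sketch below uses `xml.etree.ElementTree.iterparse`; it is an assumption about how such parsing could be done, not the project's actual method. The function name `iter_paragraphs` and the tag name `p` are hypothetical, and a real Daisy file may use namespaced tags.

```python
import io
import xml.etree.ElementTree as ET


def iter_paragraphs(source, tag="p"):
    """Yield paragraph text one at a time, clearing each element afterwards."""
    for _event, element in ET.iterparse(source, events=("end",)):
        if element.tag == tag:
            # Gather the text of the paragraph and its inline children.
            text = " ".join("".join(element.itertext()).split())
            element.clear()  # drop the element's children and text
            if text:
                yield text


sample = io.BytesIO(b"<book><p>First para.</p><p>Second <em>para</em>.</p></book>")
paragraphs = list(iter_paragraphs(sample))
```

Because the function is a generator, the caller can send one paragraph to the display and only then ask for the next. The cleared elements stay attached to the root as empty shells, so memory use is reduced rather than eliminated.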
