Please use this identifier to cite or link to this item:
https://elibrary.khec.edu.np/handle/123456789/1017
Title: | RAN2DEV: CONVERTING RANJANA LIPI TO DEVANAGARI (With Modifiers: "Aakar" & "Dirgha Ikar") |
Authors: | Nekesh Koju (770321), Nischal Baidar (770323), Sarif Tachamo (770338), Surag Basukala (770346) |
Advisor: | Er. Shree Ram Khaitu |
Keywords: | Ranjana Script, Optical Character Recognition, Ran2Dev Model, MLflow, TensorBoard, Modifiers, Mobile Application |
Issue Date: | 2025 |
College Name: | Khwopa Engineering College |
Level: | Bachelor's Degree |
Degree: | BE Computer |
Department Name: | Department of Computer Engineering |
Abstract: | Newar (Nepal Bhasa) is the language of the Newar people, the original inhabitants of the Kathmandu Valley. It is an important part of Nepal's history and culture. The language developed from Pali and was traditionally written using different Brahmic scripts, with the Ranjana script being one of the most prominent since the 7th century. Unfortunately, the use of Nepal Bhasa declined dramatically, from about 75% of the population to just 44% in the early 1990s. As of 2021, there are around 1.3 million Newars, with 880,000 native speakers. In a positive move, Bagmati Province officially recognized Nepal Bhasa in 2024, helping to revive and preserve the language. In our 7th semester, we focused on building an OCR (Optical Character Recognition) system that reads Ranjana script and converts it into Devanagari text, with both word-level and character-level prediction. We trained our models on a dataset of over 20,000 samples. The LeNet-5 model, using 32 × 32 input images, achieved 99.10% accuracy, while our custom CNN, Ran2Dev, with 64 × 64 inputs, performed even better at 99.74%. We used MLflow to track epoch-wise accuracy and loss, compare training runs visually and quantitatively, and manage our experiment history. Finally, we deployed the system as a web app on the Render platform so anyone worldwide could try it online. In the 8th semester, we improved the system to support the Aakar and Dirgha Ikar modifiers, which are essential for full-word OCR accuracy. We discovered three distinct visual forms of Aakar in Ranjana script, so our team members hand-wrote and collected 1,500 samples of each form to build a balanced modifier dataset. We also re-evaluated both LeNet-5 and Ran2Dev using the same 32 × 32 input size to compare their performance under identical conditions.
To further boost accuracy and generalization, we developed Ran2Dev-V2, featuring max pooling combined with global average pooling, dropout layers (0.25 to 0.4) to reduce overfitting, batch normalization for faster convergence, and dilated convolutions to capture broader contextual features. Ran2Dev-V2 outperformed both the Ran2Dev and LeNet-5 models, achieving an accuracy of 99.89%. This is an absolute improvement of 0.15 percentage points over Ran2Dev (99.74%) and 0.79 percentage points over LeNet-5 (99.10%). We have also created both web and mobile applications to bring our OCR system directly to users' devices. Overall, these enhancements make our OCR pipeline more robust, accurate, and accessible for preserving the Ranjana script and supporting the Newar language. |
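The gains of 0.15 and 0.79 quoted in the abstract are differences between accuracy percentages, that is, percentage points. The minimal sketch below shows how the same three reported accuracies can be compared in other ways. It is not taken from the report: the helper name `compare_accuracy` and its dictionary keys are hypothetical, and only the accuracy values come from the abstract.

```python
def compare_accuracy(new_acc: float, old_acc: float) -> dict:
    """Compare two accuracies given in percent (hypothetical helper)."""
    abs_points = new_acc - old_acc            # difference in percentage points
    rel_gain = abs_points / old_acc * 100     # gain relative to the old accuracy, in %
    old_err = 100 - old_acc                   # error rate of the old model, in %
    new_err = 100 - new_acc                   # error rate of the new model, in %
    err_reduction = (old_err - new_err) / old_err * 100  # share of errors removed, in %
    return {
        "abs_points": round(abs_points, 2),
        "rel_gain_pct": round(rel_gain, 2),
        "err_reduction_pct": round(err_reduction, 2),
    }


# Accuracies as reported in the abstract (percent).
baselines = {"LeNet-5": 99.10, "Ran2Dev": 99.74}
ran2dev_v2 = 99.89

for name, acc in baselines.items():
    print(name, compare_accuracy(ran2dev_v2, acc))
```

Because all three models are already above 99%, the percentage-point gaps look small. Expressed as a reduction in error rate, the same numbers correspond to roughly 58% fewer errors than Ran2Dev and roughly 88% fewer than LeNet-5, assuming the accuracies were measured on comparable test sets.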
URI: | https://elibrary.khec.edu.np/handle/123456789/1017 |
Appears in Collections: | PU Computer Report |
Files in This Item:
File | Size | Format | |
---|---|---|---|
RAN2DEV CONVERTING RANJANA LIPI TO Devnagari.pdf (Restricted Access) | 9.82 MB | Adobe PDF | |