A LINGUAL AGNOSTIC INFORMATION RETRIEVAL SYSTEM


Abstract

Growing global internet access, and its influence on individuals and on the economic development of nations, is yet to be reflected in the linguistic diversity of the world’s 7000 languages. Today, close to 4 billion native speakers of low-resource languages are not active cyber participants, a number too significant to be excluded from cyber participation. However, prior efforts to address this cyber exclusion of natives have been socio-economic in nature, culminating in training, empowerment, and digital access, while leaving the indelible hurt of language inequities in place. Since cyber participation happens largely through human interaction with cyber applications in a human language, encapsulating these applications so that they can be used in any human language will help prevent the cyber exclusion of natives and avoid the hurt of language inequities. In particular, although the Information Retrieval System (IRS) remains a critical cyber application and a gateway to cyber participation, no existing IRS lets monolingual natives of under-resourced languages search and retrieve information from the web in their native languages regardless of the language in which that information was originally preserved; not even a feasible architectural blueprint exists. This study therefore designed, and established the feasibility of, a lingual agnostic IRS architectural blueprint that will aid monolingual natives’ participatory access to cyberspace.

The study adopted the design science research methodology. The lingual agnostic IRS architecture was designed on the principles of transparent user-language detection, transparent information translation, and caching. The detailed design of the architecture was done using the Unified Modeling Language. The designed IRS architecture was implemented using agile and component-based software engineering approaches. The implementation language was Java, and the deployment environment was NetBeans 5.1. The default language of the IRS was English, and the selected native languages were other languages common in Nigeria for which bidirectional translation software with English is available: Hausa, Yoruba, Igbo, and Arabic. The resulting IRS was evaluated using heuristic and systems evaluation methods for parity of language of interaction against the default English language, on a corpus of 86 English-language documents (journal articles) from the BADALA journal archive, stored as .txt flat files.
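The three design principles named above (transparent language detection, transparent translation, and caching) can be illustrated with a minimal Java sketch. This is not the study's implementation: the class name `LairsSketch`, the `LanguageDetector` and `Translator` interfaces, the substring matching, and the toy stubs in `main` are all hypothetical stand-ins chosen for illustration, and a real system would delegate detection and translation to external software.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Illustrative sketch only; every name here is an assumption, not the study's code. */
final class LairsSketch {

    /** Hypothetical detector: returns a language code for the given text. */
    interface LanguageDetector { String detect(String text); }

    /** Hypothetical translator between two language codes. */
    interface Translator { String translate(String text, String from, String to); }

    static final String DEFAULT_LANG = "en"; // the corpus is stored in English

    private final LanguageDetector detector;
    private final Translator translator;
    private final List<String> corpus;
    private final Map<String, String> cache = new HashMap<>(); // translation cache

    LairsSketch(LanguageDetector detector, Translator translator, List<String> corpus) {
        this.detector = detector;
        this.translator = translator;
        this.corpus = corpus;
    }

    /** Translates text, reusing a cached result when the same request was seen before. */
    private String translateCached(String text, String from, String to) {
        if (from.equals(to)) {
            return text; // nothing to translate
        }
        String key = from + ">" + to + ":" + text;
        return cache.computeIfAbsent(key, k -> translator.translate(text, from, to));
    }

    /** Detects the query language, searches in English, and answers in the user's language. */
    List<String> search(String query) {
        String userLang = detector.detect(query);
        String englishQuery =
                translateCached(query, userLang, DEFAULT_LANG).toLowerCase(Locale.ROOT);
        List<String> results = new ArrayList<>();
        for (String doc : corpus) {
            // Naive substring match stands in for a real retrieval model.
            if (doc.toLowerCase(Locale.ROOT).contains(englishQuery)) {
                results.add(translateCached(doc, DEFAULT_LANG, userLang));
            }
        }
        return results;
    }

    public static void main(String[] args) {
        // Toy stubs: one known Yoruba word, and a tag instead of a real translation.
        LanguageDetector detector = text -> text.equals("omi") ? "yo" : "en";
        Translator translator = (text, from, to) ->
                text.equals("omi") ? "water" : "[" + to + "] " + text;
        LairsSketch irs = new LairsSketch(detector, translator,
                List.of("Water quality in rivers", "Soil studies"));
        System.out.println(irs.search("omi"));
    }
}
```

The user never names a language or requests a translation, which is the sense of "transparency" in this sketch; the cache simply avoids repeating a translation for a query or document that has already been handled.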

The designed lingual agnostic IRS (LAIRS) was shown to be highly stable across queries and languages, guaranteeing 86% parity with the default English language when the selected languages are used for information access and retrieval. This parity of language interaction can be as high as 90% for most of the selected languages. The implication is that the designed lingual agnostic IRS architecture can mitigate the language inequities bedeviling the cyber participation of monolingual natives by about 90%; that is, with such an IRS deployed, monolingual natives can participate in, contribute to, or benefit from cyberspace with no more than 10% additional strain relative to their English-language counterparts. Furthermore, LAIRS was shown to be the most appropriate IRS for addressing the problem of language barriers to cyber inclusion when compared with existing IRSs.
