1. INTRODUCTION

This chapter presents the background of the research and the rationale for undertaking it. It also presents the overall objective, the methodology, the significance of the study, its scope and limitations, and the problem to be addressed. Finally, it outlines the organization of the thesis.

1.1. Background

Since the Internet was introduced in 1969, it has grown rapidly and is expected to continue to do so in the coming years. The World Wide Web (WWW) is a very popular and interactive information resource that acts as a storehouse for images, text, audio, video, and metadata.

The amount of information available on the Web is huge and noisy. The noise comes from two major sources. First, a typical Web page contains many pieces of information, e.g., the main content of the page, advertisements, copyright notices, navigation links, and privacy policies. Second, the Web has no quality control of information, i.e., anyone can write anything they like.

The popularity of the WWW depends largely on search engines. A web search engine is a software system designed to search for information on the World Wide Web. With search engines, anyone can quickly search for helpful directions and tips, personal information, recipes, vacancies, pictures, organizational websites, and more. Search engines consist of four discrete software components, three of which are the following:
- Crawler or spider: the part of the search engine that combs through the pages on the Internet and gathers information for the search engine.
- Indexer: a blender-like program that dissects the web pages downloaded by the spiders.
- Database: the store that holds all of the information a web crawler retrieves. Every time you use a search engine, it is this database you are searching, not the live Internet.

Figure 1: Architecture of a standard web search engine.
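The indexer and database roles described above can be illustrated with a minimal sketch. It assumes the crawler has already downloaded the pages; the URLs, the page texts, and the function names `build_index` and `search` are hypothetical and are not taken from any particular search engine.

```python
from collections import defaultdict


def build_index(pages):
    """Indexer: map each lowercase word to the set of page URLs containing it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def search(index, query):
    """Return the URLs that contain every query word, using only the index."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Hypothetical pages standing in for what a crawler would have downloaded.
crawled_pages = {
    "http://example.com/a": "Amharic voice search for the web",
    "http://example.com/b": "Braille learning resources",
}
index = build_index(crawled_pages)
print(search(index, "voice search"))
```

The query is answered entirely from the stored index, which mirrors the point that a search runs against the engine's database rather than the live Internet.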
This vast amount of data makes searching increasingly difficult with traditional search engines, because they return a huge volume of results for a given query, consisting of relevant as well as irrelevant data [2, 3]. This not only wastes the user's time but also leads to a data overload problem. As a result, users are not satisfied with searching for information using traditional search engines.

Re-ranking search results has therefore become one of the main problems in the information retrieval (IR) field. Exactly what information the user wants is unpredictable, so web page ranking algorithms are designed to anticipate user requirements from various static features (e.g., number of hyperlinks, textual content) and dynamic features (e.g., popularity). The ranking algorithms developed so far include PageRank, Weighted PageRank, Page Content Rank, HITS, SALSA, Subspace HITS, and SimRank.
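As an illustration of the link-based family named above, the following is a minimal sketch of the standard PageRank power iteration. It is not the enhanced algorithm proposed in this study; the toy link graph, the damping factor of 0.85, and the fixed iteration count are assumptions chosen only for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Standard PageRank by power iteration.

    links maps each page to the list of pages it links to.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page starts each round with the "random jump" share.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its damped rank equally among its out-links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling page: spread its damped rank evenly over all pages.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical three-page web: A links to B and C, B links to C, C links to A.
toy_web = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(toy_web)
for page, score in sorted(ranks.items(), key=lambda item: item[1], reverse=True):
    print(page, round(score, 3))
```

Page C ranks highest in this toy graph because it receives links from both A and B, which shows how a static feature such as the number of incoming hyperlinks drives the ordering.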
Most of these algorithms are based on either web structure mining or web content mining. Web content mining extracts useful information from the content of web documents, whereas web structure mining examines the links between references and referents on the Web [4]. In this thesis, an Enhanced Weighted PageRank (EWPR) algorithm is proposed for search engines; it builds on the Weighted PageRank algorithm and takes a weight factor (WF) into account. The main purpose of the proposed algorithm is to find more relevant information for the user's queries [5].

There is one group of users, however, that has been largely ignored in the rush to use the WWW. These are blind or visually impaired users, who have particular problems in accessing web material. For sighted users, accessibility issues may be a matter of response time, getting lost in cyberspace, or poor search engine retrieval performance; these issues may be inconvenient, but they are not serious obstacles. The considerations are not the same for a blind or visually impaired user trying to access the same information. The accessibility problems of a visually impaired person cover all those of sighted users, plus a number that are unique to this group. These include screen design, font size, color, patterned screen backgrounds that make text difficult to read, and an excess of graphics. These features, designed to appeal to the sighted user, may make Internet pages inaccessible to a visually impaired user.

1.2. Statement of the Problem

According to the Central Statistical Agency (CSA) of Ethiopia, there are about 800,000 blind people and more than 1.4 low-vision learners who are deprived of the Internet. Visually impaired learners are left to learn through the conventional "talk and Braille" method led by the teacher. They cannot use functionalities such as the linking of web pages through the available browsers in their local language to acquire information and knowledge from the Web. They are also deprived of Internet services such as sending and receiving e-mail, unlike their sighted peers. Most importantly, because of the rapid development of the Internet and the exponential growth in the amount of information, it has become difficult for the visually impaired to find relevant information through search engines. It has become difficult for users in general to extract and filter the most relevant information. Visually impaired users access the Internet through voice recognition, and search results are delivered to them by converting the contents of the pages to speech and reading them aloud. For this reason, search engine results should be re-ranked to give priority to pages that are relevant and not dominated by graphic content.
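The re-ranking idea stated above can be sketched as follows. This is only an illustrative scoring rule and not the algorithm developed in this thesis: the relevance measure (query-term overlap), the graphics penalty, the `graphics_weight` value, and the page fields `text` and `image_count` are all assumptions made for the example.

```python
def accessibility_score(page, query, graphics_weight=0.5):
    """Hypothetical score: query-term overlap minus a penalty for image-heavy pages."""
    query_terms = set(query.lower().split())
    words = page["text"].lower().split()
    if not query_terms or not words:
        return 0.0
    # Fraction of query terms that appear in the page text.
    relevance = len(query_terms & set(words)) / len(query_terms)
    # Share of the page made up of images rather than readable words.
    graphics_ratio = page["image_count"] / (page["image_count"] + len(words))
    return relevance - graphics_weight * graphics_ratio


def rerank(results, query):
    """Sort search results so that relevant, text-rich pages come first."""
    return sorted(results, key=lambda p: accessibility_score(p, query), reverse=True)


# Hypothetical search results; both pages match the query equally well.
results = [
    {"url": "http://example.com/gallery", "text": "amharic news photos", "image_count": 40},
    {"url": "http://example.com/article", "text": "amharic news report on education today", "image_count": 1},
]
reranked = rerank(results, "amharic news")
print([page["url"] for page in reranked])
```

Both pages contain every query term, so the text-rich article moves ahead of the image gallery purely because of the graphics penalty, which is the behavior a voice-output user would benefit from.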
Although the World Wide Web (WWW) supports everyone's daily activities, especially online education, people with disabilities, and visually impaired learners in particular, still face limitations in accessing information. To avoid the social exclusion of people with disabilities, web accessibility is a requirement for websites. Today, a large number of websites fail to meet web accessibility requirements. Therefore, this research focuses on accessibility through the development of an Amharic voice browser plug-in that helps the visually impaired seek information via the Internet.

1.3. The Objectives of the Study

1.3.1. General Objective

The general objective of this study is to examine usability and accessibility in order to develop an Amharic voice-based browser plug-in as an assistive tool that helps the visually impaired seek information via the Internet. More importantly, it aims to enable visually impaired users to access relevant web pages by enhancing the existing content-based relevancy algorithm, so that web pages that are relevant and can easily be synthesized to voice are ranked first. Developing this interactive Amharic voice recognizer browser plug-in and integrating the enhanced algorithm with it will help visually impaired learners access information through the Internet as part of their access to virtual learning.

1.3.2. Specific Objectives

- To explore the challenges faced by visually impaired learners in accessing the virtual learning environment.
- To design and develop a browser plug-in that enables them to browse the Internet through an Amharic and English voice recognition system.
- To enhance the existing content-based web page re-ranking algorithm to re-rank search engine results so that pages which are accessible and relevant to visually impaired users come first.
- To make the browser plug-in accessible, allowing them to navigate the Internet with less complexity by using speech as an alternative medium for input and output.

1.4. Specific Research Questions

The proposed research shall implicitly discuss the following exploratory questions:
- Whether Amharic voice-based web searching can be applied as a replacement for the conventional "talk and Braille" method of learning.
- What advantage the Amharic voice-based web searching system offers in translating documents, in terms of accuracy and precision, compared with the existing manual Amharic-English dictionary.
The research questions for the proposed research are:
(i) How can the performance of the proposed enhanced algorithm be evaluated in terms of page re-ranking results?
(ii) Does the enhanced Content Based Web Page Re-Ranking Using Relevancy Algorithm provide relevant pages to visually impaired users?
(iii) Is Amharic voice-based web searching useful in closing the gap between visually impaired and sighted web users?

1.5. Theoretical / Conceptual Framework

The theoretical framework for the study is built on a large body of experience reports on web page re-ranking algorithms. A review of the literature shows a mismatch between the page ranking provided by search engines and the interests of visually impaired users. Search engine companies use general page ranking algorithms to rank web pages on the World Wide Web; these rankings are designed for manually typed user queries rather than the interests of specific users. First, most existing search engines focus on manual keyboard query entry, and the results are not well suited to visually impaired users, since these users access the results as voice output. Second, even where search engines offer voice search, the results and their ranking are the same as for manual keyboard search, which is intended for sighted users.

1.6. Scope and Limitation of the Study

The scope of this research is to enhance the existing content-based relevancy web page re-ranking algorithm and to develop an interactive Amharic voice-based browser plug-in integrated with the algorithm, as an assistive tool that helps the visually impaired seek information via the Internet. The plug-in comprises recognizing Amharic speech, processing the recognized tokens or commands, searching with the Google search engine, re-ranking the search results with the re-ranking application, handling questions and answers as artificial intelligence, and finally reading the search results aloud to visually impaired users.

Because Amharic is by nature a complicated language when it comes to spelling words, the accuracy of translation is considered a limitation. The plug-in will be developed for the Google Chrome browser. This research will contribute a new way of re-ranking web pages for visually impaired Internet users with Amharic voice search.

1.7. Significance of the Study

A page ranking algorithm has the potential to extract useful pages or documents from the huge collection of data on the Web, fulfilling the needs of web users. The beneficiaries of this research are visually impaired and low-vision Internet users.
This research enables visually impaired users to access the Internet by voice and to get relevant information using the proposed page re-ranking algorithm. It will close the gap between visually impaired and sighted Internet users in the digital learning environment by providing relevant information through a new way of re-ranking search engine results. Local visually impaired Amharic speakers will be able to surf the Internet through the interactive Amharic voice recognizer plug-in. The proposed web page re-ranking algorithm can also be used by other researchers for further research with other languages across the world and for multilingual voice-based searching.

1.8. Organization of the Thesis

The rest of the study is organized as follows. Chapter two covers the related literature on the basic concepts of page ranking algorithms: the different types of page ranking algorithms, a comparison of existing algorithms using different metrics, their advantages and disadvantages, issues of web page accessibility, and a review of existing speech recognition technologies and related work on page ranking algorithms. Chapter three describes the research methodology and strategy, which aim to identify the potential problems that visually impaired users face in accessing the Internet. It begins by describing exploratory research approaches. Primary data collection and analysis techniques are described as the information acquisition method, and validity and reliability requirements are also identified. Chapter four presents the results of the interviews and the techniques and technologies required to design the prototype. Chapter five addresses the background of the architectural design of the proposed algorithm, the implementation of the Amharic voice-enabled plug-in, and the integration of the algorithm with the plug-in. Finally, chapter six presents conclusions about the research and suggestions for future research directions.