Ontology Learning from Taxonomic Text for Agricultural Knowledge Management

Date
2019
Publisher
ICAR-INDIAN AGRICULTURAL STATISTICS RESEARCH INSTITUTE, ICAR-INDIAN AGRICULTURAL RESEARCH INSTITUTE, NEW DELHI
Abstract
Ontology learning from taxonomic text is a novel approach to learning an ontology from semi-structured taxonomic text. Traditional ontology learning in the available literature mostly focuses on learning from large text corpora. The present study concentrates on a specialized kind of text, namely taxonomic text, and exploits its typical characteristics. The study is divided into four broad parts. First, a text corpus was developed from the taxonomic text of the USDA Soil Taxonomy and enhanced by automated scraping of Wikipedia, using seed words given by domain experts. With the corpus in place, keyword extraction proved a challenging task; a heuristic method based on RAKE guided by W2V was developed for it. The second part deals with taxonomy induction from the text that contains the core taxonomy; machine learning techniques for text classification were used to segregate the core taxonomy part. The third part deals with taxonomy induction from the non-taxonomic part of the text, using hierarchical clustering. The fourth and last part deals with finding connections between the taxonomic and non-taxonomic classes induced in the second and third parts. Several empirical results are provided and validated on the USDA Soil Taxonomy using suitable tools and techniques. The study as a whole involved a wide range of technologies and software. Most of the algorithms are implemented in the Python programming language; some of the experiments involve Java and SQL Server. Protégé was also used to study an existing manually developed ontology.
Keywords: Ontology Learning; taxonomic text; USDA soil taxonomy; RAKE; W2V
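The abstract names RAKE as the basis of the keyword extraction step. As a rough illustration only, the following minimal sketch shows the plain RAKE scoring idea (candidate phrases split at stopwords and punctuation, each word scored as degree divided by frequency, each phrase scored as the sum of its word scores). It does not reproduce the W2V guidance or the heuristic developed in the study; the stopword list, the function names `candidate_phrases` and `rake_scores`, and the sample sentence are all assumptions made for this example.

```python
import re
from collections import defaultdict

# Tiny illustrative stopword list (an assumption; practical RAKE setups
# use a much larger list).
STOPWORDS = {"a", "an", "and", "are", "by", "for", "has", "have",
             "in", "is", "of", "that", "the", "to", "with"}


def candidate_phrases(text):
    """Split text into candidate phrases at punctuation and stopwords."""
    phrases = []
    # Split on punctuation first so phrases never cross clause boundaries.
    for fragment in re.split(r"[.,;:!?()\n]+", text.lower()):
        current = []
        for word in re.findall(r"[a-z0-9]+", fragment):
            if word in STOPWORDS:
                if current:
                    phrases.append(tuple(current))
                    current = []
            else:
                current.append(word)
        if current:
            phrases.append(tuple(current))
    return phrases


def rake_scores(text):
    """Return a dict mapping each candidate phrase to its RAKE score."""
    phrases = candidate_phrases(text)
    freq = defaultdict(int)    # how often each word occurs
    degree = defaultdict(int)  # co-occurrence degree, counting the word itself
    for phrase in phrases:
        for word in phrase:
            freq[word] += 1
            degree[word] += len(phrase)
    word_score = {w: degree[w] / freq[w] for w in freq}
    return {" ".join(p): sum(word_score[w] for w in p) for p in set(phrases)}


if __name__ == "__main__":
    sample = ("Mollisols are soils that have a mollic epipedon. "
              "Aridisols are soils of arid regions.")
    for phrase, score in sorted(rake_scores(sample).items(),
                                key=lambda item: -item[1]):
        print(f"{score:.1f}  {phrase}")
```

On the sample sentence, multi-word phrases such as "mollic epipedon" score higher than single words such as "soils", which is the behaviour that makes RAKE a plausible starting point for domain terms in taxonomic text.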
Description
T-10262