SOTW: Semantics Oriented Tagging of Web Pages
Akshith Gunasheelan1* and Gerard Deepak2
1Compute Cloud Services Engineering, Hewlett Packard Enterprise
2Department of Computer Science and Engineering, Manipal Institute of Technology Bengaluru, Manipal Academy of Higher Education, Manipal, India
*Corresponding Author: Akshith Gunasheelan, Compute Cloud Services Engineering, Hewlett Packard Enterprise.
Published: September 11, 2024
Abstract  
There is a need for a strategic, semantically inclined model for tagging web pages in the era of Web 3.0. This paper proposes a semantics-oriented, knowledge-driven, learning-infused model for tagging web pages. The model extracts terms and categories from web pages and applies TF-IDF and Structural Topical Modelling (STM), then generates RDF instances and enriches the resulting entities through the Wikidata API. The proposed framework also uses a Logistic Regression Classifier Unit (LRU), which takes the RDF subject and object instances as features to classify the web page dataset. A knowledge stack is generated dynamically by a semantic agent and classified using a deep learning CNN classifier, which increases the overall learning capability of the model. Semantics-oriented reasoning is achieved using the Adaptive Pointwise Mutual Information (APMI) measure with differential step deviance measures, and the shuffled frog-leap algorithm performs the metaheuristic optimization by refining the intermediate results. The proposed model achieves an overall precision of 95.18%, a False Discovery Rate (FDR) of 0.05, and an F-measure of 96.30%, which makes it the best-in-class model when compared with the baseline models for semantics-oriented learning through web page tagging.
Keywords: Normalized Pointwise Mutual Information (NPMI); Pointwise Mutual Information (PMI); Adaptive Pointwise Mutual Information (APMI); Logistic Regression Classifier Unit (LRU)
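For orientation, the two standard measures named in the keywords, PMI and NPMI, can be sketched as below. This is a minimal illustration only: it assumes document-level co-occurrence counts and base-2 logarithms, and the function name `pmi_scores` is hypothetical. It does not reproduce the paper's APMI measure or its differential step deviance thresholds, which are not defined in this section.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_scores(documents):
    """Return {(term_a, term_b): (pmi, npmi)} for term pairs that co-occur.

    Probabilities are estimated from document-level co-occurrence, which is
    an assumption of this sketch rather than the paper's stated procedure.
    """
    n_docs = len(documents)
    term_counts = Counter()
    pair_counts = Counter()
    for doc in documents:
        terms = sorted(set(doc))  # count each term once per document
        term_counts.update(terms)
        pair_counts.update(combinations(terms, 2))

    scores = {}
    for (a, b), n_ab in pair_counts.items():
        p_a = term_counts[a] / n_docs
        p_b = term_counts[b] / n_docs
        p_ab = n_ab / n_docs
        # PMI(a, b) = log2( p(a, b) / (p(a) * p(b)) )
        pmi = math.log2(p_ab / (p_a * p_b))
        # NPMI(a, b) = PMI(a, b) / -log2 p(a, b), bounded in [-1, 1]
        denom = -math.log2(p_ab)
        # If the pair occurs in every document the ratio is 0/0;
        # returning 0.0 there is a convention chosen for this sketch.
        npmi = pmi / denom if denom > 0 else 0.0
        scores[(a, b)] = (pmi, npmi)
    return scores


if __name__ == "__main__":
    docs = [
        ["semantic", "web"],
        ["semantic", "web"],
        ["tagging", "page"],
        ["tagging", "web"],
    ]
    for pair, (pmi, npmi) in sorted(pmi_scores(docs).items()):
        print(pair, round(pmi, 3), round(npmi, 3))
```

NPMI rescales PMI to the range [-1, 1], which makes scores comparable across term pairs with very different frequencies; pairs that never co-occur are simply absent from the returned dictionary.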