5/8/2023

Babelnet text classification

The recent breakthroughs in deep neural architectures across multiple machine learning fields have led to the widespread use of deep neural models. These learners are often applied as black-box models that ignore or insufficiently utilize a wealth of preexisting semantic information. In this study, we focus on the text classification task, investigating methods for augmenting the input to deep neural networks (DNNs) with semantic information. We extract semantics for the words in the preprocessed text from the WordNet semantic graph, in the form of weighted concept terms that form a semantic frequency vector. Concepts are selected via a variety of semantic disambiguation techniques, including a basic, a part-of-speech-based, and a semantic embedding projection method. Additionally, we consider a weight propagation mechanism that exploits semantic relationships in the concept graph and conveys a spreading activation component. We enrich word2vec embeddings with the resulting semantic vector through concatenation or replacement, and apply the semantically augmented word embeddings to the classification task via a DNN. Experimental results over established datasets demonstrate that our approach of semantic augmentation in the input space boosts classification performance significantly, with concatenation offering the best performance. We also note additional interesting findings regarding the behavior of term frequency-inverse document frequency (TF-IDF) normalization on semantic vectors, along with the potential for radical dimensionality reduction at negligible performance loss.
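As a rough illustration of the extraction step described above, the sketch below builds a semantic frequency vector from WordNet via NLTK. The "basic" disambiguation variant simply takes the first listed sense of each word, while the part-of-speech-based variant first filters candidate synsets by POS tag; the function names, POS mapping, and toy document are illustrative assumptions, not details from the paper.

```python
# Minimal sketch of semantic-vector extraction over WordNet (NLTK).
from collections import Counter

from nltk.corpus import wordnet as wn

# Assumed mapping from coarse POS tags to WordNet POS constants.
POS_MAP = {"NOUN": wn.NOUN, "VERB": wn.VERB, "ADJ": wn.ADJ, "ADV": wn.ADV}

def disambiguate(word, pos_tag=None):
    """POS-based variant: restrict synsets to the tagged part of speech;
    basic variant (no tag): plain first-sense selection."""
    if pos_tag in POS_MAP:
        synsets = wn.synsets(word, pos=POS_MAP[pos_tag])
    else:
        synsets = wn.synsets(word)
    return synsets[0] if synsets else None

def semantic_vector(tagged_tokens):
    """Count the disambiguated WordNet concepts in one document."""
    freqs = Counter()
    for word, pos_tag in tagged_tokens:
        synset = disambiguate(word, pos_tag)
        if synset is not None:
            freqs[synset.name()] += 1.0
    return freqs

doc = [("dog", "NOUN"), ("barks", "VERB"), ("loudly", "ADV")]
print(semantic_vector(doc))
# e.g. Counter({'dog.n.01': 1.0, 'bark.v.01': 1.0, 'loudly.r.01': 1.0})
```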
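The weight propagation mechanism can be read as a form of spreading activation over the concept graph: each concept passes a decayed share of its weight to its hypernyms, up to some depth. A minimal sketch of that idea follows; the decay factor and propagation depth are assumed hyperparameters, not values from the paper.

```python
# Sketch of spreading activation over WordNet hypernym edges.
from collections import Counter

from nltk.corpus import wordnet as wn

def propagate(freqs, decay=0.5, max_depth=2):
    """Spread each concept's weight to its hypernyms with per-hop decay."""
    spread = Counter(freqs)
    frontier = dict(freqs)
    for _ in range(max_depth):
        next_frontier = {}
        for name, weight in frontier.items():
            for hyper in wn.synset(name).hypernyms():
                contrib = decay * weight
                spread[hyper.name()] += contrib
                next_frontier[hyper.name()] = (
                    next_frontier.get(hyper.name(), 0.0) + contrib)
        frontier = next_frontier
    return spread

print(propagate(Counter({"dog.n.01": 1.0})))
# dog.n.01 keeps 1.0; its hypernyms (canine.n.02, domestic_animal.n.01)
# each receive 0.5, their hypernyms 0.25, and so on.
```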
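The enrichment step then fuses the two representations. Below is a sketch of the concatenation variant, assuming a pretrained word2vec model loaded with gensim, a fixed vocabulary of top concepts, and a hypothetical model path; the replacement variant would feed the semantic part alone to the classifier.

```python
# Sketch of the concatenation-based enrichment of word2vec embeddings.
import numpy as np
from gensim.models import KeyedVectors

# Hypothetical pretrained vectors and concept vocabulary.
wv = KeyedVectors.load_word2vec_format("vectors.bin", binary=True)
concept_vocab = ["dog.n.01", "canine.n.02", "bark.v.01"]  # top-N concepts

def augment(word, concept_freqs):
    """Concatenate the word embedding with the document's semantic vector."""
    semantic = np.array([concept_freqs.get(c, 0.0) for c in concept_vocab])
    return np.concatenate([wv[word], semantic])

x = augment("dog", {"dog.n.01": 1.0, "canine.n.02": 0.5})
print(x.shape)  # (303,) for 300-dim word2vec plus 3 concept dimensions
```

The resulting augmented vectors feed a standard DNN classifier; applying TF-IDF weighting to the semantic block before concatenation is one of the normalization choices the experiments examine.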
The rise of deep learning has been accompanied by a paradigm shift in machine learning and intelligent systems. In Natural Language Processing applications, this has been expressed via the success of distributed representations (Hinton et al. 1984) for text data on machine learning tasks. Such approaches evolve beyond the histogram-based accumulation used in methods like the bag-of-words model (Salton and Buckley 1988). They typically come in the form of text embeddings: vector space representations able to capture features beyond simple statistical properties. Instead of applying a handcrafted rule, text embeddings learn a transformation of the elements in the input. This avoids the common problem of extreme feature sparsity and mitigates the curse of dimensionality that usually plagues shallow representations.

There have been numerous approaches to learning text embeddings. Early attempts produce shallow vector space features to represent text elements, such as words and documents, via histogram-based methods (Katz 1987; Salton and Buckley 1988; Joachims 1998). Other approaches use topic modeling techniques, such as Latent Semantic Indexing (Deerwester et al. 1990) and Latent Dirichlet Allocation (Hingmire et al. 2013); in these cases, latent topics are inferred to form a new, efficient representation space for text. Regarding neural approaches, a neural language model applied on word sequences is used in Bengio et al. (2003) to jointly learn word embeddings and the probability function of the input word collection. Later approaches utilize convolutional deep networks, such as the unified multitask network of Collobert and Weston (2008), or introduce recurrent neural networks, as in Mikolov et al. (2010, 2011). In this way, deep neural models are used to learn semantically aware embeddings of words. The popular word2vec embeddings (Mikolov et al. 2013) not only maintain semantic relatedness between concepts but also support meaningful algebraic operations between them.
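As an illustration of those algebraic operations, a pretrained model loaded in gensim can reproduce the well-known king - man + woman ≈ queen analogy (the model file is assumed to be available locally):

```python
# Word-vector arithmetic with gensim's similarity API.
from gensim.models import KeyedVectors

wv = KeyedVectors.load_word2vec_format(
    "GoogleNews-vectors-negative300.bin", binary=True)
print(wv.most_similar(positive=["king", "woman"], negative=["man"], topn=1))
# Typically yields something like [('queen', 0.71)]
```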