A methodology based on graph knowledge and conceptual word relations for sentiment analysis
Publisher
Pontificia Universidad Católica del Perú
Full-text access restricted to the PUCP community
Abstract
Sentiment analysis has found applications in fields as varied as psychology, philosophy, sociology, marketing, economics, and education. Social media, in turn, has become a tool for people to express their opinions, especially in written form. In recent years, research based on graph knowledge has emerged as an innovative and promising Artificial Intelligence (AI) approach for obtaining a better structured representation of data. This work proposes an unsupervised methodology based on graph knowledge, specifically on the vectorization of nodes that represent the words of sentences together with their conceptual relations. As part of the methodology, dictionaries of words classified by polarity (positive, negative, and neutral) are built using VADER (Valence Aware Dictionary and sEntiment Reasoner), together with concepts drawn from the conceptual graphs of WordNet and ConceptNet. The methodology thereby captures co-occurrence relations and conceptual relations along with word polarity. An algorithm called Polarity-biased random Walk is also proposed, which builds paths through the graph using a polarity bias. The words contained in the paths obtained from the graph are then vectorized with the Skip-Gram algorithm. The results show that, with greater path depth and more paths per node, a bias of 0.95 with ConceptNet or WordNet yields better sentiment polarity classification than models such as Node2vec, GraphSAGE, Graph Attention Networks, and Graph Convolutional Networks. In addition, embeddings built from the IMDB movie dataset improve accuracy in domain-specific applications compared with models such as Word2Vec, FastText, GloVe, and BERT, the last of which obtains results very close to those of the proposed approach.
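The abstract does not spell out how the polarity bias steers a walk, so the following is only a minimal sketch of one plausible reading, not the author's algorithm. It assumes that, with probability `bias`, the next node is drawn from neighbours that share the polarity of the starting word, and otherwise from all neighbours. The function names, the toy graph, and the polarity labels are all hypothetical; in the described methodology the labels would come from VADER and the edges from co-occurrence, WordNet, or ConceptNet relations.

```python
import random


def polarity_biased_walk(graph, polarity, start, depth, bias=0.95, rng=None):
    """Return one walk of at most `depth` nodes beginning at `start`.

    ASSUMPTION (not taken from the source): with probability `bias` the
    next node is chosen among neighbours whose polarity equals that of
    the start node; otherwise it is chosen uniformly among all neighbours.
    """
    rng = rng or random.Random()
    target = polarity.get(start, "neutral")
    walk = [start]
    while len(walk) < depth:
        neighbours = graph.get(walk[-1], [])
        if not neighbours:
            break  # dead end: return the shorter walk
        same = [n for n in neighbours if polarity.get(n, "neutral") == target]
        if same and rng.random() < bias:
            walk.append(rng.choice(same))
        else:
            walk.append(rng.choice(neighbours))
    return walk


def generate_walks(graph, polarity, walks_per_node, depth, bias=0.95, seed=0):
    """Run `walks_per_node` biased walks from every node of the graph."""
    rng = random.Random(seed)
    return [
        polarity_biased_walk(graph, polarity, node, depth, bias, rng)
        for node in graph
        for _ in range(walks_per_node)
    ]


# Hypothetical toy data: adjacency lists and hand-assigned polarity labels.
graph = {
    "good": ["great", "movie", "bad"],
    "great": ["good", "movie"],
    "bad": ["awful", "movie", "good"],
    "awful": ["bad", "movie"],
    "movie": ["good", "great", "bad", "awful"],
}
polarity = {
    "good": "positive",
    "great": "positive",
    "bad": "negative",
    "awful": "negative",
    "movie": "neutral",
}

walks = generate_walks(graph, polarity, walks_per_node=2, depth=5)
```

Each walk is a list of words, so the collection can be treated as a corpus of sentences and passed to a Skip-Gram implementation (for example gensim's `Word2Vec` with `sg=1`) to obtain the node embeddings. The `depth` and `walks_per_node` arguments correspond to the path depth and number of paths per node mentioned in the abstract.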
Keywords
Sentiment analysis, Graph theory, Natural language processing (Computer science), Machine learning (Artificial intelligence)
Creative Commons License
Except where otherwise noted, this item's license is described as info:eu-repo/semantics/openAccess
