Ontologías de dominio para dar soporte al proceso de creación de diccionarios monolingües [Domain ontologies to support the process of creating monolingual dictionaries]
Date
2024-10-17
Publisher
Pontificia Universidad Católica del Perú
Abstract
Dictionaries have for centuries been fundamental to understanding and preserving languages. They have also served as the last word in settling many disputes, acting as definitive authorities on the meaning and correct usage of words. Their importance lies in giving readers a reliable, precise reference that supports effective communication and learning. Creating a dictionary, however, is an arduous and meticulous task that can take decades to complete. The process involves exhaustively collecting lexical data, analyzing words and their multiple meanings in detail, and verifying their use in different contexts. Each new edition requires considerable effort to incorporate changes in the language: adding new words, modifying existing definitions, and removing obsolete terms. This intensive work ensures that dictionaries remain valuable and relevant resources in a constantly evolving world.

Lexicography, the discipline devoted to compiling and studying dictionaries, faces numerous challenges in practice. Beyond designing the structure of the dictionary, the lexicographer must deal with the complexity of language, where one of the greatest challenges is polysemy. Words with multiple meanings require careful treatment so that definitions are precise and relevant to the context in which they will be used. This involves not only identifying all possible meanings of a word but also determining which of them is most appropriate for the dictionary's target reader. The lexicographer must also ensure that definitions are clear and understandable, avoiding ambiguity and providing usage examples that adequately illustrate each meaning. The challenge is amplified in a dynamic linguistic environment where language evolves constantly, which makes advanced tools that support decision-making and the efficient structuring of dictionary entries essential.

Currently, technologies such as TLex and Microsoft Word provide tools that focus on editing and presenting dictionary entries, aimed at detailing the selected definitions and improving the quality of their presentation. Language, however, behaves like a living entity that evolves constantly, reflecting social, cultural, and technological change. New words emerge, others fall into disuse, and meanings can shift over time. This evolutionary behavior poses a significant challenge for traditional lexicography.

This thesis proposes treating the lexicographic corpus as an organic, evolving entity, integrating ontologies and folksonomies to manage and adapt to this dynamism. Ontologies provide a hierarchical, organized structure of knowledge that allows the relationships between terms to be represented precisely. Folksonomies, which are collaborative classification systems, allow the actual use of language to be analyzed in a more flexible and adaptive way. Building on knowledge graphs makes it possible to perform detailed analyses and visualizations that help identify trends, relationships, and changes in language use. This approach not only facilitates the continuous updating and improvement of dictionaries but also makes it possible to offer more precise and relevant definitions.

Tools such as word clouds can give the lexicographer valuable information about definitions and their lexical composition. They visualize how frequently words are used, suggesting which terms are more common and therefore more easily understood by readers. This lets the lexicographer quickly identify which definitions are composed of more frequently used words, making it easier to create more accessible and relevant entries. Similarly, if some words within the definitions could be subject to some form of censorship, this information can be communicated to the lexicographer, who can then make informed decisions about which definitions to include in the dictionary according to the target audience. Integrating these tools into the lexicographic process improves the precision and relevance of definitions, helping the dictionary meet the expectations and needs of its readers.
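The frequency-based comparison of definitions described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names, the mean-frequency score, the toy corpus, and the `flagged` word list are all assumptions introduced here for illustration. The same word counts that would feed a word cloud are used to rank candidate definitions by how common their wording is, and any words from a caller-supplied list of sensitive terms are reported alongside each definition.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its word tokens (letters only, Unicode-aware)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def word_frequencies(corpus):
    """Count how often each word occurs across a list of usage texts."""
    counts = Counter()
    for text in corpus:
        counts.update(tokenize(text))
    return counts


def definition_score(definition, freqs):
    """Mean corpus frequency of the words in a candidate definition (hypothetical metric)."""
    words = tokenize(definition)
    if not words:
        return 0.0
    return sum(freqs[w] for w in words) / len(words)


def rank_definitions(definitions, corpus, flagged=frozenset()):
    """Return (score, definition, flagged_words) tuples, most common wording first."""
    freqs = word_frequencies(corpus)
    ranked = []
    for definition in definitions:
        hits = sorted(set(tokenize(definition)) & set(flagged))
        ranked.append((definition_score(definition, freqs), definition, hits))
    ranked.sort(key=lambda item: item[0], reverse=True)
    return ranked


if __name__ == "__main__":
    # Toy usage corpus and candidate definitions, invented for this example.
    corpus = ["the bank of the river", "the river is wide", "money in the bank"]
    candidates = ["land beside the river", "fiduciary monetary institution"]
    for score, definition, hits in rank_definitions(candidates, corpus, {"fiduciary"}):
        print(f"{score:.2f}  {definition}  flagged={hits}")
```

A mean over raw counts is the simplest possible score; a real pipeline would likely normalize by corpus size and filter out function words before comparing definitions.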
Keywords
Lexicography, Dictionaries, Linguistics--Data processing, Ontologies (Information retrieval systems), Classification
Creative Commons license
Except where otherwise noted, this item's license is described as info:eu-repo/semantics/openAccess