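Generación automática de resúmenes abstractivos mono documento utilizando análisis semántico y del discurso (Automatic generation of single-document abstractive summaries using semantic and discourse analysis)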

Valderrama Vilca, Gregory Cesar

dc.contributor.advisor	Sobrevilla Cabezudo, Marco Antonio
dc.contributor.author	Valderrama Vilca, Gregory Cesar	es_ES
dc.date.accessioned	2017-09-20T23:47:13Z	es_ES
dc.date.available	2017-09-20T23:47:13Z	es_ES
dc.date.created	2017	es_ES
dc.date.issued	2017-09-20	es_ES
dc.description.abstract	The web is a vast resource of data and information on security, health, education, and other matters of great utility to people, but producing a synthesis or summary of one or many documents is expensive labor that, done manually, may be impossible given the huge amount of data. Summary generation is a challenging task because it involves analyzing and comprehending text written in unstructured, context-dependent natural language, and the result must describe a synthesis of events or knowledge in a simple form that reads naturally to any reader. There are diverse approaches to summarization, categorized as extractive or abstractive. In the extractive technique, summaries are generated by selecting salient sentences from the source text. Abstractive summaries are created by regenerating the content extracted from the source text, reformulating phrases through term fusion, compression, or suppression. In this way, paraphrased sentences are obtained, or even sentences that were not in the original text. This type of summary is more likely to reach the coherence and fluency of one written by a human. The present work implements a method that integrates syntactic, semantic (AMR annotator), and discourse (RST) information into a conceptual graph. This graph is summarized using a new measure of concept similarity on WordNet. To find the most relevant concepts we use PageRank, considering all the discourse information given by applying the O'Donnell method. With the most important concepts and the semantic role information obtained from PropBank, a natural language generation method was implemented with the SimpleNLG tool. This work presents the results of applying the method to the Document Understanding Conference 2002 corpus, evaluated with the ROUGE metric, which is widely used in the automatic summarization task.
Our method reaches an F1 score of 24% on the ROUGE-1 metric for the single-document summary generation task. This shows that these techniques are workable and even advantageous, and the work recommends configurations and useful tools for this task.	es_ES
dc.description.uri	Thesis	es_ES
dc.identifier.uri	http://hdl.handle.net/20.500.12404/9361
dc.language.iso	eng	es_ES
dc.publisher	Pontificia Universidad Católica del Perú	es_ES
dc.publisher.country	PE	es_ES
dc.rights	info:eu-repo/semantics/openAccess	es_ES
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/2.5/pe/	*
dc.subject	Semantic computing	es_ES
dc.subject	Summaries	es_ES
dc.subject	Semantics	es_ES
dc.subject.ocde	https://purl.org/pe-repo/ocde/ford#1.02.00	es_ES
dc.title	Generación automática de resúmenes abstractivos mono documento utilizando análisis semántico y del discurso	es_ES
dc.type	info:eu-repo/semantics/masterThesis	es_ES
renati.discipline	611087	es_ES
renati.level	https://purl.org/pe-repo/renati/level#maestro	es_ES
renati.type	https://purl.org/pe-repo/renati/type#tesis	es_ES
thesis.degree.discipline	Informatics with a concentration in Computer Science	es_ES
thesis.degree.grantor	Pontificia Universidad Católica del Perú. Escuela de Posgrado	es_ES
thesis.degree.level	Master's	es_ES
thesis.degree.name	Master in Informatics with a concentration in Computer Science	es_ES