Automatic Text Simplification and Summarization

  • Dates: 18 to 22 January 2021
  • Time: 10:00 to 12:30
  • Organizers: Paloma Martínez and Lourdes Moreno, Computer Science Department, UC3M
  • Venue: online seminar access

Speaker: Horacio Saggion


Text Simplification: Automatic text simplification arose as an NLP task from the need to make electronic textual content equally accessible to everyone. It is a complex task that encompasses a number of operations applied to a text at different linguistic levels. The aim is to turn a complex text into a simplified variant, taking into consideration the specific needs of a particular target user. Automatic text simplification has traditionally had a double purpose: it can serve as a preprocessing tool for other NLP applications, and it can fulfil a social function, making content accessible to different users such as foreign language learners, readers with aphasia, and low-literacy individuals. The first attempts at text simplification were rule-based syntactic simplification systems; nowadays, however, with the availability of large parallel corpora such as the English Wikipedia and the Simple English Wikipedia, approaches to automatic text simplification have become more data-driven. Text simplification is a very active research topic where progress is still needed. In this seminar I will provide the audience with a panorama of more than a decade of work in the area, emphasizing also the social contribution that content simplification can make to the information society.

Text Summarization: A summary is a text with a very specific purpose: to give the reader a concise idea of the contents of another text. The idea of automatically producing summaries has a long history in the field of natural language processing. Nowadays, however, with the ever-growing amount of texts and messages available online in public or private networks, this research field has become, more than ever before, key for the information society. The generation of summaries or abstracts by computers has been addressed from different angles, starting with seminal work in the late fifties. The techniques applied first focused on the generation of sentence extracts, and several methods grounded in statistical techniques were proposed to assess the relevance of sentences in a document. In the eighties, symbolic Artificial Intelligence techniques, which considered summarization an example of text understanding, focused on the production of abstracts. Hybrid techniques combining symbolic and statistical approaches, sometimes relying on machine learning, became popular with a renewed interest in summarization in the late nineties. Nowadays, with the availability of huge volumes of text for training machine learning systems, several methods have emerged in the area of deep learning. In particular, neural networks today perform at the state of the art. Offering a historical perspective, I will go through relevant solutions in the area of text summarization, emphasizing the role of current machine learning systems. Likewise, I will describe the evaluation methods, challenges, and resources available for system development.
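
As a rough illustration of the early statistical sentence-extraction idea mentioned in the abstract, the following minimal sketch scores each sentence by the average document frequency of its words and keeps the top-scoring ones. It is not taken from the seminar: the stopword list, the naive sentence splitter, and the function names (split_sentences, tokenize, summarize) are all assumptions made for this example.

```python
import re
from collections import Counter

# Hypothetical stopword list, for illustration only; real systems use fuller lists.
STOPWORDS = {"the", "a", "an", "of", "to", "in", "and", "is", "are",
             "it", "on", "for", "with", "that", "this"}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase word tokens with stopwords removed."""
    return [w for w in re.findall(r"[a-z']+", sentence.lower())
            if w not in STOPWORDS]


def summarize(text, n_sentences=1):
    """Return the n highest-scoring sentences, kept in document order."""
    sentences = split_sentences(text)
    # Word frequencies over the whole document act as the relevance signal.
    freq = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        tokens = tokenize(sentence)
        # Average frequency, so long sentences are not favoured automatically.
        return sum(freq[w] for w in tokens) / len(tokens) if tokens else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:n_sentences])
    return [sentences[i] for i in keep]
```

Calling summarize(document_text, 2) returns a two-sentence extract. An extract of this kind reuses source sentences verbatim, which is what distinguishes it from the abstracts targeted by the later symbolic and neural approaches.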


Venue: Escuela Politécnica Superior (EPS)
Universidad Carlos III de Madrid
Avd. Universidad, 30, 28911 Leganés, Madrid

Speaker: Karin Verspoor, PhD, Bioinformatics, University of Melbourne

Seminar "Natural Language Processing (NLP) for structuring complex biomedical texts: progress and remaining challenges"


The BioNLP community has been focused on methods for identifying and extracting key concepts and relations from highly specialised and terminology-rich texts; these texts have posed a challenge to general NLP tools as well as providing an opportunity to explore the robustness of relation extraction methods to domain-specific applications. In this talk I will present our recent studies with graph kernels and neural methods for relation extraction from the biomedical literature, present empirical work on core supporting tasks such as syntactic analysis of these texts, and discuss open challenges for work in this direction and beyond.

  • Date: 4 April 2019, 12:00
  • Venue: 3.S1.08 (Leganés Campus)

Speaker: Riza Batista-Navarro, PhD, National Centre for Text Mining (NaCTeM), University of Manchester, UK

Seminar "Clinical and Biomedical Natural Language Processing"

Abstract: This seminar explains the main challenges and techniques in text mining applied to biomedical and clinical documents. It offers an overview of resources such as corpora and annotations, and of techniques for entity recognition, coreference resolution, and extraction of relations between concepts, among others. It also presents several text mining workbenches, as well as applications to semantic search of scientific information and information extraction for data curation.

  • Date: 31 May to 2 June, 15:00-19:00
  • Venue: 31 May and 1 June: 3.1.S08, Biblioteca Rey Pastor (Leganés Campus); 2 June: 2.2.C03, Edificio Sabatini (Leganés Campus)

Speaker: Julián Moreno Schneider, DFKI (Germany)

Seminar "Digital curation technologies and its applications"

Abstract: The project "Digitale Kuratierung Technologien" (DKT), technologies for digital curation, is funded by the German Ministry of Science and Research and involves five partners: one research institution (DFKI) and four industrial partners from different sectors such as journalism and graphic design (Condat, 3PC, Kreuzwerker, Art+Com). Digital curation is a concept that is becoming very popular in research, although some research areas already refer to it by other names, such as Information Forensics or Information Analytics. The main purpose of information curation is to analyse and extract information with automatic processing tools so that users can better understand the existing content in, for example, a collection of documents. The main objective of the project is to create a platform for curating information that can help knowledge workers become familiar with large amounts of information quickly and efficiently. To achieve this, the platform receives the information, usually as documents or collections, and processes it through a sequence of natural language analysis steps. To make this process scalable and interconnectable, we use formats that are popular in the area of semantic analysis (RDF and NIF). Once the information has been processed, the partners use it for the different use cases they are responsible for: museum exhibits, journalism, etc. The seminar will give an overview of the technologies used in digital curation, as well as the challenges involved.

  • Date: 26 January 2017, 16:00-18:00
  • Venue: Room 2.1.C08

Speaker: Anders Sundnes Løvlie, Associate Professor, Gjøvik University College, Oslo, Norway.

TALK 1: "More openness, more control? The effects of terrorism on online debate forums"

Abstract: The 22 July 2011 terrorist attack in Oslo had a profound effect on Norwegian online debate. Because of the terrorist's online activities, public controversy focused on the debate systems of online newspapers, which were perceived as giving a platform to racist and extremist speakers and as cultivating a climate that was claimed to have contributed to the terrorist's motivation for the attack. On the other side of this argument are fundamental issues of freedom of expression and the democratising effect of online debate. This lecture outlines the core issues in this debate and the resulting increase in editorial control over online debate in Norway. This case has important implications for other countries hit by terrorist attacks in recent years, including Spain.

  • Date: 8 May 2013, 10:00-11:00. Venue: Room 2.1.C08.

TALK 2: "Distributed, location-based games: Playing mobile games across borders"

Abstract: This lecture outlines a vision for a new kind of mobile game. The concept of location-based games (or pervasive games) is well known: games that take place in the «real world» and are played on mobile devices, which calculate the player's position and use it to interact with other players in the same environment. This makes it possible to re-appropriate public spaces for play and exploration. But what if you could play this kind of game together with other people located in other cities? As with global protest movements such as «Occupy» (or in Spain, «los indignados» and the 15M movement), this would make it possible for people located in different cities and different countries to engage in activities together across physical boundaries. Whether those activities are political or just playful, this would be a small contribution towards the old idea of a trans-national public sphere. As an example of this vision, this lecture describes the design of a planned game for the Occupy movement, called «Occupoly».

  • Date: 8 May 2013, 11:00-12:00. Venue: Room 2.1.C08.

Workshop: "Innovation and mobile app development"

Abstract: The key to success in mobile app development is not having the best programming skills, the biggest company or the most resources – it is having the best ideas. But how do you produce good ideas? This is not a matter of being a genius with unique inspiration, but of working hard and methodically. This practical workshop will teach the participants some simple techniques for producing basic ideas and refining them critically into basic concepts for further development.

  • Date: 9 May 2013, 10:00-13:00. Venue: Room 2.1.C08.