César De Pablo Sánchez (Research lines)
In recent years, biomedicine has developed enormously. Large amounts of experimental and computational biomedical data have been generated along with new discoveries, accompanied by an exponential increase in the number of biomedical publications describing them. Biomedical professionals can no longer keep up to date with, for example, all the publications related to adverse drug reactions.
The continuing growth and diversification of the scientific literature require tremendous systematic and automated efforts to utilize the underlying information.
In the near future, tools for knowledge discovery will play a pivotal role in biomedical systems, since the overwhelming amount of biomedical knowledge in texts demands automated methods to collect, maintain and interpret it.
In particular, we are working on applying information retrieval and information extraction techniques to biomedical texts, especially to detect biomedical entities (such as drugs, genes and proteins) and biomedical associations among entities (drug-drug interactions, etc.).
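As a rough illustration of what detecting entities and associations can look like, the following Python sketch spots drug names with a tiny lexicon and proposes candidate drug-drug interactions whenever two drugs co-occur in a sentence. The lexicon, the function names and the co-occurrence heuristic are assumptions made for this example; they are a naive baseline, not the techniques used by the group.

```python
import re
from itertools import combinations

# Hypothetical toy lexicon; a real system would rely on a curated resource.
DRUG_LEXICON = {"aspirin", "warfarin", "ibuprofen"}


def find_drugs(sentence):
    """Return the lexicon drugs mentioned in a sentence, in order of first appearance."""
    found = []
    for token in re.findall(r"[a-z]+", sentence.lower()):
        if token in DRUG_LEXICON and token not in found:
            found.append(token)
    return found


def candidate_interactions(text):
    """Pair up drugs that co-occur in the same sentence (naive baseline)."""
    pairs = []
    # Split on whitespace that follows sentence-final punctuation.
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        pairs.extend(combinations(find_drugs(sentence), 2))
    return pairs


text = "Aspirin may increase the effect of warfarin. Ibuprofen was well tolerated."
print(candidate_interactions(text))  # [('aspirin', 'warfarin')]
```

Co-occurrence only proposes candidates; deciding whether a sentence actually states an interaction would need further analysis.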
The Web is the main tool for acting as citizens in the Information Society in which we are immersed. Through it we access multiple services, yet many of these services are not accessible to everyone. Accessibility barriers affect people with disabilities to a greater degree, but many other user groups are also at risk of exclusion.
The equitable use of the Web is a right for all people. Although in many countries this right is regulated by law, the data indicate that many web sites and applications are not accessible. There are important initiatives, at different levels, with the goal of designing a universal and accessible Web, but there are many obstacles on the path to achieving this goal.
As a solution to this situation, from the engineering perspective, the methodological support AWA (Accessibility for Web Applications) has been proposed. AWA provides a workspace for including the accessibility requirement in organizations devoted to web application development.
It is generally accepted in the information system (IS) field that IS quality is highly dependent on the decisions made early in the development life cycle. For this reason, the first phase of the database systems development methodology is crucial: it can affect information management, adaptability to change and the integration of information systems. Our research is based on studying conceptual models and proposing extensions to them, with the aim of integrating, validating and refining conceptual schemes for several models. In these models, integrity constraints are an important component for representing the domain semantics, and they constitute a tool for capturing all the specified requirements.
Database integrity means guaranteeing two important components: correctness and completeness. That is, the data are valid and relevant in the application domain. Data models are not complete enough to achieve database integrity because they lack mechanisms to represent all the semantics associated with a given domain. A lack of database integrity could lead to false facts being deduced. To cover this gap, our research focuses on defining methods, techniques and technologies that make the rule development process easier.
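To make the idea of integrity constraints as explicit rules more concrete, here is a minimal Python sketch that checks a set of named rules against some records and reports every violation. The records, the rule names and the `check_integrity` helper are hypothetical and chosen only for illustration; they do not represent the group's methods or any particular database system.

```python
def check_integrity(rows, rules):
    """Return a (row_index, rule_name) pair for every rule a row violates."""
    violations = []
    for index, row in enumerate(rows):
        for name, predicate in rules.items():
            if not predicate(row):
                violations.append((index, name))
    return violations


# Hypothetical domain rules that a plain data model would not capture.
RULES = {
    "salary_is_positive": lambda row: row["salary"] > 0,
    "end_not_before_start": lambda row: row["end_year"] >= row["start_year"],
}

rows = [
    {"salary": 1200, "start_year": 2005, "end_year": 2009},
    {"salary": -50, "start_year": 2010, "end_year": 2008},
]

print(check_integrity(rows, RULES))
# [(1, 'salary_is_positive'), (1, 'end_not_before_start')]
```

Keeping the rules as data separate from the checking logic means new constraints can be added without touching the checker.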
In contrast with traditional search engines, which return documents, Question Answering (QA) systems return precise, high-quality information nuggets in response to information needs posed as questions. QA systems are achieving a reasonable degree of performance for factual questions (Who?, Where?, What?, ...) whose answers can be retrieved from an open-domain information repository. Questions involving temporal restrictions, definitions, opinions or explanations are among the new challenges for QA technology.
Other issues that information access systems should address, such as multimodality, multilinguality, and user or task adaptation, are also a matter of current research. Our group works on a QA platform for Spanish that integrates the different resources available. Nevertheless, our interest also lies in tools and techniques that allow the rapid development of QA systems for other languages, even if linguistic resources are limited. For that reason we are mainly interested in Machine Learning techniques, for instance for named entity recognition. In this context, we have participated in many CLEF evaluation forums.
The Internet and Web 2.0 technologies have led to an increase in the information available in different forms. Information Retrieval techniques are one of the ways to organize and improve access to this information. In recent years, commercial software services have appeared that extract keywords and named entities (NE). These services have been integrated into many applications, and it can be expected that, with the advance of software as a service, they will become relevant to improving interoperability and semantic capabilities in the near future. The recognition and classification of named entities (NERC) is a branch of the field of Information Retrieval whose aim is to identify units of information in text and classify them into predefined categories, such as people, organizations, places, etc. The LABDA group researches entity recognition algorithms that adapt to different domains and languages, using bootstrapping techniques when no specific resources such as dictionaries or analyzers are available.
Moreover, many Natural Language Processing (NLP) applications could improve their performance substantially if they took into account the temporal dimension of the information they handle. An example is found in the field of Information Retrieval (IR): the main commercial Web search engines do not analyze the temporal information of the content explicitly, or do so only superficially, missing the underlying semantics and its potential for implementing advanced techniques for ordering, selecting and filtering results. Temporal information makes it possible to place the events of a text on a timeline and thus obtain a chronological order. A person can implicitly extract all the temporal expressions in a text and the relationships they establish between events, interpreting the point in time each expression refers to.
However, interpreting large amounts of information manually is too costly. To automate this process, the systems that carry out the reasoning must be provided with additional knowledge. We therefore work on the definition, development and evaluation of a proposal for the automated processing of the temporal information typically handled by applications that access unstructured information. We also research mechanisms for representing the temporal semantics of the documents these applications deal with, in order to improve the retrieval of relevant information. Achieving this requires an analysis both of how users formulate their information needs and of the information returned to them in response. This proposal will offer temporal management capabilities beyond those of traditional systems, providing a new perspective on temporal management and thus solving some of the existing problems.
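The idea of placing the events of a text on a timeline can be sketched in a few lines of Python. The example below recognizes only explicit dates written as "12 March 2009" and sorts the sentences that contain them chronologically. The pattern, the `timeline` function and the sample sentences are assumptions made for illustration; relative expressions such as "two weeks later" need the kind of additional knowledge described above and are not handled, and this is not the group's proposal.

```python
import re
from datetime import date

# Month names mapped to their numbers; kept explicit to avoid locale issues.
MONTHS = {
    name: number
    for number, name in enumerate(
        ["january", "february", "march", "april", "may", "june", "july",
         "august", "september", "october", "november", "december"],
        start=1,
    )
}

# Matches explicit dates such as "12 March 2009" (an assumed, simplified format).
DATE_PATTERN = re.compile(
    r"\b(\d{1,2}) (" + "|".join(MONTHS) + r") (\d{4})\b", re.IGNORECASE
)


def timeline(sentences):
    """Return (date, sentence) pairs in chronological order.

    Sentences without an explicit date are skipped.
    """
    dated = []
    for sentence in sentences:
        match = DATE_PATTERN.search(sentence)
        if match:
            day, month, year = match.groups()
            dated.append((date(int(year), MONTHS[month.lower()], int(day)), sentence))
    return sorted(dated, key=lambda pair: pair[0])


sentences = [
    "The trial ended on 3 May 2010.",
    "The drug was approved on 12 March 2009.",
    "Results were positive.",
]
for when, sentence in timeline(sentences):
    print(when.isoformat(), sentence)
```

The third sentence is dropped because it carries no explicit date; anchoring such sentences to the timeline is where temporal reasoning beyond pattern matching becomes necessary.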