The 26th UW/MS Symposium in Computational Linguistics
Location: UW, Mary Gates Hall (MGH), Room 241
Time: 3:30-5pm on February 17, 2012
Come take advantage of this opportunity to connect with the computational linguistics community at Microsoft and the University of Washington. The symposium is a regular occasion for computational linguists at both institutions to discuss topics in the field and to connect in a friendly, informal atmosphere. We will have two talks (see below), followed by informal mingling.
Speaker: Cosmin Adrian Bejan
Title: Using statistical feature selection to improve pneumonia identification
Abstract: The availability of comprehensive electronic medical records that include narrative reports provides an opportunity for natural language processing (NLP) technologies to play a major role in clinical research. One of the main advantages of employing these technologies is the automatic extraction of relevant clinical information to identify critical illness phenotypes and to facilitate clinical and translational studies of large cohorts of critically ill patients. In this talk, I will present an NLP system for the task of pneumonia identification.
Based on the information extracted from the narrative reports associated with a patient admitted to the intensive care unit, the task is to identify whether or not the patient is positive for pneumonia. I will show that an approach using statistical feature selection, in which only a small subset of informative features from the initial feature space is considered, achieves significantly better results than a baseline that uses all the features from the same feature space. The addition of a feature that extracts the assertion value of all pneumonia expressions from the clinical dataset under consideration further improves the performance of the NLP system for this task.
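As a rough illustration of what statistical feature selection can look like, the sketch below ranks features by a chi-square statistic and keeps only the top k. The abstract does not say which statistical test the system uses, so the chi-square choice, the function names, and the toy reports and labels are all assumptions made for this example.

```python
from collections import Counter


def chi_square(n11, n10, n01, n00):
    """Chi-square statistic for a 2x2 feature/label contingency table.

    n11: feature present, label positive   n10: feature present, label negative
    n01: feature absent,  label positive   n00: feature absent,  label negative
    """
    n = n11 + n10 + n01 + n00
    numerator = n * (n11 * n00 - n10 * n01) ** 2
    denominator = (n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00)
    return numerator / denominator if denominator else 0.0


def select_features(docs, labels, k):
    """Return the k features most strongly associated with the label.

    docs is a list of feature sets (one per report); labels is a list of
    booleans. Ties are broken alphabetically to keep the result stable.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    pos_counts, neg_counts = Counter(), Counter()
    for features, label in zip(docs, labels):
        (pos_counts if label else neg_counts).update(features)

    scores = {}
    for feature in set(pos_counts) | set(neg_counts):
        n11 = pos_counts[feature]
        n10 = neg_counts[feature]
        scores[feature] = chi_square(n11, n10, n_pos - n11, n_neg - n10)
    return sorted(scores, key=lambda f: (-scores[f], f))[:k]


# Hypothetical toy data: each report is reduced to a set of word features.
reports = [
    {"infiltrate", "fever", "patient"},
    {"infiltrate", "cough", "patient"},
    {"infiltrate", "patient"},
    {"fracture", "patient"},
    {"fever", "patient"},
    {"cough", "patient", "fracture"},
]
labels = [True, True, True, False, False, False]

top_features = select_features(reports, labels, k=2)
```

A classifier would then be trained on the selected features only, and compared against a baseline trained on the full feature space.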
Cosmin Adrian Bejan is a senior fellow in the Division of Biomedical and Health Informatics at the University of Washington. Prior to his current Technologies at the University of Southern California. Cosmin received his M.S. and B.S. degrees in computer science from the University of Iasi, Romania. He holds a Ph.D. degree in computer science from the University of Texas at Dallas, where he investigated natural language processing and machine learning methodologies in order to capture the semantics of the event structures that are encoded in text. His research interests are in the areas of natural language processing, biomedical informatics, and machine learning with a focus on event semantics, open-domain and clinical information extraction, and commonsense causal reasoning.
Speaker: Dominic Widdows
Title: Learning and Reasoning with Semantic Vectors
Abstract: Learning distributional similarities from text is well established, and has been used in many tasks including information retrieval, word sense disambiguation, lexical acquisition and translation. Different mathematical techniques are available, including singular value decomposition, latent Dirichlet allocation, and random projection, with different semantic and computational properties that make them appropriate for different tasks. However, for many years these methods could be used only to detect symmetric “similarity” relationships: due to the lack of syntactic or directed relationships, they have often been described as “bag-of-words” methods.
In the last few years, research has taken a big step forward in learning relationships beyond similarity. Linguistically, this is stimulated by taking word order, or deeper syntactic and semantic relationships, into account. Mathematically, it is implemented by using more sophisticated operators, such as permutations or convolutions of coordinates, to represent language composition.
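To make the role of coordinate permutation concrete, here is a minimal sketch that encodes a phrase either as a plain sum of word vectors (bag of words) or as a sum in which each word vector is first rotated by its position. The random ±1 vectors, the dimension, the rotation scheme, and the function names are assumptions chosen for illustration and are not taken from the talk.

```python
import math
import random


def random_vector(dim, rng):
    """Random vector with entries in {-1, +1}."""
    return [rng.choice((-1, 1)) for _ in range(dim)]


def permute(vec, shift):
    """Rotate the coordinates of vec to the right by shift positions."""
    shift %= len(vec)
    return vec[-shift:] + vec[:-shift] if shift else list(vec)


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))


def encode(words, lexicon, ordered):
    """Sum the word vectors; if ordered, rotate each one by its position."""
    dim = len(next(iter(lexicon.values())))
    total = [0] * dim
    for position, word in enumerate(words):
        vec = permute(lexicon[word], position) if ordered else lexicon[word]
        total = [t + v for t, v in zip(total, vec)]
    return total


rng = random.Random(0)
lexicon = {w: random_vector(1024, rng) for w in ("dog", "bites", "man")}

first = "dog bites man".split()
second = "man bites dog".split()

# Bag of words: both phrases contain the same words, so the vectors coincide.
bag_similarity = cosine(encode(first, lexicon, False), encode(second, lexicon, False))

# Permutation encoding: only "bites" occupies the same position in both phrases,
# so the two phrase vectors are clearly distinguishable.
ordered_similarity = cosine(encode(first, lexicon, True), encode(second, lexicon, True))
```

The unordered encoding cannot tell the two phrases apart, while the permuted encoding separates them because a rotated random vector is nearly orthogonal to the original.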
In this talk, we will briefly survey this area, and discuss a specific application of semantic vectors to inference in the medical domain. We use semantic vectors to learn representations for different objects and their relationships, and test this model by applying it to an analogical reasoning test, of the form “A is to X, as B is to what?”
Finding Schizophrenia’s Prozac: Emergent Relational Similarity in Predication Space. Cohen, T., Widdows, D., Schvaneveldt, R.W., Rindflesch, T. Proceedings of the Fifth International Symposium on Quantum Interaction, Aberdeen, UK, 2011.
According to results to date, these methods are both robust and computationally efficient. The combination of learning and reasoning within the same model is a challenge that applies to many disciplines, and is perhaps a key ingredient in building intelligent technology.
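The analogical test of the form “A is to X, as B is to what?” can be sketched with a very simple vector model. The example below is an assumption-laden toy, not the method from the cited paper: it uses random ±1 vectors, elementwise multiplication as the binding operator (which is its own inverse for such vectors), and placeholder concept names such as drug_a and disease_x that are purely hypothetical.

```python
import math
import random

DIM = 1024
rng = random.Random(42)


def random_vector():
    return [rng.choice((-1, 1)) for _ in range(DIM)]


def bind(a, b):
    """Elementwise product; for +/-1 vectors, bind(bind(a, b), b) == a."""
    return [x * y for x, y in zip(a, b)]


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))


# Hypothetical concepts and relations, each with a random "elemental" vector.
concepts = ["drug_a", "drug_b", "drug_c", "gene_p", "gene_q"]
relations = ["TREATS", "ASSOCIATED_WITH"]
elemental = {name: random_vector() for name in concepts + relations}

# Hypothetical (subject, relation, object) statements.
predications = [
    ("drug_a", "TREATS", "disease_x"),
    ("gene_p", "ASSOCIATED_WITH", "disease_x"),
    ("drug_b", "TREATS", "disease_y"),
    ("gene_q", "ASSOCIATED_WITH", "disease_y"),
]


def semantic_vector(obj):
    """Learned vector for obj: sum of bind(relation, subject) over its statements."""
    total = [0] * DIM
    for subject, relation, target in predications:
        if target == obj:
            bound = bind(elemental[relation], elemental[subject])
            total = [t + b for t, b in zip(total, bound)]
    return total


def analogy(a, x, b):
    """Answer 'a is to x, as b is to what?' by nearest elemental vector."""
    cue = bind(semantic_vector(a), elemental[x])   # roughly the relation linking a and x
    probe = bind(cue, semantic_vector(b))          # roughly the concept playing x's role for b
    return max(concepts, key=lambda name: cosine(probe, elemental[name]))


answer = analogy("disease_x", "drug_a", "disease_y")
```

Learning (building the semantic vectors) and reasoning (answering the analogy) happen in the same vector space here, which is the property the abstract highlights.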