Call for Participation
Unstructured Information Management Architecture (UIMA)
2nd UIMA@GSCL Workshop
October 1, 2009
Program
09:00 - 10:00 - UIMA Tutorial, Graham Wilcock
10:00 - 10:30 - Coffee Break
10:30 - 10:45 - Opening
10:45 - 11:15 - ClearTK: A Framework for Statistical Natural Language Processing, Philip V. Ogren, Philipp G. Wetzler, and Steven J. Bethard
11:15 - 11:45 - Multimedia Feature Extraction in the SAPIR Project, Aaron Kaplan, Jonathan Mamou, Francesco Gallo, and Benjamin Sznajder
11:45 - 12:15 - TextMarker: A Tool for Rule-Based Information Extraction, Peter Kluegl, Martin Atzmueller, and Frank Puppe
12:15 - 13:00 - Lunch Break
13:00 - 13:30 - LuCas – A Lucene CAS Indexer, Erik Faessler, Rico Landefeld, Katrin Tomanek, and Udo Hahn
13:30 - 14:00 - Abstracting the types away from a UIMA type system, Karin Verspoor, William Baumgartner Jr., Christophe Roeder, and Lawrence Hunter
14:00 - 14:30 - Poster Session
14:30 - 15:00 - Round Table
Posters
Annotation Interchange with XSLT, Graham Wilcock
Simplifying UIMA Component Development and Testing with Java Annotations and Dependency Injection, Christophe Roeder, Philip V. Ogren, William A. Baumgartner Jr., and Lawrence Hunter
UIMA-based Focused Crawling, Daniel Trümper, Matthias Wendt, and Christian Herta
On the Workshop
For many decades, NLP has suffered from low software engineering standards, which limited both the reusability of code and the interoperability of modules within larger NLP systems. While this did not really hamper success in limited task areas (such as implementing a parser), it caused serious problems for the emerging field of language technology, where the focus is on building complex integrated software systems, e.g., for information extraction or machine translation. This lack of integration has led to duplicated software development, workarounds for programs written in different (versions of) programming languages, and ad hoc tweaking of interfaces between modules developed at different sites.
In recent years, the Unstructured Information Management Architecture (UIMA) framework has been proposed as a middleware platform that offers integration by design through common type systems and standardized communication methods for components analysing streams of unstructured information, such as natural language. The UIMA framework offers a solid processing infrastructure that allows developers to concentrate on the implementation of the actual analytics components. An increasing number of members of the NLP community have thus adopted UIMA as a platform for creating reusable NLP components, which can be assembled to address different NLP tasks depending on their order, combination and configuration.
This workshop aims to bring together members of the NLP community who are users, developers or providers of either UIMA components or UIMA-related tools in order to explore and discuss the opportunities and challenges in using UIMA as a platform for modern, well-engineered NLP. In the context of an emerging NLP-oriented UIMA community, the challenge of creating components that are not only reusable but also interoperable is of particular interest. From a methodological perspective, interoperability relies largely on UIMA type systems. Technically, it includes issues related to the packaging and distribution of UIMA components. Tools are also important, for example to assemble complex processing workflows, to manage the bodies of data that are to be analysed, and to visualize, explore, and further deploy the analysis results. Finally, interoperability is also affected by legal issues, such as potentially incompatible licenses of components and tools.
The availability of ready-to-use components plays a major role in choosing UIMA over alternatives. To emphasize this, the workshop focuses on UIMA-based components and tools that are freely available for research.
On the Tutorial
The Unstructured Information Management Architecture (UIMA)
Graham Wilcock (University of Helsinki)
Apache UIMA (Unstructured Information Management Architecture) is a framework for large-scale annotation and analysis of texts and other modes of unstructured information. UIMA originated at IBM but is now an open source Apache Software Foundation project (http://incubator.apache.org/uima) with an active community of users and developers. Its support for standards, interoperability and scalability makes UIMA a potentially attractive framework for NLP research and for development of applications. This introductory tutorial requires no previous knowledge of UIMA.
In UIMA, annotators run in analysis engines. Pipelines of annotators run in aggregate analysis engines. New annotators are normally written in Java, as the main UIMA framework is Java. Existing tools can be used in the UIMA framework by means of Java wrappers. There is also a C++ version of UIMA that supports annotators in C++ and Python.
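The relationship between annotators, analysis engines and aggregate analysis engines can be illustrated with a minimal sketch. It is written in plain Python for brevity and does not use the UIMA API: the classes `Annotation`, `Cas`, `AnalysisEngine` and `AggregateAnalysisEngine`, the two example annotators, and the helper `run_pipeline` are all hypothetical stand-ins chosen for illustration. The sketch assumes only what is stated above: each annotator runs inside an engine, and an aggregate engine runs a pipeline of them in order over a shared analysis structure.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Annotation:
    """A typed span over the document text (hypothetical, not the UIMA class)."""
    type: str
    begin: int
    end: int


@dataclass
class Cas:
    """Toy stand-in for a shared analysis structure: text plus annotations."""
    text: str
    annotations: List[Annotation] = field(default_factory=list)

    def select(self, type_name: str) -> List[Annotation]:
        # Return a new list so callers can add annotations while iterating.
        return [a for a in self.annotations if a.type == type_name]

    def covered_text(self, annotation: Annotation) -> str:
        return self.text[annotation.begin:annotation.end]


class AnalysisEngine:
    """Wraps a single annotator (here: any function that takes a Cas)."""

    def __init__(self, annotator: Callable[[Cas], None]):
        self.annotator = annotator

    def process(self, cas: Cas) -> None:
        self.annotator(cas)


class AggregateAnalysisEngine:
    """Runs a fixed sequence of engines over the same Cas, in order."""

    def __init__(self, engines):
        self.engines = list(engines)

    def process(self, cas: Cas) -> None:
        for engine in self.engines:
            engine.process(cas)


def token_annotator(cas: Cas) -> None:
    # Marks every run of word characters as a Token.
    for match in re.finditer(r"\w+", cas.text):
        cas.annotations.append(Annotation("Token", match.start(), match.end()))


def capitalized_annotator(cas: Cas) -> None:
    # Builds on the Token annotations added earlier in the pipeline.
    for token in cas.select("Token"):
        if cas.covered_text(token)[0].isupper():
            cas.annotations.append(
                Annotation("Capitalized", token.begin, token.end))


def run_pipeline(text: str) -> Cas:
    pipeline = AggregateAnalysisEngine([
        AnalysisEngine(token_annotator),
        AnalysisEngine(capitalized_annotator),
    ])
    cas = Cas(text)
    pipeline.process(cas)
    return cas
```

Because the aggregate exposes the same `process` method as a single engine, aggregates can be nested inside other aggregates in this sketch. The order of the engines matters: the second annotator finds nothing unless the tokenizer has run first.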
OpenNLP is a well-known open source Java NLP toolkit for text annotation. The tutorial compares using the OpenNLP tools without UIMA and using them with UIMA, and clarifies the difference between a toolkit and a framework. UIMA is also compared with GATE, the existing widely-used open source Java framework for text annotation.
Organizers and Contact
Please address any inquiries regarding the workshop to:
uima.gscl2009@googlemail.com
Program Committee
- Anni R. Coden, IBM T.J. Watson Research Center, USA
- Branimir K. Boguraev, IBM T.J. Watson Research Center, USA
- Dietmar Rösner, Universität Magdeburg, Germany
- Graham Wilcock, University of Helsinki, Finland
- Iryna Gurevych, Technische Universität Darmstadt, Germany
- Katrin Tomanek, Friedrich-Schiller-Universität Jena, Germany
- Leo Ferres, University of Concepcion, Chile
- Michael Tanenblatt, IBM T.J. Watson Research Center, USA
- Nicolas Hernandez, Université de Nantes, France
- Philipp Cimiano, Delft University of Technology, Netherlands
- Richard Eckart de Castilho, Technische Universität Darmstadt, Germany
- Sophia Ananiadou, University of Manchester, Great Britain
- Stefan Geißler, TEMIS GmbH, Germany
- Udo Hahn, Friedrich-Schiller-Universität Jena, Germany