Information extraction with automatic knowledge expansion
SCIE
SCOPUS
- Authors
- Jung, H; Yi, E; Kim, D; Lee, GG
- Date Issued
- 2005-03
- Publisher
- PERGAMON-ELSEVIER SCIENCE LTD
- Abstract
- POSIE (POSTECH Information Extraction System) is an information extraction system that uses multiple learning strategies, i.e., SmL, user-oriented learning, and separate-context learning, in a question answering framework. POSIE replaces laborious annotation with automatic instance extraction by the SmL from structured Web documents, and places the user at the end of the user-oriented learning cycle. Treating information extraction as question answering simplifies the extraction procedures for a set of slots. We introduce techniques verified in the question answering framework, such as domain knowledge and instance rules, into the information extraction problem. To incrementally improve extraction performance, a sequence of user-oriented learning and separate-context learning produces context rules and generalizes them in both the learning and extraction phases. Experiments on the "continuing education" domain show that, with no user training, the F1-measure is initially 0.477 with recall 0.748. As the number of training documents grows, the F1-measure rises above 0.75 with recall 0.772. We also obtain an F-measure of about 0.9 for five out of seven slots on the "job offering" domain. © 2003 Elsevier Ltd. All rights reserved.
- Keywords
- information extraction; question answering; user-oriented learning; lexico-semantic pattern; machine learning
- URI
- https://oasis.postech.ac.kr/handle/2014.oak/24911
- DOI
- 10.1016/S0306-4573(03)00066-9
- ISSN
- 0306-4573
- Article Type
- Article
- Citation
- INFORMATION PROCESSING & MANAGEMENT, vol. 41, no. 2, pp. 217-242, 2005-03