Maciej Gawinecki
Research on abbreviation expansion
in schemata of structured and semi-structured data


Lexical annotation of schema elements can improve the effectiveness of schema matching [SB 2008]. However, it cannot be applied to schema elements that contain abbreviations. In this work we address this problem by providing a new technique for abbreviation expansion in the context of schemata of structured and semi-structured data.
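To make the problem concrete, here is a minimal sketch of what expanding abbreviations in a schema element name could look like. It is not the technique described in the publications below: the dictionary entries, the example element names, and the function names `tokenize` and `expand` are all hypothetical, and a real system would also have to choose among several candidate expansions for the same abbreviation.

```python
import re

# Hypothetical abbreviation dictionary, for illustration only.
# A real system would draw candidates from several sources.
ABBREVIATIONS = {
    "cust": "customer",
    "addr": "address",
    "qty": "quantity",
}


def tokenize(name):
    """Split a schema element name on separators and camelCase boundaries."""
    tokens = []
    for part in re.split(r"[_\-\s]+", name):
        # Acronym runs, capitalised or lowercase words, and digit runs.
        tokens.extend(re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", part))
    return [t.lower() for t in tokens]


def expand(name, abbreviations=ABBREVIATIONS):
    """Replace each token that is a known abbreviation with its long form."""
    return [abbreviations.get(token, token) for token in tokenize(name)]


if __name__ == "__main__":
    for element in ("cust_addr", "orderQty"):
        print(element, "->", " ".join(expand(element)))
```

Once a name such as `cust_addr` has been rewritten as "customer address", its tokens are ordinary dictionary words and can be passed to a lexical annotator.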
[SB 2008] Sonia Bergamaschi, Laura Po & Serena Sorrentino (2008), "Automatic annotation for mapping discovery in data integration systems", In SEBD, pp. 334-341.
Abstract: Lexical annotation is the explicit inclusion of the "meaning" of a data source element according to a lexical resource. Accuracy of semi-automatic lexical annotator tools is poor on real-world schemata due to the abundance of non-dictionary compound nouns. It follows that a large set of relationships among different schemata is discovered, including a great amount of false positive relationships. In this paper we propose a new method for the annotation of non-dictionary compound nouns, which draws its inspiration from works in the natural language disambiguation area. The method extends the lexical annotation module of the MOMIS data integration system.
BibTeX:
@inproceedings{DBLP:conf/sebd/BergamaschiPS08,
  author = {Sonia Bergamaschi and Laura Po and Serena Sorrentino},
  title = {Automatic annotation for mapping discovery in data integration systems},
  booktitle = {SEBD},
  year = {2008},
  pages = {334-341}
}

Publications

Sonia Bergamaschi, Serena Sorrentino, Maciej Gawinecki & Laura Po (2009), "Schema Normalization for Improving Schema Matching", In 28th International Conference on Conceptual Modeling (ER 2009).
BibTeX:
@inproceedings{Bergamaschi2009,
  author = {Sonia Bergamaschi and Serena Sorrentino and Maciej Gawinecki and Laura Po},
  title = {Schema Normalization for Improving Schema Matching},
  booktitle = {28th International Conference on Conceptual Modeling (ER 2009)},
  year = {2009},
  note = {(to appear)}
}
Maciej Gawinecki (2009), "Abbreviation Expansion In Lexical Annotation of Schema", In 1st International Workshop on Interoperability through Semantic Data and Service Integration.
BibTeX:
@inproceedings{Gawinecki2009,
  author = {Maciej Gawinecki},
  title = {Abbreviation Expansion In Lexical Annotation of Schema},
  booktitle = {1st International Workshop on Interoperability through Semantic Data and Service Integration},
  year = {2009},
  note = {(to appear)}
}
Maciej Gawinecki (2009), "On Selecting Online Abbreviation Dictionary". University of Modena and Reggio-Emilia, Technical Report XXX, 2009.
BibTeX:
@techreport{tr/Gawinecki2008a,
  author = {Maciej Gawinecki},
  title = {On Selecting Online Abbreviation Dictionary},
  institution = {University of Modena and Reggio-Emilia},
  year = {2009},
  number = {XXX},
  url = {http://mars.ing.unimo.it/wiki/images/1/1a/TR-AbbrDictSurvey.pdf}
}

Downloads

Our approach uses a number of sources:
The method has been tested on the following dataset:
Slides:

Acknowledgements

We use the Web service API of the great Abbreviations.com online dictionary. The solution being implemented is part of the MOMIS data integration system.

Evaluation results

Evaluation results for the proposed abbreviation expansion method: Y -- correctly expanded, N -- incorrectly expanded. Detailed results are available by switching the sheet (see the selector under the table).


last change: 2009-05-08 edited by: Maciej Gawinecki