Large-Scale Experiments for Mathematical Document Classification

Title: Large-Scale Experiments for Mathematical Document Classification
Publication Type: Conference Paper
Year of Publication: 2013
Authors: Barthel, S., S. Tönnies, and W.-T. Balke
Refereed Designation: Refereed
Conference Name: 15th International Conference on Asia-Pacific Digital Libraries (ICADL)
Date Published: 12/2013
Conference Location: Bangalore, India

The ever-increasing amount of digitally available information is both a curse and a blessing. On the one hand, users have increasingly large amounts of information at their fingertips. On the other hand, assessing and refining web search results becomes more and more tiresome and difficult for non-experts in a domain. Established digital libraries therefore offer specialized collections with a certain degree of quality. This quality can largely be attributed to the great effort invested in semantic enrichment of the provided documents, e.g. by annotating them with respect to a domain-specific taxonomy. In many domains this process is still done manually, e.g. in chemistry (CAS), medicine (MeSH), or mathematics (MSC). As the amount of data grows, however, this manual task becomes increasingly time-consuming and expensive. The only solution to this problem seems to be to employ automated classification algorithms, but it is difficult to draw conclusions for a real-world scenario from the evaluations done in previous research. We therefore conducted a large-scale feasibility study on a real-world data set from one of the biggest mathematical digital libraries, Zentralblatt MATH, with a special focus on practical applicability.

icadl13.pdf (346.56 KB)