|Full Title:||Encoding Language and Linguistic Information in Historical Corpora|
|Start Date:||08-Mar-2017 - 10-Mar-2017|
|Meeting Description:||Historical corpora have been established as an empirical digital base for various types of linguistic studies. The corpora are based on texts (sometimes images) and often require special information encodings, e.g. transcription and normalization. With respect to corpus linguistics as a method, annotating a (historical) corpus is always a matter of interpretation, either of its structure or of its content, and need not be universally consensual. Additionally, annotations have to strike a balance between a diplomatic representation of historical texts and their linguistic analysis. This requires a linguistic modelling of annotations to develop (i) annotation guidelines, standardized and customized ones, (ii) annotation concepts, such as spans, trees or graphs, (iii) annotation assignment methods, and (iv) corpus architectures. This working group would like to ask which methods of annotation have proven successful in balancing diplomatic representation and linguistic analysis in historical, corpus-linguistic studies. Additionally, we would like to learn from cases where common linguistic annotations are not sufficient for the structured exploration of historical corpus data, and where new approaches address these requirements.
This workshop would like to bring together linguists interested in and using historical corpora, corpus linguists, and computational linguists.
|Linguistic Subfield:||Computational Linguistics; Historical Linguistics; Text/Corpus Linguistics|
| This is a session of the following meeting:
39th Annual Meeting of the DGfS (Deutsche Gesellschaft für Sprachwissenschaft)