Baden Hughes, University of Melbourne

The UQ Flint Archive houses the field notes and elicitation recordings made by Elwyn Flint in the 1950s and 1960s during extensive linguistic survey work across Queensland, Australia. The linguistic fieldwork documents 54 Australian Aboriginal languages, of which approximately half are now extinct. The field notes amount to approximately 900 separate documents, including elicitation lists, phonological sketches, grammatical notes and transcriptions, all in handwritten paper format. The corresponding audio recordings, originally made on reel-to-reel tape, have been converted to more modern formats and now comprise a collection of CD-ROM media.

The primary aim of the digitization project is to provide a web-based portal through which Aboriginal languages can be explored and new research can be facilitated. Work on the digitization of the UQ Flint Archive has been carried out since 1996, using various technologies and approaches. In the last 12 months, significant progress has been made in the analysis of the technical requirements and in an overall strategy for completing the project.

The process of digitizing the contents of the UQ Flint Archive presents a number of interesting challenges in the context of EMELD. Firstly, all of the linguistic data is for endangered or extinct languages, and as such forms a valuable ethnographic repository. Secondly, the physical medium of the data is itself at risk of deterioration, and as such digitization is an important preservation task in the short to medium term. Thirdly, the adoption of open standards for encoding and presenting the text and audio of linguistic field data, whilst enabling preservation, represents a new field of research in itself where best practice has yet to be formalised. Fourthly, the provision of this linguistic data online as a new data source for future research introduces concerns of data portability and longevity.

This paper will outline the origins of the data model, the content creation components, the presentation forms based on the data model, the data capture tools and the media conversion components. It will also address some of the larger questions regarding the digitization and annotation of linguistic fieldwork, based on experience gained through work with the Flint Archive contents.