Deborah Anderson, University of California, Berkeley

Endangered Languages in Unicode, Software, Fonts and Keyboards

For linguists working on endangered languages, preserving the language is critical. A key component of the linguists' methodology should be to ensure that the texts and recordings of a given language also survive, preferably in a standardized, stable format. For text, this means using Unicode, the international character encoding standard. Using Unicode is fairly straightforward for languages that use common Latin letters that are already included in widely available fonts. However, many languages use combinations of Latin letters and IPA symbols that are not commonly available in fonts. Additionally, there are over 40 languages whose scripts are still unencoded in Unicode. How, then, can one preserve text if (a) the script or characters are missing from Unicode, or (b) there are no fonts or keyboards to use?

This talk will give a general overview of the entire process, from script encoding to fonts and keyboards: how to get endangered languages' characters and scripts into Unicode, how to submit locale data so that localized software interfaces can be created, how to inform computer companies of what a new script needs from their rendering software, and what options are available for creating Unicode-enabled fonts and keyboards.