Best Practice in a Nutshell
Here is a quick overview of best practice recommendations for archiving linguistic data:
- Make an archival copy in a format that offers LOTS (i.e., it is Lossless, Open Standard, Transparent, and Supported by multiple vendors).
  - For textual material, use plain text (txt) with XML markup, and encode the characters in Unicode.
  - For images, use TIFF grayscale format.
  - For audio, use WAV format.
  - For video, minimize compression: see Classroom: Video.
- To ensure long-term intelligibility, link terminology to a common ontology, e.g. GOLD.
- For language identification, use Ethnologue / OLAC language codes.
- Create metadata for the resource in a standard format, e.g. OLAC.
- Make the metadata available to a general search engine, e.g. the OLAC harvester (even if the resource itself is not available online).
- Place archival copies in a stable online linguistic archive that will:
  - Maintain a constant URL.
  - Migrate data to new formats.
  - Maintain clear documentation of rights, terms of use, and access restrictions.
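
As an illustration of the recommendation for textual material, here is a minimal Python sketch of storing text as plain XML encoded in Unicode (UTF-8). The element names `text` and `s`, the attributes `lang` and `n`, and the language code `xxx` are hypothetical placeholders chosen for this example, not a schema prescribed by these recommendations; a real project would use its own markup and a real language code.

```python
import io
import xml.etree.ElementTree as ET


def build_text(language_code, sentences):
    """Wrap sentences in simple XML markup (hypothetical element names)."""
    root = ET.Element("text", {"lang": language_code})
    for number, sentence in enumerate(sentences, start=1):
        s = ET.SubElement(root, "s", {"n": str(number)})
        s.text = sentence
    return root


def serialize(root):
    """Return the document as UTF-8 bytes, including an XML declaration."""
    buffer = io.BytesIO()
    ET.ElementTree(root).write(buffer, encoding="utf-8", xml_declaration=True)
    return buffer.getvalue()


def parse(data):
    """Read the sentences back from serialized bytes."""
    return [s.text for s in ET.fromstring(data).iter("s")]


if __name__ == "__main__":
    # "xxx" is a placeholder, not a real language code.
    sample = ["ŋa míŋ", "þæt wæs gōd"]
    data = serialize(build_text("xxx", sample))
    # Reading the file back should give exactly the characters that went in.
    print(parse(data) == sample)
```

Because the result is plain text with explicit markup and a declared encoding, it can be opened in any text editor and parsed by any XML library, which is the point of choosing an open and transparent format.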
For more information, see What are Best Practices? and Why Follow Best Practice?

