Linguist Start Page
Page Index
- What ethical considerations need to be made?
- What kinds of tools should I use?
- What should I collect when I collect data?
- How have other linguists done it?
What ethical considerations need to be made?
While E-MELD is not intended to be a resource about ethics, the issue is of great importance and should be a primary consideration. Please visit the ethics page for more information. Also, please browse the ethics section of the Reading Room for further resources.
What kinds of tools should I use?
Before collecting data, it is a good idea to be aware of the various methods of recording and digitizing it. For advice on what factors should be considered in selecting software, as well as information about existing software, please visit the Classroom's Software page.
Which tools are available?
For information on software and hardware tools available, visit the Toolroom.
What software should I use for specific kinds of data?
- For advice about software considerations and options specific to audio data, please visit the Classroom's Digitization of Audio Files page.
- To learn about software options and considerations for images, please visit the Classroom's Digitization of Existing Images page.
- Information about video software can be found in the Classroom's Digitization of Video Files page.
- To learn about the standard for text input, visit the Classroom's Unicode page. To learn how to store data in an enduring format, visit the Classroom's XML pages. To learn how to use stylesheets to organize the layout of data, visit the Classroom's Stylesheet page.
What should I collect when I collect data?
Do I need to collect anything besides language data?
Along with data, it is imperative to collect metadata. To learn about metadata and how to compile it, visit the Classroom's Metadata page. To learn about and use the OLAC Repository Editor (ORE), a tool for metadata creation, visit the Workroom's Metadata page.
How do I build a corpus?
Of course, a linguistic corpus is also necessary. To learn how to collect material and build a corpus, visit the Classroom's Archives page. To learn how to record this corpus in an enduring format, visit the Classroom's XML pages. To learn how to create a lexicon, visit the Workroom's Lexical Analysis and Output page.
Should I add anything to the data?
Annotating a corpus assigns meaning to the data and enables future researchers to access your insights. To learn how to annotate a corpus, visit the Classroom's Annotation page. When annotating a corpus, attention should be paid to the terminology used. To learn about terminological mapping, visit the Workroom's Terminology page. To view the General Ontology for Linguistic Description (GOLD), visit the Ontology Tree.
How have other linguists done it?
To read examples of data conversion from legacy formats to best-practice formats, visit the Case Studies. To learn about other documentation projects in the light of best practices, visit the Documentation Projects section of the Case Studies and the UNESCO Register of Good Practices in Language Preservation.
02-Feb-2006

