ODIN – Advanced Search
Presented by: William Lewis, University of Washington/CSU Fresno
Project / Software Title: ODIN – The Online Database of INterlinear text
Project / Software URL: http://www.csufresno.edu/odin/
Access / Availability: ODIN is freely available on the Web. Any data displayed by the search features in ODIN is the property of the cited authors.

The Online Database of INterlinear text (ODIN) indexes over 31,000 instances of Interlinear Glossed Text (IGT) harvested from approximately 2,000 scholarly documents found on the Web. Linguists can locate IGT for over 640 languages and can easily pull up any of the documents in which these instances were discovered. ODIN is an Open Language Archives Community (OLAC) data provider, so searches by language name and code can be performed on either the LinguistList (http://www.linguistlist.org) or the OLAC (http://www.language-archives.org/tools/search/) site. The ODIN website provides an additional search facility beyond language search: Advanced Search. Advanced Search allows the linguist to search IGT by Grammatical Concepts, by Language Family, by Linguistic Constructions or Features, or by any combination of the three. A description of each of the Advanced Search features follows:

  • Grammatical Concept search allows the linguist to search over the grammatical markup terms that are used in IGT, terms such as NOM, ACC, ERG, PST, FUT, etc. Rather than performing a simple string search, however, Grammatical Concept search normalizes the markup terms to a set of concepts, as defined in the General Ontology of Linguistic Description (GOLD, http://www.linguistics-ontology.org/). For instance, the linguist can specify the search term PastTense, and ODIN will find all instances of IGT that have past tense encoded as PAST, PST, 3SPAST, etc. (Note: Many of the term-to-concept mappings are hand-vetted, so there are gaps.) Further, the linguist can ask to look for morphemes of a particular type (e.g., prefix, suffix, proclitic, enclitic). Thus, a typical query might be "ErgativeCase and PastTense expressed as suffixes," with the resulting output being a list of IGT that satisfies the query, displayed by language.
  • Language Family search allows the linguist to reduce the search space for a given query to a specific language family, where the families used are those defined in Ethnologue (http://www.ethnologue.com).
  • Constructions/Features search extends Advanced Search by allowing the linguist to look for linguistic constructions and features that may not be explicitly encoded in IGT. By enriching and aligning the gloss and translation lines, ODIN can make guesses about constructions that may exist in the source language data. The current list of construction/feature queries follows (descriptions for only a few are provided; full information about all the queries can be found on the ODIN website):

    • Conditional – The conditional query relies on the English translation: if the English contains clauses that begin with either "if" or "when", then a conditional is likely. (Questions headed by "when" are ruled out.)
    • Coordination – This query finds coordinated structures by looking for the typical coordinators "and", "or", or "but" in the English translation.
    • Counterfactual – A small subset of Counterfactuals can be discovered in the English translation by the presence of "if" followed by a verb phrase in the subjunctive (marked with "were" or "would have"). Counterfactuals are also sometimes marked up in IGT.
    • Imperative – This query takes a fairly conservative approach: it looks for sentences in the English translation that begin with a verb (PTB tag "VB" or "VBP") or the second person pronoun "you", and end with an exclamation point ("!").
    • Multiple Quantifier
    • Multiple Wh
    • Negation
    • Passive – ODIN also looks for passive structures in the English translation, which are indicators of passive or passive-like structures in the source language. The template searched for is simply a form of "to be" followed by the past participle of the verb (tagged VBN).
    • Possessive
    • Question
    • Raising – Raising constructions are assumed if a raising verb, such as "seem" or "appear", is discovered in the English translation. (A more sophisticated search for raising and control constructions is being worked on.)
    • Reflexive Anaphor
    • Relative Clause
    • Sentential Negation
    • Wh and Quantifier
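
The translation-based heuristics described above for Conditional, Coordination, Imperative, Passive, and Raising can be illustrated with a minimal Python sketch. This is not ODIN's implementation. It assumes the English translation has already been tokenized and tagged with Penn Treebank tags by some upstream tool, and every name in it (detect_constructions, BE_FORMS, RAISING_VERBS, and so on) is hypothetical. The conditional test is simplified to the presence of "if" or "when" anywhere in the sentence rather than at a clause boundary, the passive test requires the participle to follow the form of "to be" immediately, and the list of inflected raising verbs is an assumption.

```python
from typing import List, Set, Tuple

# A translation line as (token, Penn Treebank tag) pairs, tagged upstream.
Tagged = List[Tuple[str, str]]

# Hypothetical word lists; the inflected forms are an assumption.
BE_FORMS = {"be", "am", "is", "are", "was", "were", "been", "being"}
COORDINATORS = {"and", "or", "but"}
RAISING_VERBS = {"seem", "seems", "seemed", "seeming",
                 "appear", "appears", "appeared", "appearing"}


def is_conditional(sent: Tagged) -> bool:
    """'if' or 'when' present; questions headed by 'when' are ruled out.

    Simplification: no check that the word actually begins a clause.
    """
    words = [w.lower() for w, _ in sent]
    if not words:
        return False
    when_question = words[0] == "when" and words[-1] == "?"
    return "if" in words or ("when" in words and not when_question)


def is_coordination(sent: Tagged) -> bool:
    """Any of the typical coordinators appears in the translation."""
    return any(w.lower() in COORDINATORS for w, _ in sent)


def is_imperative(sent: Tagged) -> bool:
    """Starts with a verb (VB/VBP) or 'you', and ends with '!'."""
    if not sent:
        return False
    first_word, first_tag = sent[0]
    starts_ok = first_tag in {"VB", "VBP"} or first_word.lower() == "you"
    return starts_ok and sent[-1][0] == "!"


def is_passive(sent: Tagged) -> bool:
    """A form of 'to be' immediately followed by a past participle (VBN)."""
    return any(w.lower() in BE_FORMS and next_tag == "VBN"
               for (w, _), (_, next_tag) in zip(sent, sent[1:]))


def is_raising(sent: Tagged) -> bool:
    """A raising verb such as 'seem' or 'appear' appears."""
    return any(w.lower() in RAISING_VERBS for w, _ in sent)


CHECKS = {
    "Conditional": is_conditional,
    "Coordination": is_coordination,
    "Imperative": is_imperative,
    "Passive": is_passive,
    "Raising": is_raising,
}


def detect_constructions(sent: Tagged) -> Set[str]:
    """Return the labels of every heuristic that fires on the sentence."""
    return {label for label, check in CHECKS.items() if check(sent)}


if __name__ == "__main__":
    example = [("The", "DT"), ("book", "NN"), ("was", "VBD"), ("read", "VBN"),
               ("by", "IN"), ("John", "NNP"), ("and", "CC"), ("Mary", "NNP"),
               (".", ".")]
    print(sorted(detect_constructions(example)))  # ['Coordination', 'Passive']
```

Because each heuristic inspects only the English translation, a match is a guess about the source language rather than a confirmed analysis, which is why the queries are described as conservative.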