Ru-yng Chang, Chao-jun Chen, Wan-Jung Lin, Ming-chorng Hwang, Cui-xia Weng, Academia Sinica  
Lexicon of Pre-Qin Bronze Inscriptions and Bamboo Scripts (LBB)  
Access / Availability: This is a prototype; the archive is freely available to the public on the web.
The Chinese database 漢籍全文資料庫/瀚典 has long been a convenient research tool for students of sinology. Marking up (標誌) this database and establishing lexicons for different periods will further deepen the study of philology. The data bank, which contains transmitted texts from various times and places, has laid a foundation for linguistic anchoring. However, more and more pre-Qin material is emerging from archaeological excavations, including oracle bone inscriptions from the Shang dynasty, bronze inscriptions from the Shang to the Epoch of Confucius, and bamboo scripts from the Warring States period and the Qin and Han dynasties. The quantity of newly unearthed material is substantial. Its content comes directly from each period, free of later copying errors or interpolation. It is firsthand material whose position in time and space is precise, and it is therefore the best material for the reconstruction (or construction) of linguistic anchoring.

Take the Shang and Zhou bronze inscriptions as an example. Their quantity is enormous. The Corpus of Shang-Zhou Bronze Inscriptions (《殷周金文集成》), the largest corpus of pre-Qin bronze inscriptions, contains more than 12,000 inscriptions. More than 1,700 inscribed bronzes have come to light since the publication of this corpus; they are collected in a forthcoming book by the bronze inscription research team at the Institute of History and Philology, Academia Sinica. This body of material accurately records the calendar, official system, geography, warfare, law, property exchange, bestowals, hunting, sacrifice, marriage, kinship terms, clan emblems, relations between states, and so forth. It is a faithful record of the Shang and Zhou dynasties that complements the traditional historical literature. Besides bronze inscriptions, the growing number of excavated oracle bone inscriptions from the Shang and Western Zhou dynasties and bamboo scripts from the Warring States period are equally important.

In the past, limited by technology and manpower, these ancient records were not analyzed in detail. The Bronze Inscription Research Team (BIRT) of the Institute of History and Philology, Academia Sinica (中央研究院歷史語言研究所金文資料庫工作室) gathered scholars and students of Chinese antiquity to work on these materials systematically. After several years of effort, they have established the Root Analysis of Bronze Inscription Characters database (金文字根分析資料庫) and the Bronze Inscriptions Search Engine (青銅器銘文全文隸定及檢索資料庫). Building on these existing databases, BIRT is currently working on the Lexicon of Pre-Qin Bronze Inscriptions and Bamboo Scripts (LBB), which is part of the National Taiwan Digital Archives Program. The project aims to analyze bronze inscriptions and ultimately to establish a lexicon of pre-Qin bronze inscriptions. The research team also intends to index the bamboo scripts from the Warring States period and to interconnect them with the pre-Qin bronze inscriptions and the classical literature from the pre-Qin period to the Han dynasty. The result will be the archaic section of the Chinese linguistic anchoring system.

2. The Structure and Function of the System
LBB is divided into three main sections: "Search", "Index", and "Management". In "Search", one can enter keywords or the serial numbers of bronzes in the Corpus of Shang-Zhou Bronze Inscriptions to find data. Users without any previous knowledge of bronze inscriptions and bamboo scripts can use "Index" to browse matching results by "word class" or by more detailed content categories. Related information such as "period" and "excavated location" can also be used to narrow the results. For the users’ convenience, the results are listed sequentially, and most of them are annotated. For instance, clicking a personal name leads to the "Pre-Qin Biographical Database" (先秦人物志), where cross-examination makes the identities of the persons and the social relations between them clear. Toponyms of various kinds, such as "name of place", "name of nation", and "place of excavation", are linked to a Geographic Information System (GIS), which gives users a general idea of the geographical information. Users interested in the sources of the lexicon entries (the metadata of the bronze inscriptions and bronzes) can click the serial number of a bronze to reach the "Corpus of Bronze Inscription and Bronze" (殷周金文暨青銅器資料庫), where more information can be found.
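
The difference between the "Search" and "Index" views described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the record fields, the function names `search` and `browse`, and all sample values are hypothetical placeholders, not the actual LBB schema or data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LexiconEntry:
    term: str           # transcribed word or phrase (placeholder values below)
    word_class: str     # category used by the "Index" view
    corpus_serial: str  # serial number of the bronze in the corpus
    period: str
    excavated_at: str


# Placeholder records, invented for illustration only.
ENTRIES = [
    LexiconEntry("term-A", "personal name", "00001", "period-1", "site-X"),
    LexiconEntry("term-B", "place name", "00001", "period-1", "site-X"),
    LexiconEntry("term-A", "personal name", "00002", "period-2", "site-Y"),
    LexiconEntry("term-C", "official title", "00003", "period-2", "site-X"),
]


def _in_order(hits):
    # Results are listed sequentially: by bronze serial number, then term.
    return sorted(hits, key=lambda e: (e.corpus_serial, e.term))


def search(entries, keyword=None, corpus_serial=None):
    """'Search' view: match by keyword and/or bronze serial number."""
    return _in_order(
        e for e in entries
        if (keyword is None or keyword in e.term)
        and (corpus_serial is None or e.corpus_serial == corpus_serial)
    )


def browse(entries, **facets):
    """'Index' view: narrow by word_class, period, excavated_at, and so on."""
    return _in_order(
        e for e in entries
        if all(getattr(e, name) == value for name, value in facets.items())
    )
```

For example, `search(ENTRIES, corpus_serial="00001")` returns every entry attested on that bronze, while `browse(ENTRIES, period="period-2", excavated_at="site-X")` needs no keyword at all, which is the point of the "Index" view for users new to the material.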

The "Management" section allows system librarians to maintain, extend, and update the whole system. It is divided into "Single Lexicon Data", "Word-Class and Content Category", and "Dictionary Management". The dictionaries include the "Pre-Qin Biographical Database" and "The Annotation of the Origins of People History". This section enables the librarians to add, delete, and modify records through the web. It also lets them batch-modify lexicon records under specific circumstances. Because the metadata of the bronzes comes from the "Corpus of Bronze Inscription and Bronze", the system librarians can, in order to maintain consistency and speed up searching, batch-match the toponyms against the GIS and record the results in advance. Figure 1 shows the structure of the system.
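
The idea of matching toponyms in batch and recording the results in advance can be sketched as below. This is a minimal illustration, not the LBB implementation: `batch_resolve`, `fake_gis_lookup`, the gazetteer, and the coordinates are all hypothetical stand-ins for whatever GIS service the system actually uses.

```python
from typing import Callable, Dict, Iterable, Optional, Tuple

Coordinates = Tuple[float, float]  # (latitude, longitude)


def batch_resolve(
    toponyms: Iterable[str],
    lookup: Callable[[str], Optional[Coordinates]],
    cache: Dict[str, Optional[Coordinates]],
) -> Dict[str, Optional[Coordinates]]:
    """Resolve each distinct toponym once and record the result in `cache`."""
    for name in toponyms:
        if name not in cache:
            cache[name] = lookup(name)  # None marks "not found in the GIS"
    return cache


# Hypothetical stand-in for a GIS service; the coordinates are placeholders.
GAZETTEER = {"place-A": (1.0, 2.0), "place-B": (3.0, 4.0)}
calls = []  # records which names actually reached the lookup


def fake_gis_lookup(name: str) -> Optional[Coordinates]:
    calls.append(name)
    return GAZETTEER.get(name)


prerecorded = batch_resolve(
    ["place-A", "place-B", "place-A", "place-C"], fake_gis_lookup, {}
)
```

Because the lookup runs once per distinct toponym, a search at query time only has to read `prerecorded`, and every record that mentions the same place receives the same coordinates.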
Figure 1. The structure of LBB.

3. The Anticipated Difficulties and Future Development
Formal variation is an important characteristic of archaic Chinese characters; it was common before the Qin dynasty unified the writing system. The position and the number of the particles (部件), the structural units of a Chinese character, vary. For instance, in the Corpus of Shang-Zhou Bronze Inscriptions the word "bao 寶" appears about 3,000 times and is written in about 40 different ways (see Figure 2). Representing the bao character of a particular inscription is a great headache for students of Chinese antiquity. In the past, there were five ways to deal with a missing character (缺字): using a black dot (e.g. ●) to represent it, leaving a blank and writing it by hand on printouts, creating personal font sets of missing characters, adding or omitting particles on system font characters (e.g. 吉加吉=「」), and drawing it with paint software.


Figure 2. Different ways of writing 寶 in the Corpus of Shang-Zhou Bronze Inscriptions.

The Chinese Document Processing Laboratory (CDPL) of the Institute of Information Science, Academia Sinica, has developed a new tool for dealing with missing characters in its "Intelligent Character Encoding" (智慧型文字編碼) project. Given the particles (from a normal font set or an additional particle set) and signs that specify the position of each particle in a compound character (i.e. the missing character), software constructs the missing character automatically. Users can browse and search for missing characters in a regular internet environment and use Microsoft Office to edit them. In addition, the CDPL provides the "Glyph Database User Interface" (GDUI, 漢字構形資料庫操作界面), which gives users access to materials such as Xiao Zhuan (standardized seal script) and the database of missing characters.

In cooperation with CDPL, LBB also tries to display missing bronze inscription characters through the Intelligent Character Encoding system. Representing bronze inscription characters is a completely different problem from representing Xiao Zhuan characters. A single character in Xiao Zhuan or in later Chinese is written differently in different bronze inscriptions, so one character has more than one logograph in a bronze inscription font (or fonts). However, the same analytical method can be applied, because the principle of character construction is the same for later Chinese characters and bronze inscription characters. Most bronze inscription characters can be analyzed and represented as particles and relational signs once some new particles are added. For a small number of characters, only part of the character is recognizable. In GDUI, the recognizable part will be used as a particle and as a means of finding the character. Finally, there are characters that are completely unrecognizable. The only way to find these characters is to consult the appendix, which will include all unrecognizable characters arranged by number of strokes.
The LBB staff are willing to share the results of their laborious work with scholars around the world. The databases and tools developed by the team will shorten research time and make pre-Qin materials accessible to the general public.
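
The three cases above (fully analyzable, partly recognizable, and unrecognizable characters) can be sketched as a small data structure. This is an illustration under assumptions, not the CDPL encoding: the Unicode Ideographic Description Characters ⿰ and ⿱ merely stand in for the project's own position signs, the side-by-side layout of the 吉加吉 example is assumed, and the glyph identifiers, the placeholder character "X", the particles "a" and "b", and the stroke counts of the unrecognized glyphs are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple, Union

# Stand-ins for position signs (assumption; the CDPL notation may differ).
LEFT_RIGHT = "\u2ff0"  # ⿰
TOP_BOTTOM = "\u2ff1"  # ⿱
UNKNOWN = "?"          # an unrecognizable part of a character

# A structure is a single particle, or (sign, first part, second part).
Node = Union[str, Tuple[str, "Node", "Node"]]


def particles(node: Node) -> List[str]:
    """Return the leaf particles of a structure, in reading order."""
    if isinstance(node, str):
        return [node]
    _sign, first, second = node
    return particles(first) + particles(second)


def to_sequence(node: Node) -> str:
    """Serialize a structure as sign-first text, e.g. ⿰吉吉."""
    if isinstance(node, str):
        return node
    sign, first, second = node
    return sign + to_sequence(first) + to_sequence(second)


@dataclass
class Glyph:
    glyph_id: str              # hypothetical identifier
    modern: Optional[str]      # later standard character, if identified
    structure: Optional[Node]  # None when nothing is recognizable
    strokes: int


GLYPHS = [
    Glyph("g1", None, (LEFT_RIGHT, "吉", "吉"), 12),   # 吉 plus 吉
    Glyph("g2", "X", (TOP_BOTTOM, "a", "b"), 9),       # two written forms
    Glyph("g3", "X", (LEFT_RIGHT, "b", "a"), 9),       # of one character
    Glyph("g4", None, (LEFT_RIGHT, "a", UNKNOWN), 7),  # partly recognizable
    Glyph("g5", None, None, 5),                        # unrecognizable
    Glyph("g6", None, None, 3),                        # unrecognizable
]


def variants_of(glyphs, modern):
    """All written forms recorded for one later standard character."""
    return [g.glyph_id for g in glyphs if g.modern == modern]


def find_by_particle(glyphs, particle):
    """Find glyphs through any recognizable particle they contain."""
    return [g.glyph_id for g in glyphs
            if g.structure is not None and particle in particles(g.structure)]


def unrecognizable_appendix(glyphs):
    """Unrecognizable glyphs arranged by number of strokes."""
    unknown = [g for g in glyphs if g.structure is None]
    return [g.glyph_id for g in sorted(unknown, key=lambda g: g.strokes)]
```

In this sketch `find_by_particle(GLYPHS, "a")` also returns the partly recognizable glyph `g4`, which mirrors the use of the recognizable part as a search key, while glyphs with no structure at all can only be reached through `unrecognizable_appendix`.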

