THL Database Technologies And Models

Contributor(s): THL Staff

This page provides access to technical documentation for the databases THL uses to store data of various types: the dictionary, the image database, and so forth. The underlying technologies include the following types of databases. The links immediately below give general information on these databases as we use them here; the links further down concern specific applications:

THL Databases and Database Management

General Bibliographical Management

For general, relatively shallow bibliographies of all types of print or non-print objects, whether English, Tibetan, or otherwise, we use a MySQL database with a PHP interface that is part of the SPT (Scout Portal Toolkit) application. SPT is a system built by the University of Wisconsin to make bibliographies of websites. In collaboration with the SPT group, we adapted it for use in making bibliographies of print materials and digital resources in addition to websites. SPT has extensive facilities for workflow management and online editing.

The documentation includes:

MySQL data structure (PDF)
Editorial Manual (published in XML)

If the online end-user bibliography web page displays an error saying that a table is corrupt, run these commands (according to Andres Montano, July 2015):

ssh sds-deployer@sdsv5.its.virginia.edu
mysql -u rubyuser
use thl_scoutportal2;
repair table APSessionData;
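
The interactive session above can also be scripted. The following is a minimal Python sketch, not THL's own tooling: it only assembles `mysql` client invocations using the client's standard `-u` and `-e` options, and it adds a `CHECK TABLE` step before `REPAIR TABLE` as an assumption about what a cautious operator might want. The helper names (`repair_commands`, `run`) are hypothetical, and the commands would still have to be run on the database host after the `ssh` step. Note that `REPAIR TABLE` works only for certain storage engines, such as MyISAM.

```python
import subprocess


def repair_commands(database, table, user):
    """Build mysql client invocations that check a table, then repair it.

    Hypothetical helper: the CHECK TABLE step is an added assumption,
    not part of the original instructions.
    """
    # Reject anything that is not a plain identifier before building SQL.
    if not table.isidentifier():
        raise ValueError(f"unexpected table name: {table!r}")
    base = ["mysql", "-u", user, database, "-e"]
    return [
        base + [f"CHECK TABLE {table};"],
        base + [f"REPAIR TABLE {table};"],
    ]


def run(commands, dry_run=True):
    """Print each command, or execute it when dry_run is False."""
    for cmd in commands:
        if dry_run:
            print(cmd)
        else:
            subprocess.run(cmd, check=True)


# Names taken from the commands above; the default dry run only prints.
run(repair_commands("thl_scoutportal2", "APSessionData", "rubyuser"))
```

Keeping the dry run as the default means nothing touches the database until `dry_run=False` is passed explicitly.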

Literary Cataloging

For deep cataloging of Tibetan literature, we use a TEI-based XML DTD with a Tibet-specific modification.

The documentation includes:

TEI-based DTDs
Tibbl annotated DTD
Outline of data structure (Microsoft Word document)
Manual of how to fill in an entry form

Outlines of Texts (sa bcad)

Classical Tibetan literature is often marked by detailed internal structures, which are frequently extracted as “outlines” of the content. For making such outlines of Tibetan texts, we use straight TEI-based XML markup. The “DIV” feature in TEI is used to render the nested levels. We have a manual for how to use Word Styles to create such an outline, and a Visual Basic macro to transform those Word documents into valid TEI documents.
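
The nesting of outline levels into TEI divisions can be illustrated with a short Python sketch. It is not the Visual Basic macro itself, which is not reproduced here; it only assumes that each outline entry carries a numeric level (as a Word heading style would) and shows how such levels map onto nested `div` elements with `head` children. The function name `outline_to_tei` and the sample headings are hypothetical, and the output is a bare `body` fragment rather than a complete, valid TEI document.

```python
import xml.etree.ElementTree as ET


def outline_to_tei(entries):
    """Nest (level, heading) pairs into <div> elements under a <body>.

    `entries` is a list of (level, heading) tuples, where level 1 is the
    outermost section. Raises ValueError if the outline skips a level.
    """
    body = ET.Element("body")
    # stack[i] is the open container at depth i; stack[0] is <body>.
    stack = [body]
    for level, heading in entries:
        if level < 1 or level > len(stack):
            raise ValueError(f"outline skips a level at {heading!r}")
        # Close any open divs at this depth or deeper.
        del stack[level:]
        div = ET.SubElement(stack[-1], "div")
        ET.SubElement(div, "head").text = heading
        stack.append(div)
    return body


# Hypothetical sample outline, for illustration only.
sample = [
    (1, "Introduction"),
    (2, "Homage"),
    (2, "Promise to compose"),
    (1, "Main text"),
]
print(ET.tostring(outline_to_tei(sample), encoding="unicode"))
```

The stack mirrors the currently open divisions, so an entry at level n always attaches to the most recent entry at level n-1.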

The documentation includes:

Tibetan Text Markup

For marking up Tibetan e-texts, we use straight TEI for XML markup. We have a manual for how to use Word Styles to create such markup, and a Visual Basic macro to transform those Word documents into valid TEI documents.

The documentation includes:

General Scholarly Essay Markup

We use straight TEI for XML markup of scholarly essays in general. We have a manual for how to use Word Styles to create such markup, and a Visual Basic macro to transform those Word documents into valid TEI documents.

The documentation includes:

Gazetteer

The Gazetteer of Tibet and the Himalayas is being completely re-engineered; see the Gazetteer technical documentation.

Monasteries

Monasteries are the most important extension of our place studies into the rich descriptive data of a Place Encyclopedia. At present, we have implemented this only in a FileMaker Pro database, which simply adds fields to the basic Gazetteer template, for workflow purposes. This data is likely to be exported into XML, perhaps with EAC markup (see Biographical Encyclopedia for details on EAC).

The documentation includes:

FileMaker Pro Data Structure (PDF)
Editor’s Manual

Biographical Encyclopedia

We started out with a simple Word outline of our data structure and are now in the process of mapping this into XML using a new DTD called “Encoded Archival Context” (EAC). That process is not complete.

The documentation includes:

Word outline of structure
Documentation on EAC
Mapping of EAC for our use

Organizations including transmissional lineages

We are looking at using EAC, but there is nothing to report at present.

Not available

Events

At present, we have implemented this only in a FileMaker Pro database for workflow purposes. This data is likely to be exported into XML, but that has not been decided yet.

The documentation includes:

FileMaker Pro Data Structure (PDF)
Editor’s Manual

Tibetan Dictionary

Our rich historical dictionary of the Tibetan language is a MySQL database with a sophisticated JSP-based front end. It is used for presentation as well as extensive online editing, with individual metadata for each submission. It is where we keep studies of words, including rich linguistic analysis and passages attesting to their historical use. We are still working out how to relate this to the Gazetteer, Biographical Encyclopedia, and other specialized resources. Our basic goal is for a search of the dictionary to include all of those, so that users can seamlessly find out, for example, whether there is a Gazetteer entry for a place name.

The documentation includes:

Brief description of technologies used
MySQL data structure (PDF)

Thangmi and Nepali Dictionaries

Our simple Thangmi and Nepali dictionaries are both MySQL databases with PHP interfaces. Both are purely for presentation and lack any online editing interface.

MySQL data structure (PDF)

Audio-Video Database

Our audio-video database is currently a MySQL database with a PHP interface. It is used for presentation but also has an extensive workflow and online management interface. Our current plans are to begin creating XML-based collections using the GDMS DTD developed at the University of Virginia Library.

MySQL data structure
GDMS annotated DTD

Transcript Database

Many of the audio-video files in the Audio-Video Database have been transcribed using XML markup based on a simple linguistic DTD created by Michel Jacobson of CNRS-LACITO. However, we have not yet indexed them or set up a search interface.

DTD for Transcript markup
Annotated DTD for Transcript markup

Image Database

Our image database is currently a MySQL database with a PHP interface. It is used for presentation and has no online editing interface. Our current plans are to begin creating XML-based collections using the GDMS DTD developed at the University of Virginia Library.

MS Word document outlining data structure
Documentation of FileMaker Pro Image Database input form (PDF)
GDMS annotated DTD

This page is provided courtesy of the Tibetan and Himalayan Library.