THL Essays XML Mark-up: File Names, DTD, and TEIHeader

Contributor(s): Nathaniel Grove

This page describes the basic mark-up for a new THL essay, which is done in our modified version of TEI. The latest DTD is called xtib3.dtd. This DTD must be declared at the top of the document, and the declaration must include an entity declaration for external links plus entities for each image. In particular, this page covers the location of files, the DTD declaration, and the TEI header. For more specific essay mark-up of divs, images, and links, see THL Essays XML Mark-up Particulars.

Name and Location of Essay Files

All XML files are found in the TDL trunk/texts/cocoon/xml in a folder named for the first letter of the author's last name. Thus, an essay by Jose Cabezon goes in trunk/texts/cocoon/xml/c/. If the folder has not yet been created, then you will need to create it.

The essay should be named as follows: {last name of author}-{descriptive title}.xml. Thus, there is cabezon-sera-monks.xml and cabezon-sera-herm-intro.xml. In the Cocoon call for these essays, the dashes are replaced by slashes and the .xml extension is dropped, as for example:

#essay=/cabezon/sera/herm/intro

There can be up to four dashes after the author's name. The link above will work from any TDL page and does not need any docroot prefixed before the hash symbol (#). The above link goes to a particular TDL page only by way of example.

DTD Declaration

The top of each essay must have a DTD declaration. Within this declaration there must be a reference to the external link entity file called "external-links.dtd". This contains all links to sites outside of TDL. An internal link entity file ("internal-links.dtd"), for links within TDL, is also included here. Both files are always found one folder above where the essays are located. If an external glossary is used, then an entity declaring the file that contains the glossary is also included. This file should reside in a subfolder called "glossaries" of the folder that contains the XML essay file. An example DTD declaration is:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE TEI.2 SYSTEM "http://www.thdl.org/global/xml/dtds/xtib3.dtd" [
    <!ENTITY % extlinks SYSTEM "../external-links.dtd" >
    %extlinks;

    <!ENTITY % intlinks SYSTEM "../internal-links.dtd" >
    %intlinks;

    <!ENTITY glossary SYSTEM "glossaries/cabezon-sera-herm-gloss.xml">
]>

TEI Header

The TEI header, <teiHeader>, contains the metadata for the essay. For TDL use, all the relevant data is found within the <fileDesc> element, while language data, which is shared by all essays, is found in the <profileDesc> element. In both instances, common shared mark-up is imported from a separate entity file so that it can be changed or enhanced globally for all essays.

The FileDesc Element

The first element in the teiHeader is the <fileDesc> element. It contains three parts:

  1. Title Statement
  2. Publication Statement
  3. Source Description

These will be described below and the whole section will be followed by an example of a complete <fileDesc> section of XML.
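
As a rough orientation, the overall shape of the header might be sketched as follows. This is only an outline under assumptions: the comments stand in for content described in the individual sections, and the contents of <profileDesc> are not shown because the name of the shared entity is not given on this page.

```xml
<teiHeader>
   <fileDesc>
      <titleStmt>
         <!-- <title> element(s), then <author> with its <name>(s) and a <date> -->
      </titleStmt>
      <publicationStmt>
         <!-- <idno>, <respStmt>(s), &thdlpubinfo;, <date>, <availability> -->
      </publicationStmt>
      <sourceDesc>
         <!-- a single <p> describing the source -->
      </sourceDesc>
   </fileDesc>
   <profileDesc>
      <!-- shared language data, imported from a separate entity file -->
   </profileDesc>
</teiHeader>
```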

Title Statement

The title statement is found in a <titleStmt> element. It should contain <title> and <author> elements, with a <date> placed inside the <author> element.

The <title> tag should have its lang attribute set to "eng" (or whatever the language is). Its level attribute should be set to "a" for article, and the first title should have its type set to "full". Should the article have a particularly long title that does not fit in headers, a second <title> element with type="brief" can be added containing an abbreviated version of the title.
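
For instance, a long title paired with an abbreviated one might look like the following sketch. Both titles are invented for illustration and do not come from an actual THL essay.

```xml
<titleStmt>
   <!-- Full title: hypothetical wording, for illustration only -->
   <title lang="eng" level="a" type="full">A Hypothetical Survey of the Hermitages, Their Histories, and Their Present Condition</title>
   <!-- Abbreviated title for headers where the full title does not fit -->
   <title lang="eng" level="a" type="brief">Hypothetical Survey</title>
   <!-- the <author> element goes here -->
</titleStmt>
```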

The <author> element should contain a <name> element for each author of the article. Thus, a multi-authored article will have one <author> element with several <name> elements within it. Each <name> element can have an id attribute set to the author's unique three initials, a key attribute, and a reg attribute. If the reg attribute is set to "thdlparticipant", then the key attribute can hold that person's TDL ID number (e.g., "per1190"). This will automatically cause the author's name to be linked to the TDL Participant page for that person. After all the author <name> elements, and still within the <author> element, a <date> element should be included for the date to be displayed with the article, usually the date written. All dates should be in the format YYYY-MM-DD.
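
An <author> element for a multi-authored article might be marked up along the lines of the sketch below. The names, initials, and the participant ID "per0000" are placeholders, not real THL records. The second author is shown without key and reg on the assumption that these attributes are simply omitted for someone who has no participant ID.

```xml
<author>
   <!-- Participant with a TDL ID: reg="thdlparticipant" plus a key (placeholder ID) -->
   <name id="fma" key="per0000" reg="thdlparticipant">First M. Author</name>
   <!-- No participant ID, so key and reg are omitted (placeholder name) -->
   <name id="sna">Second N. Author</name>
   <!-- Date displayed with the article, in YYYY-MM-DD format -->
   <date>2008-01-16</date>
</author>
```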

Publication Statement

The publication statement is found in the <publicationStmt> element. This contains the following:

  • id: This is in an <idno type="thdl"> element and is assigned a number using the automatic PID generator.
  • Responsibility statements: There can be several of these in <respStmt> tags. Each contains a <resp> that describes what the person did (e.g., "Mark-up", "Phonetics", "Proofing"), followed by a <name>. The same <name> attributes (id, key, and reg) can be applied here as described above for the author. Note: In order to "turn on" the phonetics viewing option, there has to be a <respStmt> with a <resp>Phonetics</resp>.
  • An entity reference for publication information: This is simply "&thdlpubinfo;" following the last <respStmt>. It pulls in a preformatted set of XML tags found in the external-links.dtd document. Should any of the THL publication information need to be changed, it can be changed in that document and the changes will cascade throughout all essays.
  • Date of Publication: This is a simple <date> element, following the publication information entity above, with the date of publication in YYYY-MM-DD format.
  • Availability: This section, found in an <availability> element, contains the breadcrumb links for the essay, locating it logically in the THL site. It contains a paragraph (<p>) followed by a <bibl> with a type="thdlloc". In that <bibl>, there is a series of <xref>s that link to the different ancestor pages for the essay in the THL site. The first <xref> is the highest ancestor or portal page; the last <xref> is the immediate parent of the page. If one or more of these <xref>s has a rend attribute set to "home-link", that link or those links will appear above the TOC as a back-to-home link. The doc attribute of each <xref> should hold the name of an entity, declared in the entity document, that contains the link to that THL page. See the link section below for more detail.

Source Description

The source description is a required element in the TEI header's <fileDesc> element. The <sourceDesc> in this case contains a single paragraph with a prose description of the source. For born-digital essays, use "Essay written for digital publication on THL." Other descriptions can be used to fit other occasions.
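
As an illustration of another occasion, an essay adapted from an earlier print publication might carry a description like the one sketched here. The wording is an assumption made for this example, not a prescribed THL formula.

```xml
<sourceDesc>
   <!-- Hypothetical wording for an essay that first appeared in print -->
   <p>Essay revised for digital publication on THL from an earlier print version.</p>
</sourceDesc>
```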

Example File Description

The following is an example of a relatively complete <fileDesc> element from José Cabezón's essay, "Monks at Sera":

<fileDesc>
   <titleStmt>
      <title lang="eng" level="a" type="full">Monks at Sera</title>
      <author><name id="jic" key="per1190" reg="thdlparticipant">José Ignacio Cabezón</name> 
      <date>2004-03-01</date></author>
   </titleStmt>
   
   <publicationStmt>
      <idno type="thdl">thdl-essay:3</idno>
      <respStmt>
         <resp>Mark-up</resp>
         <name>Wilson Chen</name>
      </respStmt> 
      <respStmt>
         <resp>Edited Markup for Cocoon Delivery</resp>
         <name id="ndg" key="per1190" reg="thdlparticipant">Than Grove <date>2008-01-16</date></name>
      </respStmt>
      <respStmt>
         <resp>Phonetics</resp>
         <name id="snw" key="per2591" reg="thdlparticipant">Steve Weinberger <date>2008-02-02</date></name>
      </respStmt>
      &thdlpubinfo; 
      <date>2004-03-01</date>
      <availability>
         <p>
            <bibl n="thdlloc">
               <xref doc="thdl-places">Places</xref>
               <xref doc="thdl-mons">Monasteries</xref>
               <xref doc="thdl-sera">Sera Monastery Home</xref>
               <xref doc="thdl-sera-people" rend="home-link">Sera People Home</xref>
            </bibl>
         </p>
      </availability>
   </publicationStmt>	

   <sourceDesc>
      <p>Essay written for digital publication on THL.</p>
   </sourceDesc>
</fileDesc>