Markup of an XML Book

Contributor(s): Than Grove

XML books raise many markup issues. As with all THL XML, the basic markup is defined by our custom TEI DTD, xtib3.dtd, which is explained elsewhere. To display a book, the XML is transformed into HTML and embedded in an HTML page through AJAX. This complicated process is described in How an XML Book Works. This page documents various markup issues related to XML books in THL.

There are three types of XML books in the THL system. In order of their development, they are:

  1. A collection of essays, e.g. Studies in Genre (http://www.thlib.org/encyclopedias/literary/genres/genres-book.php#!book=/studies-in-genres/a2/). This is a book made up of several essays, where each essay is represented by its own XML document.
  2. A Single Volume Monograph, e.g. Chöpel's Lhokha Néyik (http://local.thlib.org/places/monasteries/publications/chosphel-book.php#!book=/lho-kha-gnas-yig/wb/a1/). (Don't argue! It's a single volume book, even though THL publishes two other related books by him.)
  3. A Multi-Volume Work, e.g. the three volumes of the Kham Monastery book (under development).

Each instance of one of these types has a dedicated folder under /cocoon/books/xml/. The three examples given are found at:

  1. /cocoon/books/xml/studies-in-genres
  2. /cocoon/books/xml/lho-kha-gnas-yig
  3. /cocoon/books/xml/kham-monasteries (development only)

The name of this folder is recorded both in the XML document itself and in the PHP page, where it appears in the JavaScript that forms part of $add_header. The central XML document for each book is the main.xml document found in the book's XML folder. In the case of types 1 and 3, this document is merely a reference document that contains metadata and pointers to other documents that represent either individual articles or separate volumes. All of these secondary XML documents also reside in the same folder and are referenced by their unique file names. In the case of type 2, the main.xml document contains not only the metadata but also the "body" of the text. (Here "body" is used in a loose sense to mean the whole text, including <front>, <body>, and <back>.)
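
The folder layout described above can be sketched in JavaScript. This is only an illustration: the helper name bookXmlPaths and the secondary file name used in the example are hypothetical, and only the /cocoon/books/xml/ root and the main.xml name come from this page.

```javascript
// Hypothetical helper, not part of THL's code base.
// The root folder and the "main.xml" name are taken from the text above.
const XML_ROOT = "/cocoon/books/xml/";

function bookXmlPaths(bookName, secondaryFiles = []) {
  const folder = XML_ROOT + bookName + "/";
  return {
    folder,
    // Central document: metadata plus pointers (types 1 and 3) or the whole text (type 2).
    main: folder + "main.xml",
    // Secondary documents live in the same folder and are referenced by file name.
    secondary: secondaryFiles.map((file) => folder + file),
  };
}

// "essay-one.xml" is an invented file name used only for illustration.
console.log(bookXmlPaths("studies-in-genres", ["essay-one.xml"]));
```

For a monograph such as lho-kha-gnas-yig, the list of secondary files would simply be empty, since main.xml holds the whole text.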

Pages and Pipelines

Each book has its own PHP/HTML file that creates a home page for that book. These files define the header and page shell for the book and situate it in the appropriate place within the THL website. They also record the name of the book's XML folder so that the JavaScript can construct the Cocoon pipeline call used to fill the body of the page. The PHP files for the three examples and their corresponding Cocoon URLs are:

Type | PHP Page URL | Cocoon Pipeline
1 | http://www.thlib.org/encyclopedias/literary/genres/genres-book.php#!book=/studies-in-genres/a2/ | http://localhost:8080/cocoon/texts/books/studies-in-genres/a2/
2 | http://www.thlib.org/places/monasteries/publications/chosphel-book.php#!book=/lho-kha-gnas-yig/wb/a1/ | http://localhost:8080/cocoon/texts/books/lho-kha-gnas-yig/wb/a1/
3 | dev.thlib.org/places/monasteries/publications/kham-monasteries.php#!book=/kham-monasteries/v1/a2/ | http://localhost:8080/cocoon/texts/books/kham-monasteries/v1/a2/?v=p
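
The correspondence between the two URL columns can be illustrated with a short JavaScript sketch. THL's actual book.js is not reproduced on this page, so the function name cocoonUrlFromHash is hypothetical, and the sketch assumes the path after "#!book=" is appended unchanged to the Cocoon base URL. The "?v=p" parameter seen in the third row is not modeled.

```javascript
// Hypothetical sketch; THL's actual book.js is not shown on this page.
// Base URL as it appears in the Cocoon Pipeline column above.
const COCOON_BASE = "http://localhost:8080/cocoon/texts/books";

function cocoonUrlFromHash(hash) {
  const prefix = "#!book=";
  if (!hash.startsWith(prefix)) {
    return null; // the page URL does not request a book section
  }
  const path = hash.slice(prefix.length); // e.g. "/studies-in-genres/a2/"
  return COCOON_BASE + path;
}

console.log(cocoonUrlFromHash("#!book=/studies-in-genres/a2/"));
// http://localhost:8080/cocoon/texts/books/studies-in-genres/a2/
```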

Identifying the Book

Because of the complex interplay between XML, XSLT, HTML, PHP, and JS that is required to display any section of a book on the fly using AJAX, the path to the book needs to be identified in the XML and in the PHP. In the XML this is done through an <idno> tag with type of "bkname" placed in the <publicationStmt> element:

<publicationStmt>
    <idno type="bkname">lha-sa-gnas-yig</idno>
    ....
</publicationStmt>
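
As a rough illustration of how the book name could be read back out of such a document, here is a JavaScript sketch. The function name extractBookName is hypothetical, and a regular expression is used only to keep the example dependency-free; a real implementation would more likely use an XML parser or XSLT.

```javascript
// Hypothetical helper, not THL code. A regular expression keeps the sketch
// free of dependencies; it assumes type="bkname" is the only attribute on <idno>.
function extractBookName(xml) {
  const match = xml.match(/<idno\s+type="bkname"\s*>([^<]+)<\/idno>/);
  return match ? match[1].trim() : null;
}

const sample = `<publicationStmt>
    <idno type="bkname">lha-sa-gnas-yig</idno>
</publicationStmt>`;

console.log(extractBookName(sample)); // lha-sa-gnas-yig
```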

In the PHP code, the $add_header variable is modified to contain the Cocoon folder name for that book and any volume name. The relevant parts from the Chöpel book are:

<? $add_header = ' <link rel="stylesheet" type="text/css" href="/places/culturalgeography/css/culturalgeography.css" media="all" />
    <link rel="stylesheet" type="text/css" href="/global/css/book.css" media="all" />
    …
    <script type="text/javascript" src="/encyclopedias/literary/js/literary.js"> </script>
    <script type="text/javascript" src="/global/js/book.js"> </script>
    <script type="text/javascript">
        …

        var page_title = "Lho kha gnas yig";
        book_url = "/lho-kha-gnas-yig/";
        book_link_text = "ལྷོ་ཁ་གནས་ཡིག";
        …
    </script>';

The relevant part from the Literary Genres book is:

<? $add_header = ' <link rel="stylesheet" type="text/css" href="/encyclopedias/literary/css/literary.css" media="all" />
    <link rel="stylesheet" type="text/css" href="/global/css/book.css" media="all" />
    <script type="text/javascript" src="/encyclopedias/literary/js/literary.js"> </script>
    <script type="text/javascript" src="/global/js/book.js"> </script>
    <script type="text/javascript">
        var page_title = "Tibetan Literature: Studies in Genre";
        book_url = "/studies-in-genres/";
        book_link_text = "Studies in Genre";
    </script>';
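
The two excerpts suggest that book_url begins with the same folder name that is stored in the bkname <idno>. A small consistency check along those lines might look like the following. The function name is illustrative, and the idea that any volume name follows the folder name in book_url is an assumption, not something confirmed by THL's code.

```javascript
// Hypothetical consistency check between the XML bkname and the JS book_url.
function bookUrlMatchesFolder(bookUrl, bkname) {
  // Assumption: book_url is "/<folder>/", optionally followed by a volume segment.
  return bookUrl.startsWith("/" + bkname + "/");
}

console.log(bookUrlMatchesFolder("/lho-kha-gnas-yig/", "lho-kha-gnas-yig")); // true
console.log(bookUrlMatchesFolder("/studies-in-genres/", "lho-kha-gnas-yig")); // false
```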

Markup Specifics

Certain things need to be marked up to ensure the proper display of the book. Since the Literary Genres book was the first developed, its style is in some sense the default. The other two styles, monograph and multivolume work, require some specific markup. Monographs such as the Chöpel book must have n="wholebook" on the <text> element, and multivolume works must include "book" in the rend attribute, as follows:

<text lang="tib" rend="tib" n="wholebook">

and

<text lang="tib" rend="tib book">
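
The distinction between the three styles can be sketched as a small JavaScript classifier that inspects the opening <text> tag. This is only an illustration of the attribute conventions described here; the function names and the returned labels are hypothetical, and the real decision is presumably made in the XSLT rather than in JavaScript.

```javascript
// Hypothetical classifier; the attribute conventions come from the text above,
// the function names and labels do not.
function attr(tag, name) {
  // Read a double-quoted attribute value from a single opening tag.
  const m = tag.match(new RegExp(`\\s${name}="([^"]*)"`));
  return m ? m[1] : null;
}

function bookStyle(textTag) {
  if (attr(textTag, "n") === "wholebook") return "monograph";
  // rend may hold several space-separated values, e.g. "tib book".
  const rendValues = (attr(textTag, "rend") || "").split(/\s+/);
  if (rendValues.includes("book")) return "multivolume";
  return "essay-collection"; // default style, as in the Literary Genres book
}

console.log(bookStyle('<text lang="tib" rend="tib" n="wholebook">')); // monograph
console.log(bookStyle('<text lang="tib" rend="tib book">')); // multivolume
```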

To number the div2s sequentially, add rend="sequential" to each <div2>. In Oxygen this can be done with a regular-expression search for '<div2 id="([^"]+)">', replaced with '<div2 id="$1" rend="sequential">'. You can narrow the scope with an XPath such as /*//body/div1 or /*//front.
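
The same search and replace can be tried outside Oxygen. The JavaScript sketch below uses the character class [^"]+ to capture the id value; the function name addSequential is hypothetical. It only touches <div2> tags whose sole attribute is id, so a tag that already carries a rend attribute is left alone, and the XPath scoping available in Oxygen is not reproduced.

```javascript
// Hypothetical helper mirroring the Oxygen search and replace described above.
function addSequential(xml) {
  // Matches only <div2> tags whose single attribute is id.
  return xml.replace(/<div2 id="([^"]+)">/g, '<div2 id="$1" rend="sequential">');
}

console.log(addSequential('<div2 id="a1"><head>One</head></div2>'));
// <div2 id="a1" rend="sequential"><head>One</head></div2>
```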

Provided for unrestricted use by the Tibetan and Himalayan Library