How an XML Book Works


Contributor(s): Than Grove

A typical URL for a page of an XML book in THL uses its hash value to trigger an AJAX call to a PHP script, which pipes the output of a Cocoon transformation into a specific portion of the page. There are two major types of calls: one for a page of the book, which goes in the main content body of the HTML page, and one for the book's TOC, which goes in the side column.

An example URL is: http://www.thlib.org/bellezza/#!book=/bellezza/wb/b1-1-4/

This goes through a three-step process:

  1. JavaScript files interpret the hash value and convert it into two calls, one for the body of the page and one for the TOC. These are respectively:
  2. The PHP book reader converts the url parameter into a Cocoon call on localhost (i.e., whatever machine the PHP script is on), loads that document from Cocoon, and returns it as the result of the respective AJAX call. The Cocoon calls/URLs are respectively:
  3. Cocoon directs the call through a specific pipeline, which interprets it to determine a source XML file and an XSL transformation to apply. It returns the result, usually XHTML.

JavaScript Files

THL uses jQuery as its core JS library, and all the custom THL scripts rely on it. The main JavaScript file for AJAX calls in THL is /global/js/class_external.js, which is used by JIATS, essays, wikis, books, and catalogs. Each of these components also has its own specific JS file; in this case, it is book.js. When a call is made to #!book, these two JS files determine that it is a call for the "Book" component and call the appropriate reader via AJAX. The results returned from the PHP reader are inserted in the appropriate place in the page, and various formatting and JS manipulation is applied to them after insertion.
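The dispatch described above can be illustrated with a minimal sketch. It is written in Python purely for illustration and is not the code in class_external.js or book.js; the function names and the READERS table are hypothetical, and only the book reader path comes from this page.

```python
# Hypothetical lookup table: component name -> same-server PHP reader.
# Only the "book" entry is documented on this page.
READERS = {"book": "/global/php/book_reader.php"}


def parse_hashbang(fragment):
    """Split a fragment such as '#!book=/bellezza/wb/b1-1-4/' into
    (component, path). Return None if it is not a '#!name=value' call."""
    if not fragment.startswith("#!"):
        return None
    component, sep, path = fragment[2:].partition("=")
    if not sep or not component:
        return None
    return component, path


def reader_for(fragment):
    """Return the reader script for the fragment's component, or None."""
    parsed = parse_hashbang(fragment)
    if parsed is None:
        return None
    return READERS.get(parsed[0])


print(parse_hashbang("#!book=/bellezza/wb/b1-1-4/"))  # ('book', '/bellezza/wb/b1-1-4/')
print(reader_for("#!book=/bellezza/wb/b1-1-4/"))      # /global/php/book_reader.php
```

An ordinary anchor such as "#top" yields None here, so it would fall through to normal browser behavior.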

PHP

As demonstrated by the URLs above, the PHP reader for books is /global/php/book_reader.php. The parameter sent to this script is called "url", and it contains the pipeline call for Cocoon. Because Cocoon is a servlet in Tomcat, its URL would normally be http://www.thlib.org:8080/cocoon… or, with an alias, http://texts.thlib.org/. In either case, JS in the browser treats this as a different server from that of the host page, http://www.thlib.org/…, because either the port or the hostname differs. So, any AJAX call to a Cocoon page is considered a cross-domain call, which is prohibited for security reasons: AJAX is only allowed to read pages from the same domain as the main page. For this reason, we have created a class of PHP scripts that reside on the same server as the THL pages, in this case http://www.thlib.org/global/php/book_reader.php. Using PHP's file_get_contents, the script reads the output of the Cocoon pipeline and outputs it, so that the content is served from the same server as the including page. The PHP file thus acts as an intermediary between the AJAX calls and the Cocoon transformations.
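To make the cross-domain reasoning concrete, here is a rough Python sketch of the comparison a browser makes and of the intermediary role the reader plays. It is an illustration only, not the contents of book_reader.php. The function names are hypothetical, and the sketch assumes the "url" parameter carries a book path such as /bellezza/wb/b1-1-4/, which this page does not spell out.

```python
from urllib.parse import urlsplit
from urllib.request import urlopen

DEFAULT_PORTS = {"http": 80, "https": 443}

# Taken from the example body call on this page.
COCOON_BOOKS = "http://localhost:8080/cocoon/texts/books"


def origin(url):
    """Return the (scheme, host, port) triple that browsers compare."""
    parts = urlsplit(url)
    return (parts.scheme, parts.hostname,
            parts.port or DEFAULT_PORTS.get(parts.scheme))


def same_origin(a, b):
    return origin(a) == origin(b)


def read_book(url_param, fetch=None):
    """Fetch Cocoon output server-side, in the spirit of file_get_contents.

    Assumption: url_param is a book path like '/bellezza/wb/b1-1-4/'.
    fetch is injectable so the sketch runs without a live Cocoon server.
    """
    if fetch is None:
        fetch = lambda u: urlopen(u).read().decode("utf-8")
    return fetch(COCOON_BOOKS + "/" + url_param.lstrip("/"))


page = "http://www.thlib.org/bellezza/"
print(same_origin(page, "http://www.thlib.org:8080/cocoon"))   # False: port differs
print(same_origin(page, "http://texts.thlib.org/"))            # False: host differs
print(same_origin(page, "http://www.thlib.org/global/php/book_reader.php"))  # True

# Echo the URL instead of fetching it, to show what would be requested.
print(read_book("/bellezza/wb/b1-1-4/", fetch=lambda u: u))
# http://localhost:8080/cocoon/texts/books/bellezza/wb/b1-1-4/
```

Because the reader runs on the server, its request to localhost:8080 is not subject to the browser's restriction; the browser only ever talks to www.thlib.org.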

Cocoon

Cocoon works on the model of pipelines. A pipeline is an action defined by a URL. Pipelines are defined in a project-specific document named sitemap.xmap that resides at the top level of the project. In THL's version of Cocoon, sitemaps have been set up to be cascading. That is, there is an overall general Cocoon sitemap at http://localhost:8080/cocoon/texts/sitemap.xmap that defines general pipelines. Then, each THL Cocoon project has its own subfolder, so that the sitemap for books is at http://localhost:8080/cocoon/texts/books/sitemap.xmap. The more specific sitemap takes precedence over the more general one. In the present book example, the call for the body of the page is http://localhost:8080/cocoon/texts/books/bellezza/wb/b1-1-4/. The beginning of the URL up to and including /books/ determines that it is in the book project and invokes /books/sitemap.xmap. The rest of the URL triggers the following pipeline in that sitemap.xmap file:

<!-- Chapter for wholebook texts where all book is in one file -->
<map:match pattern="*/wb/*/">
    <map:generate src="xml/{1}/main.xml"/>
    <map:transform src="xsl/book-chapter.xsl">
        <map:parameter name="bkid" value="{1}"/>
        <map:parameter name="cid" value="{2}"/>
    </map:transform>
    <map:serialize type="html"/>
</map:match>

In this pipeline, the first asterisk (*) in the pattern becomes {1} in the src and value attributes, while the second becomes {2}. So, in our example call, the XML file "generated" would be /books/xml/bellezza/main.xml. There are two types of books in THL: books where the main.xml file is only a structural TOC that contains no content itself but links to separate files for each of the individual sections, and books where the "whole book" is contained in a single document. Pipelines with /wb/ in them are for the latter type of book. (An example of the first type would be the Tibetan Literary Genres book.) To this XML file, the XSL file at books/xsl/book-chapter.xsl is applied, with the two matched values passed in as the bkid and cid parameters, and the results are "serialized" as HTML.
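The wildcard substitution can be approximated in a few lines of Python. This is a sketch of the idea, not Cocoon's own matcher: it assumes that a single * matches a run of characters without a slash and that ** matches any run, and the function names are hypothetical.

```python
import re


def wildcard_to_regex(pattern):
    """Translate a sitemap-style wildcard pattern into a compiled regex.

    '*' captures characters up to the next '/', '**' captures anything.
    """
    out = []
    for token in re.split(r"(\*\*|\*)", pattern):
        if token == "**":
            out.append("(.*)")
        elif token == "*":
            out.append("([^/]*)")
        else:
            out.append(re.escape(token))
    return re.compile("".join(out))


def substitute(template, groups):
    """Replace {1}, {2}, ... with the captured wildcard values."""
    return re.sub(r"\{(\d+)\}",
                  lambda m: groups[int(m.group(1)) - 1], template)


def resolve(pattern, request, template):
    """Match request against pattern and fill template, or return None."""
    match = wildcard_to_regex(pattern).fullmatch(request)
    if match is None:
        return None
    return substitute(template, match.groups())


# The part of the example URL that follows /books/:
request = "bellezza/wb/b1-1-4/"
print(resolve("*/wb/*/", request, "xml/{1}/main.xml"))  # xml/bellezza/main.xml
print(resolve("*/wb/*/", request, "{1}"))               # bellezza  (bkid)
print(resolve("*/wb/*/", request, "{2}"))               # b1-1-4    (cid)
```

A request without the /wb/ segment does not match this pattern and returns None, which corresponds to the sitemap moving on to try its other pipelines.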

Provided for unrestricted use by the Tibetan and Himalayan Library