APEX@IGP

Infogrid Pacific-The Science of Information


ePub3 Tagging Exemplars

ePub 3 Packaging-6

Tagging pattern samples make it easy to understand the package

Updated: 2012-07-28

This page exists because the ePub 3 spec is weak on tagging pattern exemplars. Its tagging patterns focus on new features rather than showing a basic, comprehensive set of tagging examples and then building on that with the new features. (This is an observation, not a criticism.)

Well-crafted tagging patterns are the best way to understand the specification and cut through the brain-bending verbosity. The tagging patterns need to be complete enough to handle all centre-line cases and a good number of edge cases. This page will be expanded, or have others added, with examples of more complex content including MathML, SVG, JavaScript and media overlays. We will also make the ePubs available.

These exemplars may assist you in understanding the various packaging (non-content) components of an ePub 3. These consist of the OPF and the TOC file. They are taken from a real generated ePub 3 package. You can open any sample file from azardi.infogridpacific.com to examine this yourself.

Container

<?xml version="1.0" encoding="UTF-8" ?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
   <rootfiles>
      <rootfile full-path="OPS/package.opf" media-type="application/oebps-package+xml"/>
   </rootfiles>
</container>

The Open Package Format OPF

<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="3.0"
    xml:lang="en" unique-identifier="pub-id">
<metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title id="title">Siddhartha - IGP EPUB3</dc:title>
  <dc:creator id="creator">Herman Hesse</dc:creator>
  <dc:language>en</dc:language>
  <dc:identifier id="pub-id">urn:isbn:siddhartha</dc:identifier>
  <dc:source>urn:isbn:siddhartha</dc:source>
  <meta property="dcterms:modified">2012-01-17T05:02:44Z</meta>
  <dc:date>2010</dc:date>
  <dc:publisher>Infogrid Pacific</dc:publisher>
  <dc:description>A novel about self discovery</dc:description>
  <dc:subject>Fiction</dc:subject>
</metadata>
<manifest>
  <item id="toc" properties="nav" href="TOC.xhtml" media-type="application/xhtml+xml"/>
  <item id="cover" href="cover.xhtml" media-type="application/xhtml+xml"/>
  <item id="cover-image" properties="cover-image" href="siddhartha.jpg" media-type="image/jpeg"/>
  <item id="s001" href="s001-BookTitlePage-01.xhtml" media-type="application/xhtml+xml"/>
  <item id="s002" href="s002-Copyright-01.xhtml" media-type="application/xhtml+xml"/>
  <item id="s003" href="s003-Dedication-01.xhtml" media-type="application/xhtml+xml"/>
  <item id="s004" href="s004-Epigraph-01.xhtml" media-type="application/xhtml+xml"/>
  <item id="s005" href="s005-AboutTheAuthor-01.xhtml" media-type="application/xhtml+xml"/>
  <item id="s006" href="s006-AboutThisBook-01.xhtml" media-type="application/xhtml+xml"/>
  <item id="s007" href="s007-Part-001.xhtml" media-type="application/xhtml+xml"/>
  <item id="s008" href="s008-Chapter-001.xhtml" media-type="application/xhtml+xml"/>
  <item id="s009" href="s009-Chapter-002.xhtml" media-type="application/xhtml+xml"/>
  <item id="s010" href="s010-Chapter-003.xhtml" media-type="application/xhtml+xml"/>
  <item id="s011" href="s011-Chapter-004.xhtml" media-type="application/xhtml+xml"/>
  <item id="s012" href="s012-Part-002.xhtml" media-type="application/xhtml+xml"/>
  <item id="s013" href="s013-Chapter-005.xhtml" media-type="application/xhtml+xml"/>
  <item id="s014" href="s014-Chapter-006.xhtml" media-type="application/xhtml+xml"/>
  <item id="s015" href="s015-Chapter-007.xhtml" media-type="application/xhtml+xml"/>
  <item id="s016" href="s016-Chapter-008.xhtml" media-type="application/xhtml+xml"/>
  <item id="s017" href="s017-Chapter-009.xhtml" media-type="application/xhtml+xml"/>
  <item id="s018" href="s018-Chapter-010.xhtml" media-type="application/xhtml+xml"/>
  <item id="s019" href="s019-Chapter-011.xhtml" media-type="application/xhtml+xml"/>
  <item id="s020" href="s020-Chapter-012.xhtml" media-type="application/xhtml+xml"/>
  <item id="s021" href="s021-Appendix-01.xhtml" media-type="application/xhtml+xml"/>
  <item id="images-001" href="hermann_hesse_1927_photo_gret_widmann_online.jpg" media-type="image/jpeg"/>
  <item id="css-001" href="css/siddhartha.css" media-type="text/css"/>
</manifest>
<spine page-progression-direction="ltr">
  <itemref idref="s001" linear="yes"/>
  <itemref idref="s002" linear="yes"/>
  <itemref idref="s003" linear="yes"/>
  <itemref idref="s004" linear="yes"/>
  <itemref idref="s005" linear="yes"/>
  <itemref idref="s006" linear="yes"/>
  <itemref idref="s007" linear="yes"/>
  <itemref idref="s008" linear="yes"/>
  <itemref idref="s009" linear="yes"/>
  <itemref idref="s010" linear="yes"/>
  <itemref idref="s011" linear="yes"/>
  <itemref idref="s012" linear="yes"/>
  <itemref idref="s013" linear="yes"/>
  <itemref idref="s014" linear="yes"/>
  <itemref idref="s015" linear="yes"/>
  <itemref idref="s016" linear="yes"/>
  <itemref idref="s017" linear="yes"/>
  <itemref idref="s018" linear="yes"/>
  <itemref idref="s019" linear="yes"/>
  <itemref idref="s020" linear="yes"/>
  <itemref idref="s021" linear="yes"/>
</spine>
</package>

This package demonstrates that there is no shocking difference when packaging an ePub 3. From our perspective, tidier packaging was a big plus. We place the cover and navigation components at the top of the manifest just because we can. Sections are in a single list with the simplest possible sequential ID numbers. This really helps when creating and checking the spine and

TOC.xhtml

<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops">
    <head>
        <meta charset="utf-8"/>
        <title>Contents</title>
        <link rel="stylesheet" href="css/siddhartha.css" type="text/css"/>
    </head>
    <body>
<nav epub:type="toc">
  <h2>Contents</h2>
  <ol>
    <li><a href="s001-BookTitlePage-01.xhtml">SIDDHARTHA</a></li>
    <li><a href="s002-Copyright-01.xhtml">Copyright</a></li>
    <li><a href="s003-Dedication-01.xhtml">Dedication</a></li>
    <li><a href="s004-Epigraph-01.xhtml">Epigraph</a></li>
    <li><a href="s005-AboutTheAuthor-01.xhtml">Herman Hesse</a></li>
    <li><a href="s006-AboutThisBook-01.xhtml">Siddhartha—The book</a></li>
    <li><a href="s007-Part-001.xhtml">First Part</a>
      <ol>
        <li><a href="s008-Chapter-001.xhtml">The Son of the Brahman</a></li>
        <li><a href="s009-Chapter-002.xhtml">With the Samanas</a></li>
        <li><a href="s010-Chapter-003.xhtml">Gotama</a></li>
        <li><a href="s011-Chapter-004.xhtml">Awakening</a></li>
      </ol>
    </li>
    <li><a href="s012-Part-002.xhtml">Second Part</a>
      <ol>
        <li><a href="s013-Chapter-005.xhtml">Kamala</a></li>
        <li><a href="s014-Chapter-006.xhtml">With the Childlike People</a></li>
        <li><a href="s015-Chapter-007.xhtml">Sansara</a></li>
        <li><a href="s016-Chapter-008.xhtml">By the River</a></li>
        <li><a href="s017-Chapter-009.xhtml">The Ferryman</a></li>
        <li><a href="s018-Chapter-010.xhtml">The Son</a></li>
        <li><a href="s019-Chapter-011.xhtml">Om</a></li>
        <li><a href="s020-Chapter-012.xhtml">Govinda</a></li>
      </ol>
    </li>
    <li><a href="s021-Appendix-01.xhtml">Appendix 1: Dharmmapada</a></li>
  </ol>
</nav>
<nav epub:type="landmarks">
  <h2>Landmarks</h2>
  <ol>
    <li><a epub:type="bodymatter" href="s008-Chapter-001.xhtml">Begin Reading</a></li>
    <li><a epub:type="titlepage" href="s001-BookTitlePage-01.xhtml">Title Page</a></li>
    <li><a epub:type="copyright-page" href="s002-Copyright-01.xhtml">Copyright Page</a></li>
    <li><a epub:type="frontmatter" href="s004-Epigraph-01.xhtml">Epigraph</a></li>
    <li><a epub:type="frontmatter" href="s005-AboutTheAuthor-01.xhtml">Herman Hesse</a></li>
    <li><a epub:type="backmatter" href="s021-Appendix-01.xhtml">Dharmmapada</a></li>
  </ol>
</nav>
    </body>
</html>

Page TOC

New and relatively exciting in ePub 3 is the Page TOC. The specification insists that it be treated in a simple linear manner, like a printed book, which is somewhat of a disappointment. We are using Page TOC in dozens of K-12 textbooks and hundreds of academic books to make it easy to align digital pages with physical pages. Digital purists may object to this, but it is a helpful and harmless mixed-format tool, where mixed format means ePub and print! We optionally show the print page numbers as sweet little buttons in the margin.
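For page references to resolve, each print page boundary needs an anchor in the content. A minimal sketch of one common way to mark it, assuming the `pagebreak` term from the EPUB structural semantics vocabulary. The `id` value, the page number and the surrounding text are hypothetical, and this is not necessarily the markup IGP generates:

```xml
<!-- Hypothetical fragment of a content document (not from the sample book). -->
<p xmlns="http://www.w3.org/1999/xhtml"
   xmlns:epub="http://www.idpf.org/2007/ops">
  Last words of print page 11
  <!-- Marks where print page 12 begins; the id is the link target. -->
  <span epub:type="pagebreak" id="page-12" title="12"/>
  first words of print page 12.
</p>
```

A reading system, or a stylesheet, can then decide whether and how to display the marker, for example as a number in the margin.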

We experimented a lot with this. We wanted to present page numbers in Part/Chapter/Header context, but it proved very hard to do, and other readers will almost certainly not be able to adapt to our approach, which relies on styles that are turned into blocks and floats with a lot of CSS.

Page TOC goes into the TOC.xhtml file but is shown separately because the sample book above was born digital and does not have page references. Our Page TOC test book is only 30 pages long, so I have used that as an exemplar rather than something with 300 pages.

Remember it is the epub:type="page-list" attribute that identifies this structure to the reading system.
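A minimal sketch of what such a page list might look like, using the `page-list` value that the ePub 3 navigation document specification defines for this structure. The file names, fragment ids and page numbers below are hypothetical and are not taken from the 30-page test book:

```xml
<!-- Hypothetical page list: file names, ids and page numbers are illustrative. -->
<nav xmlns="http://www.w3.org/1999/xhtml"
     xmlns:epub="http://www.idpf.org/2007/ops"
     epub:type="page-list">
  <h2>Pages</h2>
  <!-- A single flat list: no nested ol elements, one entry per print page. -->
  <ol>
    <li><a href="s001-Chapter-001.xhtml#page-1">1</a></li>
    <li><a href="s001-Chapter-001.xhtml#page-2">2</a></li>
    <li><a href="s002-Chapter-002.xhtml#page-3">3</a></li>
    <li><a href="s002-Chapter-002.xhtml#page-4">4</a></li>
  </ol>
</nav>
```

Each href points at an anchor in a content document that marks where the print page begins, so the targets must exist with matching ids.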


Other Content Lists

In textbooks, academic books and even lavishly illustrated trade books, there can be lists of figures, illustrations, plates, maps and much more. The specification makes some concessions for these.

Whether it is better to generate these as TOC items, or to insert them into the spine and refer to them from the TOC and/or Landmarks, remains to be seen.
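A minimal sketch of the first option, where a list of illustrations is carried as an additional nav element in the navigation document. It assumes the `loi` (list of illustrations) term from the EPUB structural semantics vocabulary. The fragment ids and captions are hypothetical:

```xml
<!-- Hypothetical list of illustrations; ids and captions are illustrative. -->
<nav xmlns="http://www.w3.org/1999/xhtml"
     xmlns:epub="http://www.idpf.org/2007/ops"
     epub:type="loi">
  <h2>List of Illustrations</h2>
  <ol>
    <li><a href="s005-AboutTheAuthor-01.xhtml#fig-001">Portrait of the author</a></li>
    <li><a href="s008-Chapter-001.xhtml#fig-002">Map of the river</a></li>
  </ol>
</nav>
```

The same pattern would apply to a list of tables with `epub:type="lot"`. How a given reading system presents these extra lists, if at all, is up to that reading system.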

In most readers the TOC page is irritatingly far away. You follow a link to the page, scroll, find something and then visit it.

We hated this experience so much that AZARDI has a flyout side-TOC in all versions: Desktop, Online and WebApp. It makes jump navigation civilized.

The next release will include multiple TOCs on the flyout so you can navigate chapters, tables, images, maps and plates with equal ease.

A Note on Plates

Plate pages are a very "print" concept. Rather than putting the images at the back like a print book, consider inserting them at the point of reference in the content flow.

IGP:Digital Publisher makes this easy with the exclude-print-rw and exclude-online-rw block properties. Both high-resolution print images and low-resolution online images can be in the master XHTML. The format processor knows which to include and which to put where for any given format.
