APEX@IGP-FX

Infogrid Pacific-The Science of Information


Introduction to FX

What it is, objectives and applicability

Updated 10 February 2013

 

Infogrid Pacific has designed IGP:FoundationXHTML to address the widest possible spectrum of content ownership requirements for publishing and republishing valuable content into multiple formats for multiple uses, while controlling costs and complexity.

For ease of writing IGP:FoundationXHTML is abbreviated to FX in this document.

An FX document is a fully standards-compliant XHTML5 file (upgraded from XHTML 1.0 Strict). The HTML must be well formed to ensure XML processing can be reliably executed.
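The well-formedness requirement can be checked mechanically before any XML processing is attempted. The following is a minimal sketch in Python using the standard library parser; the function name `is_well_formed` and the sample strings are illustrative assumptions, not part of FX or any IGP tool. Note that this parser rejects HTML named entities such as `&nbsp;` unless they are declared, so real files may need numeric character references.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xhtml_text: str) -> bool:
    """Return True if the text parses as XML, False otherwise."""
    try:
        ET.fromstring(xhtml_text)
    except ET.ParseError:
        return False
    return True


# Illustrative samples (assumptions): the second one leaves <br> unclosed.
good = '<html xmlns="http://www.w3.org/1999/xhtml"><body><p>One<br/>Two</p></body></html>'
bad = '<html xmlns="http://www.w3.org/1999/xhtml"><body><p>One<br>Two</p></body></html>'

print(is_well_formed(good))
print(is_well_formed(bad))
```

A check of this kind only confirms that the file is parseable XML; it does not validate the document against the XHTML5 content model.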

It uses highly structured and vocabulary/syntax controlled ID and class attribute values to address the widest range of document types including books, journals, specifications, manuals, learning material, rich media, interactivity, commercial documents, marketing communication material and more.
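Because the structure is carried in class and ID attribute values, a processor can discover what a document contains simply by reading those attributes. The sketch below takes an inventory of the class values used in a small XHTML fragment. The class names (`chapter`, `chapter-title`, `para`, `first`), the ID `ch01` and the function name `class_inventory` are hypothetical and chosen only for illustration; they are not taken from the published FX vocabulary.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical fragment: class and id values are illustrative assumptions.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <div id="ch01" class="chapter">
      <h1 class="chapter-title">Getting Started</h1>
      <p class="para first">Opening paragraph.</p>
      <p class="para">Second paragraph.</p>
    </div>
  </body>
</html>"""


def class_inventory(xhtml_text: str) -> Counter:
    """Count how often each class value appears in the document."""
    counts = Counter()
    root = ET.fromstring(xhtml_text)
    for element in root.iter():
        # A class attribute may hold several space-separated values.
        for name in element.get("class", "").split():
            counts[name] += 1
    return counts


inventory = class_inventory(SAMPLE)
for name, count in sorted(inventory.items()):
    print(name, count)
```

An inventory like this could be compared against a controlled list of permitted values to flag tagging that falls outside the agreed vocabulary.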

The strategy specifically empowers content reuse and value extension. It is the core encoding schema used by IGP:Digital Publisher, and has strategies for retrodigitization, plus first time document creation.

FX has a strong publisher orientation because publishing provides the best common vocabulary for document analysis and structure. It is equally applicable to all types of print documents and all types of electronic publishing. Importantly, FX includes dynamic content such as rich media and JavaScript-driven interactive content.

FX does not constrain the presentation quality or the ability to represent any document in a facsimile or republished form. It simultaneously addresses the following content management and reuse demands being made on valuable content today:

  • Archiving: High value tagging and archiving, with variable and expandable metadata.
  • e-book/e-document generation: Processor driven production and customization for current formats and high readiness for formats not yet created.
  • Content Objects: Deconstruction of content into content objects for reuse and business on content, e.g. using the companion IGP:Reader/Writer eRemix functionality and the associated product IGP:Content Fulfilment System (for advanced content object reuse and delivery).
  • Rights Control: Rights assertion on works, part-works and content objects.
  • Online Presentation: Online presentation including subscription, with variable content rights assertion.
  • Authoring: Online authoring of new documents for publishing and business, including topic based authoring such as DITA-like content, and learning objects such as SCORM.
  • Learning Object Packaging and Distribution: Creation, management and distribution of learning objects in multiple formats.
  • Facsimile Reproductions: Creation of Line-by-Line PDF (LBL™ PDF) as digital restorations (when source content is scanned/OCR'ed).
  • Print Reflow: Reflow of content for reprint using XSL-FO and CSS-3.

To achieve these objectives content must be encoded in a reliable, standardized manner that allows the development and maintenance of current and future XML processors in a cost-effective manner.

Content is irreducibly complex across the information continuum, and many tagging grammars are possible and exist. Content is an infinite set of data, with an infinite set of interpretations needed to fully address the requirements of different individuals and organizations in multiple contexts.

FX is another semantic tagging interpretation of the content universe. What makes it different is that it draws on a decade of digitization and content processing experience across millions of pages of academic, business and educational material, and it has been purposely designed with low total cost of ownership as a significant objective. Unlike most XML schemas, which are concerned with getting content into XML and worry about the processes later, FX is concerned with instant, multiple format generation.

Why XHTML?

As we end the first decade of the 21st century, with the advances in browser technology and the emergence of XHTML as a real standard (untainted by proprietary add-ins and hacks), it is the sensible and significant choice. XHTML was chosen because it is the de facto standard for the Internet, and it will not be going away soon. It has enough flexibility to accommodate complex content management and presentation for advanced publishing requirements and many other uses. With correct attributes and metadata it can be easily converted to other XML formats or partial formats such as DocBook, DITA, TEI, NLM or proprietary XML should the need exist.
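To make the conversion claim concrete, here is a minimal sketch of how class-tagged XHTML might be mapped to DocBook-style elements. It is not the FX processor: the lookup table `CLASS_MAP`, the class names and the function `convert` are assumptions made up for this illustration. The target names `chapter`, `title` and `para` are real DocBook elements, but the sketch drops unmapped elements, tail text and the DocBook namespace, all of which a real converter would have to handle.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "{http://www.w3.org/1999/xhtml}"

# Hypothetical mapping: (XHTML tag, class value) -> DocBook element name.
CLASS_MAP = {
    ("div", "chapter"): "chapter",
    ("h1", "chapter-title"): "title",
    ("p", "para"): "para",
}

SAMPLE = (
    '<div xmlns="http://www.w3.org/1999/xhtml" class="chapter">'
    '<h1 class="chapter-title">Getting Started</h1>'
    '<p class="para">Opening paragraph.</p>'
    "</div>"
)


def convert(element):
    """Return a DocBook-style copy of a mapped element, or None if unmapped."""
    tag = element.tag.replace(XHTML_NS, "")
    classes = element.get("class", "").split()
    target = next(
        (CLASS_MAP[(tag, c)] for c in classes if (tag, c) in CLASS_MAP), None
    )
    if target is None:
        return None  # Simplification: unmapped elements are skipped entirely.
    out = ET.Element(target)
    out.text = element.text
    for child in element:
        converted = convert(child)
        if converted is not None:
            out.append(converted)
    return out


docbook = convert(ET.fromstring(SAMPLE))
print(ET.tostring(docbook, encoding="unicode"))
```

The point of the sketch is that the class value, not the XHTML tag alone, decides the target element, which is why consistent class vocabulary matters for this kind of conversion.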

Importantly, the output is instantly ready for display in all major technologies including web browsers and all ePub-based eBook readers with minimal additional processing. It is fully presentable without CSS (Cascading Style Sheets), and with CSS (and optionally XSL and JavaScript) FX can be rendered to the highest levels of presentation.

XHTML tagging is typically light compared to other systems. For example, content of equivalent complexity tagged with FX is typically 20-25% of the size of an equivalent DocBook file. This brings real benefits in storage and bandwidth requirements, but the real implication is that the larger files of more "profound" schemas contain complexity overhead that is unwarranted.

FX standardizes tagging with a rich repertoire of class statements, without losing the inherent flexibility of XHTML. It exploits the standardization of browsers. The tagged output follows the very simple XHTML schema model, and the class and ID vocabularies are human readable. This means the survivability and workability in the future are more assured because the content is its own Rosetta Stone.

Where “Paged Media” is a more pressing business imperative, FX is presentation and print ready for CSS-3 and XSL-FO processing. It is also instantly XHTML5 compatible.

History

The techniques and grammar of FX emerged from experience gained over a decade of complex content conditioning and reuse processing plus extensive front list content processing for print and the web. Technology frameworks have not stood still and the technology and engineering complexities associated with owning XML strategies such as DocBook, TEI, DITA and custom XML schemas don't make a lot of sense when a full envelope of content value can be delivered more easily. However this is a provocative discussion and will not be expanded here.

Today's requirement is about the content and using it immediately in a wide and ever-changing range of technology frameworks. This must be done without absorbing unpredictable current and future costs.

FX is designed to deliver the lowest possible production costs, make engineering for reuse direct and easy without requiring highly specialized skills, while being able to be used in the widest range of business contexts.

For a discussion of why and how XHTML can be used to deliver effective and affordable business strategies, please see our white paper Foundation XHTML for Content Reuse.

An additional driver for content publishers is to be able to synchronize backlist content with front list production and have the two merge at some common standard. Even today in 2007, typesetting systems cannot generate XHTML output that is valuable beyond the production moment. Some can even generate multiple eBook formats, but these do not have the tagging that allows and supports more advanced uses of content in an undefined future.

Self Deprecating

We do not see FX as the end of content encoding and reuse. The value of this radical reuse of existing technology is that it is transparent, easy to use, and easy to extend, customize and modify; and it has already been transposed to a better future standard such as XHTML5.

XHTML5 brings some further advances and flexibilities. Other Internet agendas are also changing what can and should be done. The value of FX is that it expresses the purpose of content as well as the structure and is ready to be reprocessed safely to other future archive formats. Advanced content processing and management is not a goal, it is a journey. FX is a secure stop on that journey which comprehensively addresses all of today's needs and establishes a clear path for future strategies.

Comparing with Other Solutions

There are a number of good HTML tagging vocabularies, used by a number of digitization facilities for OEB-like formats. The FX strategy is not about being better or worse, good or bad. It is about adding value to XHTML content in order to deliver real business content strategies and benefits transparently and consistently. It is not in competition with other systems and should be evaluated only against the required values it brings to content owners.

It is about using and exploiting current and near future technologies. It is about breaking with single strategy technology dominance and opening content for a world that is changing daily. It is about consistent, affordable, synchronized front and back list XHTML strategies; and most of all it is about total content solutions that are affordable and save money at the ownership level, while driving revenue at the business level.

Creation and usage

IGP:FoundationXHTML is best created with IGP:Production Solutions tools, or directly in IGP:Digital Publisher.

These production environments are specifically designed to make it cost effective to process all types of content to a standardized XHTML that delivers consistent future strategies.

Currently, FX has processors that allow content to be processed for the following eBook readers and formats (excluding proprietary DRM aspects):

  • EPUB – generic
  • EPUB – Adobe Digital Editions Reader
  • EPUB – Sony BBeB
  • MS Reader
  • Mobipocket
  • Palm Reader
  • IGP:Reader/Writer Online
  • IGP:LbL PDF (Line by Line)
  • Apple ePub custom fixed layout
  • IDPF ePub3
  • IDPF ePub3 Fixed Layout

In addition it can be used for:

  • SCORM Packages
  • Custom XML generation (yes, FX can be used to make TEI, DocBook and the deadly IDML)
  • Static Sites
  • WebApp packages
  • Asset Management submission packages
  • And lots more...

Supporting Technologies

IGP have a number of supporting technologies that make the long-term ownership of content easy.

IGP:ECMS Solutions provides a complete environment for archiving, managing and distributing content. Special vertical market versions are available, such as IGP:PublisherDAMS, optimized for publisher file management and distribution; and IGP:PrintShop, optimized for printers to manage large file collections on behalf of clients.

IGP:Digital Publisher (Front List Interactive Publishing) is a multi-function environment for manipulating and managing XML content. It can be configured for front-list publishing and dynamic user-driven online publishing using the powerful eRemix features. It can be used as a full online front-list collaborative authoring and production environment.

IGP:Offers, Agreements and Rights is an Offers-Agreements-Rights framework which allows the maintenance of sophisticated rights transfer agreements for online subscription transactions and other rights management requirements.

AZARDI:Content Fulfilment is a sophisticated content fulfilment system for online, offline and mobile delivered content in the AZARDI reading system packages.
