Infogrid Pacific-The Science of Information

Part 1

F: Foundation Document Structures

There are two primary structures in an FX document:

A page template pattern

A Content Block template pattern 

Primary Content Block Group Selectors

Content Blocks are organized into common grammar groups that share common or similar properties, by document usage or document type. This grouping is a very important part of the FX method and its vocabulary control. These foundation group selectors ultimately define any document.

The power of this approach is that any of these blocks can be "dropped" into any galley flow and will always be structurally correct. They can be constructed independently, exist independently, and be used or reused instantly in any document. With an XHTML declaration and galley wrapper they can be delivered in any online, mobile or eBook context instantly.
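As a rough illustration of the "drop in" idea, the following Python sketch wraps one standalone block in a minimal XHTML document. The function name `wrap_block`, the wrapper class `galley-rw`, and the use of the XHTML namespace as a stand-in for the full declaration are assumptions made for this example only; they are not taken from the FX specification.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def wrap_block(block_xml: str, title: str = "Untitled") -> str:
    """Wrap one standalone content block in a minimal XHTML document."""
    # Parsing first guarantees the block is well-formed on its own.
    block = ET.fromstring(block_xml)

    html = ET.Element("html", {"xmlns": XHTML_NS})
    head = ET.SubElement(html, "head")
    ET.SubElement(head, "title").text = title
    body = ET.SubElement(html, "body")

    # "galley-rw" is a hypothetical class name for the galley wrapper.
    galley = ET.SubElement(body, "div", {"class": "galley-rw"})
    galley.append(block)

    return ET.tostring(html, encoding="unicode")


if __name__ == "__main__":
    print(wrap_block('<div class="block-rw"><p>Hi</p></div>'))
```

The block is parsed before it is wrapped, so a fragment that is not well-formed fails early instead of producing a broken delivery file.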

The current primary group selectors, termed the foundation Content Objects, are:

  1. title-block-book-rw - This is reserved for book publishing for specific design and presentation reasons.
  2. title-block-rw - This is a very important block. It inherits from its primary container and is used in all page templates and other content blocks when a title structure is required.
  3. metadata-rw - metadata blocks are amorphous. They are always inside another structure and represent metadata information about their immediate container object.
  4. block-rw - contains all inflow structures that do not move with respect to their position in the galley.  
  5. media-rw - for all types of media that is inserted or referenced from the content. This can be graphics, images, audio, video and animation, where the display context supports these.
  6. para-rw - to contain paragraphs that have special presentation properties as paragraphs. Do not confuse with independent paragraph styles.
  7. line-rw - for line properties. Leading line and decoration line.
  8. list-rw - all named lists are encapsulated in this classifier. XHTML lists are not included as they stand alone.
  9. table-rw - all tables must be in this div. No table, even a layout table can be independent of this classifier.
  10. layout-rw - used specifically for print layout properties, including columns.
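The table rule in item 9 lends itself to a mechanical check. The sketch below counts tables that are not enclosed in a `table-rw` div. It assumes the selector appears as a `class` value on a `div`, that any ancestor (not only the direct parent) satisfies the rule, and that the fragment carries no XML namespace; `unwrapped_tables` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET


def unwrapped_tables(xhtml: str) -> int:
    """Count <table> elements that have no ancestor <div> with class table-rw."""
    root = ET.fromstring(xhtml)
    count = 0

    def walk(elem, inside_table_rw):
        nonlocal count
        classes = elem.get("class", "").split()
        # Entering a table-rw div marks everything beneath it as wrapped.
        if elem.tag == "div" and "table-rw" in classes:
            inside_table_rw = True
        if elem.tag == "table" and not inside_table_rw:
            count += 1
        for child in elem:
            walk(child, inside_table_rw)

    walk(root, False)
    return count
```

A result of zero means every table in the fragment sits inside a `table-rw` container; any other value is the number of tables that break the rule.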

The foundation list is extended by genre Content Objects to handle the structure and presentation of special content forms within any foundation document. Whereas the foundation Content Objects are regarded as static, the genre list can be extended and customized very easily with full control.

  1. poem-rw - contains detailed structural elements for advanced poetry tagging.
  2. drama-rw - contains detailed structural elements for advanced drama tagging.
  3. food-rw - contains detailed structural elements for cookbooks and health books. It includes recipes, and also items such as meal plans, diet plans, etc.
  4. travel-rw - contains detailed structural elements for travel guides and similar books.
  5. equation-rw - contains all equations, both inline and in-flow.
  6. forms-rw - allows detailed online and print forms to be defined.
  7. question-and-answer-rw - general block for continuous questionnaire, test and exam type content.
  8. format-processing-instructions-rw - reserved for special tagging elements to instruct alternative behaviour for specific target devices.

Note that special content forms such as poetry, drama, equations, and questions and answers have been given the value of a primary Content Block selector. These are generally referred to as genre content blocks due to the very specific nature of the content they handle. They are of structural significance in their own right and do not fall into the concept of a standard text-oriented document. This is one of the most useful properties of FX: its ability to extend consistently to any type of content without redefining a constraining DTD.

FX is a controlled vocabulary; that is its power. The list can be extended at any time, and IGP retains the right to control the master list. The concepts can, however, be applied and extended in any organization that is able to apply and control the selector maintenance rules. That means IGP:FoundationXHTML is truly a foundation, and specific publisher requirements can easily be built on it.
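Vocabulary control of this kind can be pictured as a simple membership check. The sketch below reports any `-rw` class found on a `div` that is not in the lists given above. Treating the `-rw` suffix as the mark of a selector, looking only at `div` elements, and the names `unknown_selectors` and `recipe-rw` are assumptions for illustration, not part of FX.

```python
import xml.etree.ElementTree as ET

# Selector names copied from the foundation and genre lists above.
FOUNDATION = {
    "title-block-book-rw", "title-block-rw", "metadata-rw", "block-rw",
    "media-rw", "para-rw", "line-rw", "list-rw", "table-rw", "layout-rw",
}
GENRE = {
    "poem-rw", "drama-rw", "food-rw", "travel-rw", "equation-rw",
    "forms-rw", "question-and-answer-rw",
    "format-processing-instructions-rw",
}


def unknown_selectors(xhtml: str, extra=frozenset()):
    """Return sorted -rw classes on <div> elements that are not in the vocabulary."""
    allowed = FOUNDATION | GENRE | set(extra)
    root = ET.fromstring(xhtml)
    unknown = set()
    for elem in root.iter("div"):
        for cls in elem.get("class", "").split():
            if cls.endswith("-rw") and cls not in allowed:
                unknown.add(cls)
    return sorted(unknown)
```

The `extra` parameter stands in for a publisher-specific extension list: a selector is accepted only once it has been added to the controlled set.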

Content Blocks in Retrodigitization

From a retrodigitization perspective, content blocks are already "in the flow"; they can also occur in the middle of paragraphs and flow across multiple pages. From an XML structural perspective, retrodigitized content cannot always be reconstructed exactly like its paper parent without significantly destroying its future value. Therefore some reinterpretation and shifting of content is sometimes required.

When a block occurs in the middle of a paragraph (for example, the block was floated to the top of the page and the galley text flows past it), the block has to be moved out of the paragraph, leaving a well-formed and valid marker at its original position. This is an insertion-point marker. The approach is similar to the notes strategy: it ensures the product can be rebuilt to the original content positions and is ready for reflow and harvesting.

It has to be understood that, due to the infinite variety of content, not every situation can be determined in advance, especially with highly formatted documents such as textbooks and illustrated instructional or informative texts.

Content Block Semantic Selectors

Text Content Blocks

The primary default named blocks are classified into groups based on their general styling requirements:

General inline text blocks. (Structural class block-rw) These generally occupy the galley width and are separated from the flow by space above, below and possibly indenting left and right. They may also have different decoration and internal text inherited properties.

  1. General. This is the fall-back text block, used where the exact named content block is uncertain.
  2. Abstract. Use to mark up a formal abstract, including its title and references.
  3. Chapter Epigraph. Use to mark up any type of epigraph, whether text or poetry.
  4. Extract. Use to mark up extracts.
  5. BoxedText. Creates an un-numbered notebox distinguished by visible borders. This is a special form of Notebox-unnumbered.
  6. Notebox. Create un-numbered highlighted regions of text. Semantically identical to BoxedText, but because the two presentation metaphors are used somewhat differently, both are

Media Content Blocks

Images and Media. (Structural class media-rw) These include figures, illustrations, maps, and general images, and optionally have a numbered structure. Text may flow around them and they may be aligned left, right or center. They may also have width properties.

  1. Figures
  2. Maps
  3. Illustrations
  4. Plates
  5. Images - left, right, center - variable width.
  6. Equation (formula) (not inline)
  7. Notebox Numbered. (General class block-rw)
  8. Code Blocks Numbered. (General class blockcode-rw) This is a numbered code block and the content is inserted into a <pre> statement to allow white space and lineation.
  9. Code Block. (General class blockcode-rw) This is an un-numbered code block and the content is inserted into a <pre> statement to allow white space and lineation.
  10. Sidebar. (General class block-rw)  This remains as a special item because it is document specific and normally floats outside of margins.
  11. Tables. (General class table-rw) All tables are enclosed in a block statement. For further details see the Table section.

Block insertion-point marker

 To retain the original and exact position of a block tagged from retrodigitization content, an insertion marker can be put into the flow. This is by nature a span statement that refers to the ID of the target block. If the ID is not known, the block must be immediately after the paragraph into which the insertion marker is placed.

<span class="pi-insertion-rw" idref="block-ID"></span>

Where the marker does not fall inside a paragraph or other content, the span statement can be placed inside a paragraph of its own.

<p><span class="pi-insertion" tid="block-ID"></span></p>
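The lookup rule described above (use the referenced ID when present, otherwise take the block immediately after the enclosing paragraph) can be sketched in Python as follows. This is a minimal sketch under stated assumptions: it accepts both class spellings and both attribute names (`idref` and `tid`) because both appear in the examples, it assumes blocks carry an `id` attribute, and `resolve_markers` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Both spellings appear in the examples above, so both are accepted here.
MARKER_CLASSES = {"pi-insertion-rw", "pi-insertion"}


def resolve_markers(xhtml: str):
    """Return the target block element (or None) for each insertion marker."""
    root = ET.fromstring(xhtml)
    by_id = {e.get("id"): e for e in root.iter() if e.get("id")}
    parent_of = {child: parent for parent in root.iter() for child in parent}

    targets = []
    for span in root.iter("span"):
        if not MARKER_CLASSES & set(span.get("class", "").split()):
            continue
        ref = span.get("idref") or span.get("tid")
        if ref:
            # Known ID: look the block up directly.
            targets.append(by_id.get(ref))
            continue
        # Unknown ID: climb to the enclosing paragraph ...
        node = span
        while node is not None and node.tag != "p":
            node = parent_of.get(node)
        # ... and take the element immediately after it.
        target = None
        if node is not None and node in parent_of:
            siblings = list(parent_of[node])
            position = siblings.index(node)
            if position + 1 < len(siblings):
                target = siblings[position + 1]
        targets.append(target)
    return targets
```

Markers are returned in document order, so the result can be paired with the markers themselves when rebuilding the original block positions.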


© 2005-2012 Infogrid Pacific. All rights reserved.
