APEX@IGP-FX

Infogrid Pacific-The Science of Information

30

F9: Common Text Blocks (block-rw)

Common Blocks Include:

  1. General
  2. Abstract
  3. Boxed Text
  4. Code Blocks
  5. Epigraph
  6. Extract
  7. Letter
  8. Notebox

Common Blocks are major content structures that are part of the continuous flow of bodytext but which have semantic and presentation differences from the primary flow. They are un-numbered sections.

Common Blocks explicitly exclude genre-specific blocks, which are defined within their own structures.

Common Blocks explicitly exclude any block that can be floated out of the flow.

Block General is an all-purpose container for isolating content that is not specifically one of the named sections but does have presentation isolation. This is most useful in retrodigitization. The rule for usage is: never tag content with a wrongly named structure. If in doubt, use Block General.

Common blocks are those that are in the document flow and part of the semantic sequence of the content. They always have the attribute value block-rw as well as their own named structure value.

The primary characteristics of a Common Block are:

  1. They are always independent <div> sections that maintain their position in the galley flow.
  2. They are nearly always directly in the document root <galley> element, but can be nested if required (e.g. for captioned images in noteboxes).
  3. They each have their own independent XML defined vocabulary which works within the root container value.
  4. They always have their own ID. Content Blocks are often important reuse targets so need an identifiable ID and possibly generated, created or inherited metadata.
  5. They always have a primary structure and a semantic name at minimum.
  6. They can be processed independently by structural name, semantic name, or any other properties if applied.
  7. They can express strongly typed genre-specific content such as travel, recipe and other structures.

 Blocks can be detail intensive, contain considerable content, and be valuable content items by themselves.
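The sixth characteristic above, processing blocks by structural or semantic name, can be illustrated with a short Python sketch. It is not part of the FX tooling; the sample galley, its id values and the find_blocks helper are assumptions made only for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample galley; the ids are illustrative only.
SAMPLE = """
<galley>
  <p>Body text</p>
  <div class="block-rw extract-rw" id="b1"><p>Quoted text</p></div>
  <div class="block-rw notebox-rw" id="b2">
    <div class="block-rw code-rw" id="b3"><pre>x = 1</pre></div>
  </div>
</galley>
"""

def find_blocks(root, semantic_name=None):
    """Return <div> elements carrying block-rw, optionally filtered by semantic name."""
    found = []
    for div in root.iter("div"):  # document order, nested blocks included
        classes = div.get("class", "").split()
        if "block-rw" not in classes:
            continue
        if semantic_name is None or semantic_name in classes:
            found.append(div)
    return found

root = ET.fromstring(SAMPLE)
all_ids = [d.get("id") for d in find_blocks(root)]
extract_ids = [d.get("id") for d in find_blocks(root, "extract-rw")]
print(all_ids)
print(extract_ids)
```

Because every Common Block has its own ID, the harvested elements can be handed to later processing steps without losing track of their source position.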

The Root Structure

All Common Blocks must contain the block-rw identifier as the root attribute:

<div class="block-rw"> .... </div>

Text Blocks in Detail

Block General

Block General is provided as the general case presentation block where content looks like an extract, but the semantic is indeterminate. Applying Block General isolates the content for extraction and post processing and, if required, deferred detail tagging.

This can also be used as a presentation device in fiction, for example to isolate a fictional letter from the narrative.

The main effects of a general block are to isolate the block content for additional processing if required, and to present it as an identifiable block within the primary content flow.

It is preferable to use Block General than to make a wrong assignment with a named block structure.

block-rw general-rw Standard Tagging Patterns
***** Simple *****
<div class="block-rw general-rw">
  <p>Any text or other content</p>
</div>
***** Extended *****
<div class="block-rw general-rw custom-abc">
  <h4>Block Header</h4>
  <p>Any text or other content</p>
  <p>Any text or other content</p>
</div>

block-rw   The primary structure type definer. This should always be positioned first in the selector sequence for human readability and must always be present.

general-rw   The named structure definer. This should always be positioned second in the selector sequence for human readability and must always be present.

custom-abc   A custom selector added for a template or document instance. Custom values must always be placed after the named structure definer. This is optional. Note that the abc is a custom indicator to identify the value for a specific template or document.

<h4>   When headers are used they must start at heading level 4. Levels 5 and 6 can be used.

<p>   Paragraphs will inherit the bodytext behaviour and can be styled differently within a block
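The selector sequence rules above (block-rw first, the named structure second, custom values after) lend themselves to a simple automated check. The following Python sketch is an assumption-laden illustration, not an FX tool: the check_block_classes function is hypothetical, and the NAMED set simply collects the named structure values that appear on this page.

```python
# Named structure values used on this page (assumed to be the complete set
# only for the purpose of this sketch).
NAMED = {
    "general-rw", "abstract-rw", "boxed-rw", "code-rw", "codelist-rw",
    "epigraph-rw", "extract-rw", "letter-rw", "notebox-rw", "sidebar-rw",
}

def check_block_classes(class_attr, named_structures=NAMED):
    """Return a list of problems found in a block's class attribute value."""
    classes = class_attr.split()
    problems = []
    if not classes or classes[0] != "block-rw":
        problems.append("block-rw must be the first selector")
    if len(classes) < 2 or classes[1] not in named_structures:
        problems.append("a named structure must be the second selector")
    return problems

print(check_block_classes("block-rw general-rw custom-abc"))
print(check_block_classes("general-rw block-rw"))
```

Anything after the second position is treated as a custom selector and is not checked further.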

Abstract

Abstracts are generally used in academic articles, but can be used in other locations as well when defined as editorially appropriate. Abstracts are important as they are a summary of (generally) an academic article, although FX does not characterize them this way unless they are in an article section.

block-rw abstract-rw Standard Tagging Patterns
***** Simple *****
<div class="block-rw abstract-rw">
  <h4>Abstract</h4>
  <p>Any text or other content</p>
  <p class="keywords-rw"> Keyword list </p>
</div>

block-rw   The primary structure type definer. This should always be positioned first in the selector sequence for human readability and must always be present.

abstract-rw   The named structure definer. This should always be positioned second in the selector sequence for human readability and must always be present.

<h4>   Generally the abstract header is required to be identifiable in generated TOCs so should appear at the correct required sub-section level.

<p>   Paragraphs will inherit the bodytext behaviour and can be styled differently within a block. In an academic article it is normally set with indented margins and a different font size, but can also appear as the first section in the bodytext flow.

keywords-rw   Abstracts often contain a keyword list. This is optional.

Boxed Text

 Boxed text and un-numbered noteboxes are inserted into the galley flow, are the same width as the galley or column text, and do not float. The two styles are synonymous with only presentation variations being different. Boxed text has a simple border line rule. Noteboxes have a coloured or shaded background.

If titles and headers are required in either Boxed Text or Noteboxes, they should start at <h4>, and will not be part of number generation in the standard processing. Both Boxed Text and Noteboxes can contain captions and caption numbering.

block-rw boxed-rw Standard Tagging Patterns
***** Simple *****
<div class="block-rw boxed-rw">
  <h4>Boxed Text Header</h4>
  <p>Any text or other content</p>
  <p class="keywords-rw"> Keyword list </p>
</div>

block-rw   The primary structure type definer. This should always be positioned first in the selector sequence for human readability and must always be present.

boxed-rw   The named structure definer. This should always be positioned second in the selector sequence for human readability and must always be present.

custom-abc   A custom selector added for a template or document instance. Custom values must always be placed after the named structure definer. This is optional. Note that the abc is a custom indicator to identify the value for a specific template or document.

<h4>   When headers are used they must start at heading level 4. Levels 5 and 6 can be used.

<p>   Paragraphs will inherit the bodytext behaviour and can be styled differently within a block

Code Blocks

Computer code blocks use the <pre>  element to preserve white space.

Code blocks may be numbered or not. If they are, they will probably have chapter and sequence numbers. For reflow it is important that these are tagged independently.

Books with intensive code may use a number of different presentation devices for different types of code. For example, code examples may be in a framed and shaded box structure, while fragments are shown on the page background. Different programming languages in the same document may have different presentation styles.

There are nine additional selectors available to give code blocks different styles. This is generally enough for most books (in fact three are generally enough!)

One of the objectives of eBook and Online code is that it can be cut and pasted from the book (if not DRM locked). Therefore the structure should support linebreaks to ensure it moves cleanly to a text editor for the best end-user experience.

 Note that the entire code must be in a single <pre> element. Multiple pre elements can cause presentation and manipulation problems.
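Where source material has split one listing across several <pre> elements, they can be merged before tagging. The Python sketch below shows one possible approach under stated assumptions: merge_pre is a hypothetical helper, it only handles <pre> elements that are direct children of the block and contain plain text, and it joins the pieces with a single line break.

```python
import xml.etree.ElementTree as ET

def merge_pre(block):
    """Merge all direct <pre> children of a code block into the first one."""
    pres = block.findall("pre")
    if len(pres) < 2:
        return block  # nothing to merge
    # Strip the line breaks at the edges of each piece, then rejoin with one.
    parts = [(p.text or "").strip("\n") for p in pres]
    pres[0].text = "\n".join(parts)
    for extra in pres[1:]:
        block.remove(extra)
    return block

block = ET.fromstring(
    '<div class="block-rw code-rw"><pre>a = 1\n</pre><pre>b = 2</pre></div>'
)
merge_pre(block)
print(ET.tostring(block, encoding="unicode"))
```

Inline markup inside a <pre> and any text between the <pre> elements are ignored by this sketch and would need separate handling.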

block-rw code-rw / codelist-rw Standard Tagging Patterns
<div class="block-rw code-rw">
    <pre>
        line of code
           line of code
              line of code
    </pre>
</div>
<div class="block-rw codelist-rw">
    <h4>Code block title</h4>
    <pre>
     line of code
        line of code
           line of code
    </pre>
    <p class="caption-rw"><span class="num-rw">1</span>. Caption text</p>
</div>

block-rw   The primary structure type definer. This should always be positioned first in the selector sequence for human readability and must always be present.

code-rw   The named structure definer. This should always be positioned second in the selector sequence for human readability and must always be present.

codelist-rw   The named structure definer. This should always be positioned second in the selector sequence for human readability and must always be present.

<h4>   When headers are used they must start at heading level 4. Levels 5 and 6 can be used.

<pre>   Code is text in a preformatted statement. This allows arbitrary layout but cannot be styled.

Epigraph

Epigraphs usually appear on their own page at the beginning of a book (sometimes in place of the dedication page) or immediately following the chapter/part title information. Visually they appear to be part of the title block and should be positioned inside a title block when used in this manner.

Depending on the document style, both abstracts and epigraphs can be placed inside a title-block. This has advantages if chapter title-blocks are being harvested, especially for academic articles: all the content is extractable for processing from a single block.

There are two primary types of epigraphs, normal epigraph and poem epigraph. Epigraph text has no extractable or additional value in the context of the text other than the fact that it is a quoted (or anonymous) source. Therefore there is no internal detail tagging beyond presentation styling.

Older books (Dickens is a good example) use a summary block with a chapter title-block. This is not strictly an epigraph; rather its purpose is a teaser, or section content summary. These can be tagged as epigraphs.

block-rw epigraph-rw Standard Tagging Patterns
***** Extended *****
<div class="block-rw epigraph-rw">
    <p>Any text or other content</p>
    <p class="attribution-rw">Who wrote it</p>
    <p class="source-rw">Where it came from</p>
</div>

block-rw   The primary structure type definer. This should always be positioned first in the selector sequence for human readability and must always be present.

epigraph-rw   The named structure definer. This should always be positioned second in the selector sequence for human readability and must always be present.

<h4>   When headers are used they must start at heading level 4. Levels 5 and 6 can be used. In extracts it is better practice to use styled paragraphs.

<p>   Paragraphs will inherit the bodytext behaviour and can be styled differently within a block

Extract

Extract blocks are very common in academic content. An Extract block can contain anything that can be contained in a <div>, but it should of course be an extract from something else. Extracts should not contain formal headers, and must not use headers above <h3>-<h6>, to ensure they will not be reflected in the source document structure. It is strongly advised to use styled header elements where header-like styles are required in a block extract.

 Likewise, drama and poetry extracts should not be detail tagged, but tagged for presentation only when present in an extract. As an extract, the content is not part of the source document and should only be extractable as an extract.

 Attributions and sources are tagged using inline styling as they have no value if disassociated from the main extract text. For specific custom use, a paragraph style can be applied.

block-rw extract-rw Standard Tagging Patterns
***** Block Extract *****
<div class="block-rw extract-rw">
  <p>Any text or other content</p>
  <p class="attribution-rw">This is the attribution</p>
</div>
**********

block-rw   The primary structure type definer. This should always be positioned first in the selector sequence for human readability and must always be present.

extract-rw   The named structure definer. This should always be positioned second in the selector sequence for human readability and must always be present.

custom-abc   A custom selector added for a template or document instance. Custom values must always be placed after the named structure definer. This is optional. Note that the abc is a custom indicator to identify the value for a specific template or document.

<h4>   When headers are used they must start at heading level 4. Levels 5 and 6 can be used.

<p>   Paragraphs will inherit the bodytext behaviour and can be styled differently within a block

Letter

Letters have their own set of named paragraphs to allow the construction of the various parts of a letter: Sender, Salute, Subject and Postscript.

block-rw letter-rw Standard Tagging Patterns
***** Letter Extended *****
<div class="block-rw letter-rw">
  <p class="sender">Sender text</p>
  <p class="salute">Salute text</p>
  <p class="subject">Subject text</p>
  <p>This is the letter body text</p>
  <p class="postscript">Postscript text</p>
</div>

block-rw   The primary structure type definer. This should always be positioned first in the selector sequence for human readability and must always be present.

letter-rw   The named structure definer. This should always be positioned second in the selector sequence for human readability and must always be present.

sender-abc   Sender information. This can be multi-line and can use one paragraph with line-break elements or multiple paragraphs for each line.

salute-rw   The salute is normally one short left-aligned line. Eg: Dear John, 

subject-rw   The subject is normally one short center-aligned line. Eg: Your Pay Rise

p   unstyled p elements are used for the letter bodytext.

postscript   Paragraphs will inherit the bodytext behaviour and can be styled differently within a block

Notebox un-numbered

 Boxed text and un-numbered noteboxes are inserted into the galley flow, are the same width as the galley or column text, and do not float. The two styles are synonymous with only presentation variations being different. Boxed text has a simple border line rule. Noteboxes have a coloured or shaded background.

If titles and headers are required in either Boxed Text or Noteboxes, they should start at <h4>, and will not be part of number generation in the standard processing. Both Boxed Text and Noteboxes can contain captions and caption numbering.

block-rw notebox-rw Standard Tagging Patterns
***** Notebox Extended *****
<div class="block-rw notebox-rw">
  <p>Any text or other content</p>
  <p style="text-align: right; font-style: italic;">This is the attribution</p>
</div>
**********

block-rw   The primary structure type definer. This should always be positioned first in the selector sequence for human readability and must always be present.

notebox-rw   The named structure definer. This should always be positioned second in the selector sequence for human readability and must always be present.

custom-abc   A custom selector added for a template or document instance. Custom values must always be placed after the named structure definer. This is optional. Note that the abc is a custom indicator to identify the value for a specific template or document.

<h4>   When headers are used they must start at heading level 4. Levels 5 and 6 can be used.

<p>   Paragraphs will inherit the bodytext behaviour and can be styled differently within a block

Sidebar

Sidebar is a special style of notebox that is floated left or right in relation to its text. Usually it is only used on a print page where there is sufficient margin space created for the sidebar.

Sidebars can be problematic in both reflow and eBooks in that their positioning is usually mechanical.

block-rw sidebar-rw Standard Tagging Patterns
<div class="block-rw sidebar-rw">
  <p>Any general text</p>
</div>

block-rw   The primary structure type definer. This should always be positioned first in the selector sequence for human readability and must always be present.

sidebar-rw   The named structure definer. This should always be positioned second in the selector sequence for human readability and must always be present.

custom-abc   A custom selector added for a template or document instance. Custom values must always be placed after the named structure definer. This is optional. Note that the abc is a custom indicator to identify the value for a specific template or document.

<h4>   When headers are used they must start at heading level 4. Levels 5 and 6 can be used.

<p>   Paragraphs will inherit the bodytext behaviour and can be styled differently within a block

Floating Noteboxes

Noteboxes are particularly important in document genres such as textbooks. A notebox is a stand-alone block that may or may not be numbered and referenced to from the text with an explicit number reference or a positional reference.

Noteboxes can be styled with rules, background colours, special header presentation and structures independent from the surrounding text. They often contain icons, images and other special content blocks such as code, poetry, maths and other structures.

There can also be special more rare constructions, such as box-notes with the references and the notes in the same container. These have to be easily achieved at least at the template or document level.

In print they may be styled as multi-column and may change their position in the document flow based on flow rules. These presentation features are also available in some controlled Online and reader viewing environments.

Headers in noteboxes always start at <h4> so if the CSS collapses the headers inherit a reasonable level of emphasis from the default presentation, and they do not impinge on the high-level galley headers for TOC generation.
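The header rule can be checked mechanically during production. The following Python sketch is only an illustration of the rule as stated here; header_problems is a hypothetical function name, and the check also descends into any nested blocks.

```python
import xml.etree.ElementTree as ET

HEADERS = ("h1", "h2", "h3", "h4", "h5", "h6")

def header_problems(block):
    """Report headers above level 4, and a first header that is not <h4>."""
    tags = [el.tag for el in block.iter() if el.tag in HEADERS]
    problems = [f"<{t}> is above level 4" for t in tags if t in ("h1", "h2", "h3")]
    if tags and tags[0] != "h4":
        problems.append("first header should be <h4>")
    return problems

good = ET.fromstring(
    '<div class="block-rw notebox-rw"><h4>A</h4><p>x</p><h5>B</h5></div>'
)
bad = ET.fromstring('<div class="block-rw notebox-rw"><h3>A</h3><p>x</p></div>')
print(header_problems(good))
print(header_problems(bad))
```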

Where complex material such as textbooks and magazines have different or multiple textbox styles, the stylesheet and tagging should be extended with a template- or project-specific custom selector value. Additionally or alternatively, IDs can be used where a stylesheet must target an individual box in a document. In both cases other rendering environments can use or ignore the project selectors depending on their capabilities.

General Code Example

**********
<galley-rw>
  <p> ... </p>
  <div class="block-rw notebox-rw" id="a123">
    <h4>Main header</h4>
      <p>... </p>
      <h5>Text Header in box</h5>
      <p>... </p>
      <h6>Text Header in box</h6>
      <p>... </p>
      <div class="media-rw figure-rw"> .... </div>
  </div>
  <p> ... </p>
</galley-rw>
**********

IGP:FLIP

Noteboxes can be floated for print output

Noteboxes can be moved to an insertion point in print

Numbered Notes and Note References

Notes - Overview

Notes are one of the more difficult structures to handle when it comes to creating something that both a human and a processor can use, and in a front-list and retrodigitization environment. Notes are one of the few structures where the retrodigitization production strategy is easier than the front-list generation strategy once note item positioning strategies are defined.

A note is an authorial or editorial reference or explanation of document content that is referred to from the flow by a reference or location structure. There are multiple types of notes, and the ad-hoc treatment of notes by authors and editors is a major issue for both front-list and retrodigitization production activities.

 One of the more disappointing issues with ePublishing is notes treatment in eBooks and Online content. Publisher technical and editorial teams tend not to move away from paper metaphors. Footnotes and endnotes are mechanisms designed for paper environments and are expressions of the limitations of paper. Reader technology developers tend to have the same limited thinking and replicate paper limitations in their e-content presentation environments.

Note Structures

The main note structural forms found in documents, especially books, are:

  1. End notes: Volume, Book, Document, Part, Chapter, Topic, Section
  2. Footnotes: By definition a print structure only
  3. Margin notes: Available if margins are present

There are other “special note styles” that can be encountered from time to time which add to the complexity of handling notes:

  1. Line notes in structures such as poetry and drama. The notes refer to a specific line of text in a line numbered
  2. Verse notes
  3. Manuscript reference notes

In addition there are other notes styles which are not formal publisher structures but exist anyway and may or may not have to be captured in some form:

  1. Reviewer comments (for business style documents)
  2. Hand-written margin and other informal notes

The Notes Elements

The Foundation XHTML vocabulary for general notes structures is references, pointers and items.

Note Pointer. A note pointer is a number or symbol in text that matches a note reference. A Note Pointer must contain a reference to the correct Note Reference ID.

Note Reference. A note reference is the fixed number or symbol that always precedes the note text. It always remains with the note and gives it a unique identification. A note reference must have a document unique ID generated.

Note Item. A note item is the note content. It always has a Note Reference to allow it to be identified in the reading context.

 Notes of all types are referred to by a reference which may be a number, a symbol or positional reference.

Note references may be any of:

  1. sequential note numbers or symbols
  2. stanza or verse (religious text) numbers,
  3. line numbers in poems, acts and other content structures,
  4. catalog or manuscript reference
  5. Compound and complex terms.

Production systems must identify the correct number type and not rely on sequential processing.

A Note Reference identifies a Note Item as unique in a document or section, and allows a Note Pointer to be associated with a specific Note Reference. A Note Reference can occur only once in a document, but can have multiple Note Pointers resolving to it.
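The uniqueness and resolution rules above can be expressed as a small integrity check. The Python sketch below works on plain lists of IDs rather than on markup, because the pointer markup is not defined in this section; check_notes and the sample IDs are assumptions made for illustration.

```python
from collections import Counter

def check_notes(reference_ids, pointer_targets):
    """Return (duplicate reference IDs, pointer targets with no matching reference)."""
    counts = Counter(reference_ids)
    duplicates = sorted(i for i, n in counts.items() if n > 1)
    dangling = sorted(set(pointer_targets) - set(reference_ids))
    return duplicates, dangling

# Two pointers resolving to the same reference is allowed.
print(check_notes(["n1", "n2"], ["n1", "n1", "n2"]))
# A repeated reference ID and an unresolved pointer are both reported.
print(check_notes(["n1", "n1"], ["n3"]))
```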

Note Positioning

Note: the Infogrid Pacific tools do not support book end-notes. Book end-notes are counter-indicated in multi-format content strategies, and editorial teams should not use accumulated document end-notes as the final presentation strategy, as this causes excessive, unwarranted inter-section linking in e-book formats.

Depending on the format, notes of all types can be presented in traditional print content positions, except that footnotes behave like section end-notes.

There are many format processing options required depending on the content reuse.

Note Harvesting and Presentation

Notes must be able to be harvested, repositioned and reflowed as required from Foundation XHTML. For example, if a chapter is extracted as a content object and the notes text is in book end-notes, the notes must move with the source document to ensure content coherence. This strategy should apply to all referencing structures.

To achieve this easily, and with the minimum of processing in retrodigitization, document end-notes are copied to their end-section location in text, effectively becoming "chapter end-notes", as well as being left in their source position in the document. The chapter end-notes are linked, the book end-notes are left in position with presentation attributes only (they do not link).

Notes need to be regarded as part of the original text. Therefore if a paragraph is extracted for some purpose, either the note reference should be processed out, or the note should accompany the paragraph. This is easier to accomplish if the note item is in the same structure as the note reference.

Note reverse linking

Due to the limitations of some e-reader devices which do not have "back" navigation buttons, it has become necessary for files to be processed with reverse links. That means, the Pointer links to the Reference, and the Reference links to the Pointer.

This is possibly a passing requirement for ePub readers as they become more sophisticated. It has the negative effect of unnecessarily doubling the number of links in a particular text, and where systems do have maintained navigation lists, it creates link-loops.

Meanwhile reverse note linking must be a processing option supported by the general XHTML. 
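As a rough illustration of reverse linking as a processing option, the Python sketch below builds the forward and backward link targets from a list of pointer/reference pairs. The build_links function, the ID values and the choice to link a reference back to its first pointer only are all assumptions of this sketch; the source does not specify how a reference with several pointers should link back.

```python
def build_links(pointers, reverse=True):
    """pointers: list of (pointer_id, reference_id) pairs in document order.

    Returns (forward, backward) dictionaries mapping an ID to its link target.
    """
    forward = {pid: "#" + rid for pid, rid in pointers}
    backward = {}
    if reverse:
        for pid, rid in pointers:
            # Assumption: a reference links back to its first pointer only.
            backward.setdefault(rid, "#" + pid)
    return forward, backward

pairs = [("p1", "n1"), ("p2", "n1"), ("p3", "n2")]
print(build_links(pairs))
print(build_links(pairs, reverse=False))
```

Switching reverse off leaves the forward links untouched, which matches the idea that reverse linking is an optional processing step.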

Note Item Containers

Note items are treated as a named list and are always contained in a named list block. This enables them to be moved and manipulated as and when required. Paragraphs or divs inside a notes list are assumed to be independent notes for processing.

**********
<div class="notes-rw">
  ... List of note items ....
</div>
<div class="notes-footnotes">
  ... List of note items ....
</div>
**********

Note Item Structure

Notes are generally single paragraph structures, but can be multi-paragraph and complex items and contain all primary content structures. A <p>...</p> inside a note list container is a note. A first level <div>...</div> inside a note list container is a note.

There are two treatments for a note.

Simple Note

A Note item may be a single paragraph.

1. This is a note item with a reference number preceding it. The reference number correlates with the pointer number in the text.

Complex Note

A Note item can be an XHTML <div> element containing any other content structures.

2. This is a note item with a list and an extra paragraph and a bibliographic statement.

There are three considerations the reader should take into account when reading this:

  1. Consideration One. Important.
  2. Consideration Two. Less important.
  3. Consideration Three. Trivial

When a note is a single paragraph, the first element inside the paragraph must be the note number without punctuation inside the number:

<p><span class="num-note-rw">23</span>. A simple note</p>

When a note is a div, it must contain a paragraph as the first element, and the first element inside the paragraph must be a reference number or character. This structure is for harvesting rather than presentation, although the CSS inheritance model can be useful for advanced presentation requirements.

<div class="notes-book-end">
  <div class="list-notes">
<!-- a simple note -->
    <p class="note" id="xxx"><span class="num-note-ref">1</span>. A simple note</p>
<!-- a complex note containing an extract block -->
    <div class="note" id="xxx">
       <p><span class="num-note-ref">2</span>. A complex note</p>
          <div class="block-extract">
             <p>Block extract text</p>
          </div>
    </div>
  </div>
</div>
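The harvesting rule described above (a <p> or first-level <div> inside the list container is one note, and its first paragraph starts with the reference number) can be sketched in Python as follows. The harvest_notes function and the n1/n2 id values are hypothetical and stand in for the placeholder ids in the pattern.

```python
import xml.etree.ElementTree as ET

SAMPLE = """
<div class="list-notes">
  <p class="note" id="n1"><span class="num-note-ref">1</span>. A simple note</p>
  <div class="note" id="n2">
    <p><span class="num-note-ref">2</span>. A complex note</p>
    <div class="block-extract"><p>Block extract text</p></div>
  </div>
</div>
"""

def harvest_notes(container):
    """Return (id, reference number) for each note in a note list container."""
    notes = []
    for child in container:  # first-level children only
        if child.tag == "p":
            first_para = child
        elif child.tag == "div":
            first_para = child.find("p")
        else:
            continue
        number = None
        if first_para is not None and len(first_para) > 0:
            first = first_para[0]
            if first.tag == "span":
                number = first.text
        notes.append((child.get("id"), number))
    return notes

notes = harvest_notes(ET.fromstring(SAMPLE))
print(notes)
```

A note whose number comes back as None does not follow the required structure and can be flagged for manual correction.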

Frontlist notes

Frontlist production notes are different from retrodigitization notes in that within the FX the notes are not flowed into their item containers. They exist only at their reference point on the page.

This has the significant advantage in enabling notes to move with their text in a variable content environment.

Therefore a Writer document will always have notes inserted at the reference point OR an insertion link to the note which can be anywhere in an associated structure.

The presentation behaviour and position of the note will be defined for the Online and Print versions depending on specific styling issues. For example, the Online version can be a roll-over or part of an annotations structure, while the Print version gathers notes as chapter end notes.

<p>I assert, certainly, that, when thus treated as formal logic, our faculties contradict our a posteriori knowledge.</p>
<p class="note-item">This is the note item text that will be
     moved somewhere and become a paragraph and leave a note reference
     generated by a counter in this position </p>
<p>Yet the objects in space and time<span class="insertion-point"></span>,
     in natural theology<span class="insertion-point"></span>,
     are the mere results of the power of our a priori knowledge.
</p>
<div class="note-item">
  <p>Really complex note item is placed following the paragraph.</p>
  <p>If there is more than one, they are placed in their presentation sequence.</p>
</div>
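The gathering step described for frontlist notes, where inline note items are moved out of the flow and replaced by counter-generated references, can be sketched on a simplified data model. In this Python illustration the document is reduced to a list of (kind, text) pairs; gather_notes and the "ref" marker are assumptions of the sketch, not FX structures.

```python
def gather_notes(items):
    """items: list of (kind, text) pairs, kind being 'p' or 'note', in document order.

    Returns (body, notes): the body with each note replaced by a numbered
    reference marker, and the gathered notes as (number, text) pairs.
    """
    body, notes = [], []
    for kind, text in items:
        if kind == "note":
            number = len(notes) + 1  # counter-generated reference number
            notes.append((number, text))
            body.append(("ref", str(number)))
        else:
            body.append((kind, text))
    return body, notes

flow = [("p", "First paragraph"), ("note", "Note text"),
        ("p", "Second paragraph"), ("note", "Complex note text")]
body, notes = gather_notes(flow)
print(body)
print(notes)
```

The gathered list can then be presented as chapter end notes or as roll-overs, depending on the output format.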

© 2005-2012 Infogrid Pacific. All rights reserved.
