Unlike XSLT processors, OmniMark uses a streaming model to process XML/SGML.

A streaming model has advantages, but it also has disadvantages. The XML flows past from top to bottom, so you cannot randomly access the XML tree in memory.

So how can you process a node at the beginning of a document if it depends on a node that has not been streamed yet, or that may never come at all?

Let’s say you have to add an attribute at the beginning of the document if and only if there is an annex at the end of the document.

Or you simply want to add a table of contents at the beginning of the document.

One solution could be to process the XML document twice, but that is not very efficient.

OmniMark solves this problem with referents.

What is a referent?

A referent is a placeholder you can place in the output stream, and fill at a later time.

Of course, the output has to be buffered until the referent is filled. You decide when this happens: at the end of the process, at the end of every document that is processed, or within a certain scope inside the document, e.g. every table.
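To make the buffering idea concrete, here is a minimal sketch in Python (not OmniMark) of an output buffer with named placeholders. The class `ReferentBuffer`, its method names, the input format, and the attribute `has-annex` are all hypothetical; they only model the concept and say nothing about how OmniMark implements it. The sketch handles the annex case described earlier: the attribute at the start is decided only after the end of the input has been seen, in a single pass.

```python
class ReferentBuffer:
    """Toy model of an output stream that supports named placeholders."""

    def __init__(self):
        self._parts = []    # buffered output: literal text or placeholder markers
        self._values = {}   # placeholder name -> current value

    def output(self, text):
        self._parts.append(("text", text))

    def output_referent(self, name):
        self._parts.append(("referent", name))

    def set_referent(self, name, value):
        # A later assignment replaces an earlier one.
        self._values[name] = value

    def resolve(self):
        """Fill every placeholder and return the final output."""
        out = []
        for kind, payload in self._parts:
            out.append(payload if kind == "text" else self._values[payload])
        return "".join(out)


def convert(element_names):
    """Single pass over a list of element names (hypothetical input format)."""
    buf = ReferentBuffer()
    buf.set_referent("annex-attr", "")      # default: no attribute
    buf.output("<doc")
    buf.output_referent("annex-attr")       # value is not known yet
    buf.output(">")
    for name in element_names:
        buf.output(f"<{name}/>")
        if name == "annex":
            # Hypothetical attribute, set only once an annex has streamed by.
            buf.set_referent("annex-attr", ' has-annex="yes"')
    buf.output("</doc>")
    return buf.resolve()


print(convert(["chapter", "annex"]))
print(convert(["chapter"]))
```

The point of the sketch is that nothing is written out until `resolve()` is called, which is the price paid for being able to fill the placeholder late.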

A simple example

OmniMark source code, simple referent example

process
  ;a comment line starts with a ";"
  ;'%n' is the escape character for newline
  output referent 'c' || '%n'
  output referent 'a' || '%n'
  output referent 'b' || '%n'
  output referent 'c'
 
  set referent 'c' to 'C-value'
  set referent 'a' to 'A-value'
  set referent 'b' to 'B-value'
  set referent 'c' to 'the value is C'

the output result
the value is C
A-value
B-value
the value is C

A referent has a unique name, e.g. referent 'the unique name', and the same referent can be placed at several places in the output stream, as referent 'c' is here.

Every occurrence of a referent with the same name gets the same value.

In this example the referents are output first, and their values are set at a "later" time.

The referent 'c' first gets the value 'C-value' and later gets its final value, 'the value is C'. So a referent's value can change; the last value set is the one that appears in the output.

Making a toc using referents

OmniMark source code, adding a toc before the doc

global stream toToc

process
  open toToc as referent 'myToc'
  do xml-parse
    scan '<doc><title>Title of doc</title>%n' ||
          '<div><title>first division title</title>' ||
          '<p>A paragraph in div1</p>' ||
          '</div>%n' ||
          '<div><title>second division title</title>' ||
          '<p>A paragraph in div2</p>' ||
          '</div>%n' ||
         '</doc>'
    output '%sc'
  done
  close toToc
 
element #implied
  output '<%q>%c</%q>'
 
element 'doc'
  output '<toc>%n'
  output referent 'myToc'
  output '</toc>%n'
  output '<%q>%n'  
  output '%c'
  output '</%q>'
 
element title
  put toToc '%t'
  put #current-output & toToc '<%q>%c</%q>%sn'

the output result

<toc>
  <title>Title of doc</title>
  <title>first division title</title>
  <title>second division title</title>
</toc>
<doc>
<title>Title of doc</title>
<div><title>first division title</title>
<p>A paragraph in div1</p></div>
<div><title>second division title</title>
<p>A paragraph in div2</p></div></doc>

We open the stream toToc and attach it to the referent 'myToc'. So whatever we output to the stream toToc flows into the referent 'myToc'.

In the last line of the title rule we send the titles to two streams, #current-output and toToc, at the same time. #current-output is a predefined OmniMark stream; its name speaks for itself.
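The same flow can be modelled outside OmniMark. The following Python sketch is only an analogy, under the assumption that the input arrives as a flat list of (tag, text) pairs; the function name `render_with_toc` and the `TOC` marker are hypothetical. An in-memory text stream plays the role of toToc, and a marker object in the buffered output plays the role of the referent 'myToc'.

```python
import io

TOC = object()  # marker standing in for the placeholder in the buffered output


def render_with_toc(elements):
    """elements: list of (tag, text) pairs in document order (hypothetical input)."""
    body = ["<toc>\n", TOC, "</toc>\n"]   # the placeholder is output first
    to_toc = io.StringIO()                # plays the role of the stream toToc

    for tag, text in elements:
        markup = f"<{tag}>{text}</{tag}>\n"
        body.append(markup)               # normal output
        if tag == "title":
            to_toc.write("\t" + markup)   # the same title also flows into the TOC

    # "Closing" the stream: its content becomes the value of the placeholder.
    toc_text = to_toc.getvalue()
    return "".join(toc_text if part is TOC else part for part in body)


print(render_with_toc([
    ("title", "first division title"),
    ("p", "A paragraph in div1"),
    ("title", "second division title"),
    ("p", "A paragraph in div2"),
]))
```

As in the OmniMark version, the document is read only once; the table of contents is collected on the side while the body is being written.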

Determine the scope of the referents

Suppose we have an XHTML document with tables and we have to add <col> elements: as many <col> elements as there are <td> cells in a row. Because the <col> elements come before the <td> elements in the output, we can use referents to solve this problem.

OmniMark source code, scope of a referent

global integer numberOfColumns initial {0}

process
  do xml-parse
    scan '<html>' ||
         '<table>' ||
         '<tr><td>r1k1</td><td>r1k2</td><td>r1k3</td></tr>' ||
         '<tr><td>r2k1</td><td>r2k2</td><td>r2k3</td></tr>' ||
         '</table>' ||
         '<table>' ||
         '<tr><td>r1k1</td><td>r1k2</td></tr>' ||
         '<tr><td>r2k1</td><td>r2k2</td></tr>' ||
         '</table>' ||
         '</html>'
    output '%sc'
  done

element #implied
  output '<%q>%c</%q>'
 
element table
  using nested-referents
    do
      output '<%q>%n'
      output referent 'number of columns'
      output '%n%c</%q>%sn'
    done

element tr
  set numberOfColumns to 0
  output '<%q>%c</%q>%n'
  set referent 'number of columns' to
         '<col/>' ||* numberOfColumns
 
element td
  increment numberOfColumns
  output '<%q>%c</%q>'

the output result: the first table gets 3 <col> elements, the second gets 2

<html><table>
<col/><col/><col/>
<tr><td>r1k1</td><td>r1k2</td><td>r1k3</td></tr>
<tr><td>r2k1</td><td>r2k2</td><td>r2k3</td></tr>
</table>
<table>
<col/><col/>
<tr><td>r1k1</td><td>r1k2</td></tr>
<tr><td>r2k1</td><td>r2k2</td></tr>
</table></html>

The scope of the referent 'number of columns' is the <table>, so the referent is resolved when the end of a table is reached. We need only one referent, although we have more than one table.

The scope is set with the construct using nested-referents. Without this construct, all tables would share the same referent and so get the same number of <col> elements: in this case 2, because the last table has 2 columns.
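The effect of scoping can be illustrated with a small Python model. This is a sketch of the idea only, not of OmniMark's mechanics: the function `render_tables`, the `COLS` marker, and the nested-list input format are assumptions made for illustration. Each table gets its own buffer, and the placeholder is filled when that table ends, before the next table starts.

```python
COLS = object()  # marker standing in for the placeholder 'number of columns'


def render_tables(tables):
    """tables: list of tables; each table is a list of rows of cell texts."""
    out = []
    for rows in tables:
        # New scope: a fresh buffer and a fresh placeholder value per table.
        scope = ["<table>\n", COLS, "\n"]
        cols = ""
        for row in rows:
            cells = "".join(f"<td>{cell}</td>" for cell in row)
            scope.append(f"<tr>{cells}</tr>\n")
            cols = "<col/>" * len(row)   # set after every row; the last row wins
        scope.append("</table>\n")
        # End of scope: fill the placeholder now, so later tables cannot change it.
        out.append("".join(cols if part is COLS else part for part in scope))
    return "".join(out)


print(render_tables([
    [["r1k1", "r1k2", "r1k3"], ["r2k1", "r2k2", "r2k3"]],
    [["r1k1", "r1k2"], ["r2k1", "r2k2"]],
]))
```

If the placeholder were instead filled once at the very end, with a single shared value, both tables would receive the column count of the last table, which is the behaviour described above for the unscoped case.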
