<?xml-stylesheet type="text/xsl" href="omstd.xsl"?>
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.2CR1//EN"
                    "docbook/docbookx.dtd"
[
<!-- 
 docbook customisations:
    add MathML
    allow sidebar in figures (used for change log)
    add author attribute to sidebar
    add xml:space (to correct for IE bug, dropping spaces)
-->
<!ENTITY % inlineobj.char.class	"math">
<!ENTITY % local.figure.mix "|sidebar">
<!ATTLIST sidebar author CDATA #IMPLIED>
<!ATTLIST book xml:space (default|preserve) #IMPLIED >
<!--
  MathML DTD (somewhat simplified)
-->
<!ELEMENT math (mrow|mn|mi|mo|msub|msup|mtext|mspace)+>
<!--
  IE Bug doesn't allow this, so switch to mml namespace via stylesheet
  xmlns CDATA #FIXED "http://www.w3.org/1998/Math/MathML" 
-->
<!ATTLIST math
  display CDATA #IMPLIED
>
<!ELEMENT mrow (mrow|mn|mi|mo|msub|msup|mtext|mspace|varname|systemitem)+>
<!ELEMENT mn (#PCDATA)>
<!ELEMENT mi (#PCDATA)>
<!ATTLIST mi
  mathvariant (bold) #IMPLIED>
<!ELEMENT mo (#PCDATA)>
<!ELEMENT msup ((mn|mi|mrow),(mn|mi|mrow))>
<!ELEMENT msub ((mn|mi|mrow),(mn|mi|mrow))>
<!ELEMENT mtext (#PCDATA)>
<!ELEMENT mspace EMPTY>
<!ATTLIST mspace width CDATA #IMPLIED>

<!--
  Abbreviations used in this document
-->
<!ENTITY OM "<emphasis>OpenMath</emphasis>">
<!ENTITY exml "<acronym>xml</acronym>">


<!ENTITY digits "0-9">
<!ENTITY exadigits "0-9A-F">
<!ENTITY lcalpha "a-z">
<!ENTITY ucalpha "A-Z">
<!ENTITY sign "[+-]"><!-- dpc: Correct regxp for + or - -->
<!ENTITY zsp "">
<!ENTITY longrightarrow "<mo>&#8594;</mo>"><!--dpc: short, actually -->

<!ENTITY varnamechar "+=(),-./:?!#$&#37;*;@[]^_`{|}"><!-- dpc: remove `TeX error?  -->

]>
<book xml:space="preserve">
<title>The &OM; Standard</title>
<bookinfo>
<releaseinfo>1.01</releaseinfo>
<author><firstname>The &OM; Esprit Consortium</firstname></author>


<editor><firstname>O.</firstname><surname>Caprotti</surname></editor>
<editor><firstname>D.</firstname><othername>P.</othername><surname>Carlisle</surname></editor>
<editor><firstname>A.</firstname><othername>M.</othername><surname>Cohen</surname></editor>
<date>April 2002</date>

<copyright>
<year>2000&#8211;2002</year>
<holder>The OpenMath Society</holder>
</copyright>

<abstract>
<para>This document proposes &OM; as a standard for the communication of
  semantically rich mathematical objects. This draft of the &OM; 
  standard comprises the following: a description of &OM; objects, the
  grammar of &exml; and of the binary encoding of objects, a
  description of Content Dictionaries and an &exml; document type
  definition for validating Content Dictionaries. The non-normative
  <xref linkend="cha_his"/> of this document briefly overviews the history
  of &OM;.</para>
</abstract>
</bookinfo>
  
<toc/>
<lot><title>List of Figures</title></lot>

<chapter id="cha_his">
<title>&OM; Movement</title>
<sidebar revision="1999/08/24" author="OC"><para>Changed title</para></sidebar>

<para>This chapter is a historical account of &OM; and should be regarded
as non-normative.</para>

<para>&OM; is a standard for representing mathematical objects, allowing
them to be exchanged between computer programs, stored in databases,
or published on the World Wide Web.  While the original designers were
mainly developers of computer algebra systems, it is now attracting
interest from other areas of scientific computation and from many
publishers of electronic documents with a significant mathematical
content.  There is a strong relationship to the MathML recommendation
<citation>MathML_98</citation> from the World Wide Web Consortium, and a large
overlap between the two developer communities.  MathML deals
principally with the <emphasis>presentation</emphasis> of mathematical objects, while
&OM; is solely concerned with their semantic meaning or 
  <emphasis>content</emphasis>.  While MathML does have some limited facilities for
dealing with content, it also allows semantic information encoded in
&OM; to be embedded inside a MathML structure.  Thus the two
technologies may be seen as highly complementary.</para>

<section id="sec_hist">
<title>History</title>

<sidebar revision="1999/07/16" author="DPC">
<para>Reword to reflect birth of OM Society</para>
</sidebar>
<para>&OM; was originally developed through a series of workshops held in
Zurich (1993 and 1996), Oxford (1994), Amsterdam (1995), Copenhagen
(1995), Bath (1996), Dublin (1996), Nice (1997), Yorktown Heights
(1997), Berlin (1998), and Tallahassee (1998).  The participants in
these workshops formed a global &OM; community which was coordinated by a
Steering Committee and operated through electronic mailing groups and
ad-hoc working parties.  This loose arrangement has been formalised
through the establishment of an &OM; Society.  Up until the end of
1996 much of the work of the community was funded through a grant from
the Human Capital and Mobility program of the European Union and through
the contributions of several institutions and individuals.  A document
outlining the objectives and basic design of &OM; was produced (later
published as <citation>Abbott_Leeuwen_Strotmann_98</citation>).  By the end of 1996
a simplified specification had been agreed on and some prototype
implementations had been developed <citation>Dalmas_Gaetano_Watt_97</citation>.</para>

<sidebar revision="1999/07/16" author="DPC">
<para>Extend History slightly</para>
</sidebar>
<para>In 1996 a group of European participants in &OM; decided to bid
for funding under the European Union's Fourth Framework Programme for
strategic research in information technology.  This bid was successful
and the project started in late 1997.  The principal aims of the
project are to formalise &OM; as a standard and to develop it
further through industrial applications; this document is a product of
that process and draws heavily on the previous work described earlier.
&OM; participants from all over the world continue to meet
regularly and cooperate on areas of mutual interest, and
recent workshops in Tallahassee (November 1998) and Eindhoven (June
1999)  endorsed  drafts of this document as the current &OM; standard.</para>



<sidebar revision="1999/07/16" author="DPC"><para>Final conclusion paragraph removed</para></sidebar>
</section>

<section id="sec_omsoc">
<title>&OM; Society</title>

<sidebar revision="1999/08/24" author="OC">
<para>New section</para>
</sidebar>
<para>In November 1998 the &OM; Society was established to coordinate
all &OM; activities. The society is based in Helsinki, Finland and is
steered by the executive committee whose members are elected by the
society. The official web page of the society is
<ulink url="http://www.openmath.org">http://www.openmath.org</ulink>.</para>
</section>

</chapter>

<chapter id="cha_int">
<title>Introduction to &OM;</title>




<para>This chapter briefly introduces &OM; concepts and notions that are
referred to in the rest of this document.</para>

<section id="sec_om-arch">
<title>&OM; Architecture</title>


<figure id="fig_om">
    <title>The &OM; Architecture</title>
    <graphic fileref="om-arch" depth="500" width="700"/>
</figure>

<para>The architecture of &OM; is shown in <xref linkend="fig_om"/>, which
summarizes the interactions among the different &OM; components.
There are three layers of representation of a mathematical object
<citation>OM_98</citation>. The first is a private layer, the internal representation used
by an application.  The second is an abstract layer, the representation as an
&OM; object. The third is a communication layer that translates the &OM; 
object representation to a stream of bytes. An application-dependent
program manipulates the mathematical objects using its internal
representation; it can convert them to &OM; objects and communicate
them by using the byte stream representation of &OM; objects.</para>
</section>

<section id="sec_intro-obj">
<title>&OM; Objects and Encodings</title>


<sidebar revision="1999/08/26" author="OC"><para>Moved this section up, to mirror chapter sequence</para></sidebar>


<para>&OM; objects are representations of mathematical entities that can be
communicated among various software applications in a meaningful way,
that is, preserving their <quote>semantics</quote>.</para>

<para>&OM; objects and encodings are described in detail in
<xref linkend="cha_obj"/> and <xref linkend="cha_enco"/>.</para>


<sidebar revision="1999/08/24" author="OC"><para>Note on encodings and possibility of other encodings</para></sidebar>

<para>The standard endorses encodings in XML and binary format. These are
the encodings supported by the official &OM; libraries. However, they
are not the only possible encodings of &OM; objects. Users who wish
to define their own encoding using some other specific language (e.g.
Lisp) may do so provided there is an effective translation of this
encoding to an official one.</para>
</section>

<section id="sec_intro-cd">
<title>Content Dictionaries</title>


<para>Content Dictionaries (CDs) are used to assign informal and formal
semantics to all symbols used in the &OM; objects. They define the
symbols used to represent concepts arising in a particular area of
mathematics.</para>

<para>The Content Dictionaries are public; they represent the actual common
knowledge among &OM; applications.  Content Dictionaries fix the
<quote>meaning</quote> of objects independently of the application.  The
application receiving the object may then recognize whether or not,
according to the semantics of the symbols defined in the Content
Dictionaries, the object can be transformed to the corresponding
internal representation used by the application.</para>
</section>

<section id="sec_addnfiles">
<title>Additional Files</title>



<sidebar revision="1999/06/23" author="OC"><para>This is new</para></sidebar>
<para>Several additional files are related to Content Dictionaries.
Signature files contain the signatures of symbols defined in some &OM; 
Content Dictionary and their format is endorsed by this standard.</para>

<para>Furthermore, the standard fixes how a specific set of Content
Dictionaries is defined as a CDGroup.</para>

<para>Auxiliary files that define presentation and rendering or that are
used for manipulating and processing Content Dictionaries are not
discussed by the standard.</para>

<sidebar revision="1999/10/01" author="OC"><para>Removed mention to DefMP files</para></sidebar>
</section>
<section id="sec_phrasebooks">
<title>Phrasebooks</title>



<para>The conversion of an &OM; object to/from the internal representation
in a software application is performed by an interface program called
<emphasis>Phrasebook</emphasis>. The translation is governed by the Content
Dictionaries and the specifics of the application. It is envisioned
that a software application dealing with a specific area of
mathematics declares which Content Dictionaries it understands. As a
consequence, it is expected that the Phrasebook of the application is
able to translate &OM; objects built using symbols from these Content
Dictionaries to/from the internal mathematical objects of the
application.</para>

<sidebar revision="2000/04/10" author="DPC"><para>Reword</para></sidebar>
<para>&OM; objects do not specify any computational behaviour,
they merely represent mathematical expressions.
Part of the &OM; philosophy is to leave it to the application to
decide what it does with an object once it has received it.  &OM; is
not a query or programming language. Because of this, &OM; does not
prescribe a way of forcing <quote>evaluation</quote> or <quote>simplification</quote> of
objects like <math><mn>2</mn><mo>+</mo><mn>3</mn></math> or <math><mi>sin</mi><mo>(</mo><mi>&#960;</mi><mo>)</mo></math>. Thus, the same object <math><mn>2</mn><mo>+</mo><mn>3</mn></math> could
be transformed to <math><mn>5</mn></math> by a computer algebra system, or displayed as
<math><mn>2</mn><mo>+</mo><mn>3</mn></math> by a typesetting tool.</para>
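<para>As an illustration, the object <math><mn>2</mn><mo>+</mo><mn>3</mn></math> might be
written in the &exml; encoding of <xref linkend="cha_enco"/> roughly as follows.
This is a sketch only: the symbol name <systemitem>plus</systemitem> and the Content
Dictionary name <systemitem>arith1</systemitem> are assumptions made for the example.
The encoded object carries no instruction to evaluate it; whether it is turned into
<math><mn>5</mn></math> or displayed as a sum is left to the receiving application.</para>
<programlisting><![CDATA[
<OMOBJ>
  <OMA>
    <OMS cd="arith1" name="plus"/>  <!-- assumed symbol for addition -->
    <OMI>2</OMI>
    <OMI>3</OMI>
  </OMA>
</OMOBJ>
]]></programlisting>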
</section>
</chapter>

<chapter id="cha_obj">
<title>&OM; Objects</title>


<sidebar revision="1999/08/24" author="OC">
<para>Reshuffled the sections on OM Objects</para>
</sidebar>
<para>In this chapter we provide a self-contained description of &OM; 
objects. We first do so by means of an abstract grammar description
(<xref linkend="sec_omabs"/>) and then at an informal level
(<xref linkend="sec_omin"/>).</para>


<section id="sec_omabs">
<title>Formal Definition of &OM; Objects</title>

<sidebar revision="1999/07/16" author="DPC">
<para>Restructure the definition of OM Objects</para>
</sidebar>
<para>&OM; represents mathematical objects as terms or as labelled trees
that are called &OM; objects or &OM; expressions. The definition of
an abstract &OM; object is then the following.</para>



<section id="sec_basic">
<title>Basic &OM; objects</title>
<para>The Basic &OM; Objects form the leaves of the &OM; Object tree.
A Basic &OM; Object is one of the following.</para>
<sidebar revision="1999/09/10" author="DPC"><para>Expand descriptions of basic objects</para></sidebar>
<itemizedlist>
<listitem><para><phrase>(i)</phrase>  Integer.</para>
  <para>Integers in the mathematical sense, with no predefined range.
  They are <quote>infinite precision</quote> integers (also called <quote>bignums</quote> in
  computer algebra).</para>

</listitem>
<listitem><para><phrase>(ii)</phrase> IEEE  floating point number.</para>
    <para>Double precision floating-point numbers following the
    <acronym>ieee</acronym> 754-1985 standard&#160;<citation>ieee754_85</citation>.</para>

</listitem>
<listitem><para><phrase>(iii)</phrase> Character string.</para>

 <para>A Unicode character string. This also corresponds to <quote>characters</quote> in
  &exml;.</para>

</listitem>
<listitem><para><phrase>(iv)</phrase> Bytearray.</para>

 <para>A sequence of bytes.</para>

</listitem>
<listitem><para><phrase>(v)</phrase>  Symbol.</para>
<para>A Symbol encodes two fields of information, a <emphasis>name</emphasis> and a
<emphasis>Content Dictionary</emphasis>. Each is a sequence of characters matching a
regular expression, as described below.</para>  
</listitem>
<listitem><para><phrase>(vi)</phrase> Variable.</para>


<para>A Variable consists of a <emphasis>name</emphasis> which is a sequence of characters
 matching a regular expression, as described below.</para>

</listitem>
</itemizedlist>
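<para>For orientation, the six kinds of basic object might appear in the
&exml; encoding (<xref linkend="cha_enco"/>) as sketched below. The values,
the variable name and the Content Dictionary name <systemitem>arith1</systemitem> are
illustrative assumptions, and each line is a fragment rather than a complete
&OM; object.</para>
<programlisting><![CDATA[
<OMI>-120</OMI>                  <!-- (i)   integer, no predefined range -->
<OMF dec="1.0e-10"/>             <!-- (ii)  IEEE double, written in decimal -->
<OMSTR>sequence</OMSTR>          <!-- (iii) Unicode character string -->
<OMB>aGVsbG8=</OMB>              <!-- (iv)  bytearray: base64 of the five bytes "hello" -->
<OMS cd="arith1" name="plus"/>   <!-- (v)   symbol: Content Dictionary and name -->
<OMV name="x"/>                  <!-- (vi)  variable: a name only -->
]]></programlisting>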
</section>

<section id="sec_compound">
<title>Compound &OM; Objects</title>
  
<para>&OM; objects are built recursively as follows.
<itemizedlist>
<listitem><para><phrase>(i)</phrase> Basic &OM; objects are &OM; objects.</para>
  
</listitem>
<listitem><para><phrase>(ii)</phrase> If <math><msub><mi>A</mi><mn>1</mn></msub></math>, &#8230;, <math><msub><mi>A</mi><mi>n</mi></msub></math> <math><mo>(</mo><mi>n</mi><mo>&gt;</mo><mn>0</mn><mo>)</mo></math> are &OM; objects, then
  <math display="block">
  <mi mathvariant="bold">application</mi><mo>(</mo><msub><mi>A</mi><mn>1</mn></msub><mo>,</mo> <mi>&#8230;</mi><mo>,</mo> <msub><mi>A</mi><mi>n</mi></msub><mo>)</mo>
  </math>
  is an &OM; <emphasis>application object</emphasis>.</para>
    
<sidebar revision="1999/08/24" author="OC"><para>Cleaned up Attribution</para></sidebar>
</listitem>
<listitem><para><phrase>(iii)</phrase> If
  <math><msub><mi>S</mi><mn>1</mn></msub><mo>,</mo>
  <mi>&#8230;</mi><mo>,</mo> <msub><mi>S</mi><mi>n</mi></msub></math>
  are &OM; symbols, and <math><mi>A</mi></math>,
  <math><msub><mi>A</mi><mn>1</mn></msub></math>,
  &#8230;, <math><msub><mi>A</mi><mi>n</mi></msub></math>, <math><mo>(</mo><mi>n</mi><mo>&gt;</mo><mn>0</mn><mo>)</mo></math> are &OM; objects, then
  <math display="block"><mi mathvariant="bold">attribution</mi>
  <mo>(</mo><mi>A</mi><mo>,</mo> <msub><mi>S</mi><mn>1</mn></msub>
  <mspace width=".3em"/>
  <msub><mi>A</mi><mn>1</mn></msub><mo>,</mo>
 <mspace width=".3em"/> <mi>&#8230;</mi> <mspace width=".3em"/> <mo>,</mo> <msub><mi>S</mi><mi>n</mi></msub> <mspace width=".3em"/>
  <msub><mi>A</mi><mi>n</mi></msub><mo>)</mo></math>
  is an &OM; <emphasis>attribution object</emphasis> and <math><mi>A</mi></math> is the object
  <emphasis>stripped of attributions</emphasis>. The operation of recursively
  applying stripping to the stripped object is called <emphasis>flattening
    of the attribution</emphasis>. When  the stripped object after flattening
  is a  variable, the attributed object is called <emphasis>attributed
    variable</emphasis>.</para>

</listitem>
<listitem><para><phrase>(iv)</phrase> If <math><mi>B</mi></math> and <math><mi>C</mi></math> are &OM; objects, and <math><msub><mi>v</mi><mn>1</mn></msub></math>, <math><mi>&#8230;</mi></math>,
  <math><msub><mi>v</mi><mi>n</mi></msub></math> <math><mo>(</mo><mi>n</mi> <mo>&#8805;</mo> <mn>0</mn><mo>)</mo></math> are &OM; variables or attributed variables, then
  <math display="block">
  <mi mathvariant="bold">binding</mi> <mo>(</mo><mi>B</mi><mo>,</mo> <msub><mi>v</mi><mn>1</mn></msub><mo>,</mo> <mi>&#8230;</mi><mo>,</mo> <msub><mi>v</mi><mi>n</mi></msub><mo>,</mo> <mi>C</mi><mo>)</mo>
  </math>
  is an &OM; <emphasis>binding object</emphasis>.</para>

</listitem>
<listitem><para><phrase>(v)</phrase> If <math><mi>S</mi></math> is an
&OM; symbol and <math><msub><mi>A</mi><mn>1</mn></msub></math>, &#8230;, <math><msub><mi>A</mi><mi>n</mi></msub></math> <math><mo>(</mo><mi>n</mi> <mo>&#8805;</mo> <mn>0</mn><mo>)</mo></math> are &OM; objects, then
  <math display="block"><mi mathvariant="bold">error</mi> <mo>(</mo><mi>S</mi><mo>,</mo> <msub><mi>A</mi><mn>1</mn></msub><mo>,</mo><mi>&#8230;</mi><mo>,</mo><msub><mi>A</mi><mi>n</mi></msub><mo>)</mo>
  </math>
  is an &OM; <emphasis>error object</emphasis>.</para>
</listitem>
</itemizedlist>
</para>
</section>
</section>

<section id="sec_omin">
<title>Further Description of &OM; Objects</title>


<sidebar revision="1999/08/24" author="OC"><para>Condensed Informal and Notes</para></sidebar>
  
<sidebar revision="2000/04/10" author="DPC">
<para>Add integer and float</para>
</sidebar>
<para>Informally, an &OM; <phrase role="sl">object</phrase> can be viewed as a tree and is also
referred to as a term.  The objects at the leaves of &OM; trees are
called <phrase role="sl">basic objects</phrase>.  The basic objects supported by &OM; are:
<variablelist>
<varlistentry><term>Integer</term><listitem><para>Arbitrary Precision integers.</para>
</listitem></varlistentry>
<varlistentry><term>Float</term><listitem>
    <para>&OM; floats are  <acronym>ieee</acronym> 754 Double precision floating-point
    numbers. Other types of floating point number may be encoded
    in &OM; by the use of suitable content dictionaries.</para>
  
</listitem>
</varlistentry>
<varlistentry><term>Character strings</term><listitem><para>are sequences of characters. These characters
  come from the Unicode standard&#160;<citation>UNICODE</citation>.</para>
  
</listitem></varlistentry>
<varlistentry><term>Bytearrays</term><listitem><para>are sequences of bytes. There is no <quote>byte</quote> in &OM; 
  as an object of its own. However, a single byte can of course be
  represented by a bytearray of length 1.  The difference between
  strings and bytearrays is the following: a character string is a
  sequence of bytes with a fixed interpretation (as characters,
  Unicode texts may require several bytes to code one character),
  whereas a bytearray is an uninterpreted sequence of bytes with no
  intrinsic meaning.  Bytearrays could be used inside &OM; errors to
  provide information to, for example, a debugger; they could also
  contain intermediate results of calculations, or <quote>handles</quote> into
  computations or databases.</para>
</listitem>
</varlistentry>
<varlistentry><term>Symbols</term><listitem>
  <sidebar revision="2000/04/10" author="DPC"><para>Change Example</para></sidebar>
  <sidebar revision="1999/09/10" author="DPC"><para>Remove ' from regexp</para></sidebar>
  <para>
 are uniquely defined by the Content Dictionary in which
  they occur and by a name. In the definition in <xref linkend="sec_omabs"/>
  we have left this information implicit. However, it should be kept
  in mind that all symbols appearing in an &OM; object are defined in
  a Content Dictionary. The form of these definitions is explained in
  <xref linkend="cha_cd"/>.  Each symbol has no more than one definition
  in a Content Dictionary. Different Content Dictionaries may define
  a symbol with the same name differently (e.g., the symbol
  <systemitem>union</systemitem> is defined as associative-commutative set-theoretic
  union in a Content Dictionary <systemitem>set1</systemitem>, but
  another Content Dictionary, <systemitem>multiset1</systemitem>, might define
  a symbol <systemitem>union</systemitem> as the union of multi-sets).
  The name of a symbol can only contain alphanumeric
  characters and underscores.  More precisely, a symbol name matches
  the following regular expression:
  <blockquote><para>
 [<systemitem>A</systemitem>-<systemitem>Z</systemitem><systemitem>a</systemitem>-<systemitem>z</systemitem>]
  [<systemitem>A</systemitem>-<systemitem>Z</systemitem><systemitem>a</systemitem>-<systemitem>z</systemitem><systemitem>0</systemitem>-<systemitem>9</systemitem><systemitem>_</systemitem>]*    
  </para></blockquote></para>
  
  <para>Notice that these symbol names are case sensitive.  &OM;
  <emphasis>recommends</emphasis> that symbol names should be no longer than
  100 characters.</para>
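  <para>As a sketch in the &exml; encoding (<xref linkend="cha_enco"/>), the two
  <systemitem>union</systemitem> symbols discussed above would be distinguished
  only by their Content Dictionary. The third line uses a hypothetical name,
  <systemitem>Union</systemitem>, purely to illustrate case sensitivity; it is not
  claimed that any Content Dictionary defines it.</para>
<programlisting><![CDATA[
<OMS cd="set1" name="union"/>       <!-- set-theoretic union -->
<OMS cd="multiset1" name="union"/>  <!-- union of multi-sets: a different symbol -->
<OMS cd="set1" name="Union"/>       <!-- hypothetical: differs from "union" in case only -->
]]></programlisting>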
 <sidebar revision="1999/09/10" author="DPC"><para>Removed suggestion to utf7 hint variable names</para></sidebar>
  </listitem>
</varlistentry>
<varlistentry><term>Variables</term><listitem><para>are meant to denote parameters, variables or
  indeterminates (such as bound variables of function definitions,
  variables in summations and integrals, independent variables of
  derivatives).  Plain variable names are restricted to a subset
  of the printable ASCII characters.  Formally, the names must
  match the regular expression:
  <blockquote><para>
  [A-Za-z0-9=+(),-./:?!#$%*;=@[]^_`{|}]+  
  </para></blockquote></para>
   
</listitem>
</varlistentry>
</variablelist> </para>

<para>The following four constructs can be used to make compound &OM;
objects.</para>

<variablelist>
<varlistentry><term>Application</term><listitem><para>constructs an &OM; object from a sequence of one or
  more &OM; objects. The first argument of application is referred to
  as <quote>head</quote> while the remaining objects are called <quote>arguments</quote>.
  An &OM; application object can be used to convey the mathematical
  notion of application of a function to a set of arguments.
  For instance, suppose that the &OM; symbol <systemitem>sin</systemitem> is
  defined in a Content Dictionary for trigonometry, then
  <math><mi mathvariant="bold">application</mi><mo>(</mo><mi>sin</mi><mo>,</mo> <mi>x</mi> <mo>)</mo></math> is the abstract &OM; object
  corresponding to <math><mi>sin</mi> <mo>(</mo><mi>x</mi> <mo>)</mo></math>.  More generally, an &OM; application
  object can be used as a constructor to convey a mathematical object
  built from other objects such as a polynomial constructed from a set
  of monomials.  Constructors build inhabitants of some symbolic type,
   for instance the type of rational numbers or the type of
  polynomials.  The rational number, usually denoted as <math><mn>1</mn><mo>/</mo><mn>2</mn></math>, is
  represented by the &OM; application object
  <math><mi mathvariant="bold">application</mi><mo>(</mo><mi>Rational</mi><mo>,</mo> <mn>1</mn><mo>,</mo> <mn>2</mn><mo>)</mo></math>. The symbol
  <systemitem>Rational</systemitem> must be defined, by a Content Dictionary, as a
  constructor symbol for the rational numbers.</para>
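  <para>In the &exml; encoding (<xref linkend="cha_enco"/>) the application object for
  <math><mi>sin</mi> <mo>(</mo><mi>x</mi> <mo>)</mo></math> might look as follows. The
  Content Dictionary name <systemitem>transc1</systemitem> is an assumption made
  for this sketch; the first child is the head and the remaining children are
  the arguments.</para>
<programlisting><![CDATA[
<OMOBJ>
  <OMA>
    <OMS cd="transc1" name="sin"/>  <!-- head -->
    <OMV name="x"/>                 <!-- single argument -->
  </OMA>
</OMOBJ>
]]></programlisting>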
   
<figure id="fig_obj">
    <title>The &OM; application and binding objects for <math><mi>sin</mi> <mo>(</mo><mi>x</mi> <mo>)</mo></math> and <math><mi>&#955;</mi> <mi>x</mi><mo>.</mo><mi>x</mi> <mo>+</mo> <mn>2</mn></math> in tree-like notation.</title>
<sidebar revision="1999/10/21" author="OC"><para>New tree figure, suggested by Andreas Strotmann</para></sidebar>
 <graphic fileref="lambda" width="600" depth="190"/>
</figure>

  
</listitem>
</varlistentry>
<varlistentry><term>Binding</term><listitem><para>objects are constructed from an &OM; object, and from a
  sequence of zero or more variables followed by another &OM; object.
  The first &OM; object is the <quote>binder</quote> object. Arguments 2 to <math><mi>n</mi><mo>-</mo><mn>1</mn></math>
  are always variables to be bound in the <quote>body</quote> which is the
  <math><msup><mi>n</mi><mi>th</mi></msup></math> argument object. It is allowed to have no bound variables,
  but the binder object and the body should be present. Binding can be
  used to express functions or logical statements.  The function
  <math><mi>&#955;</mi> <mi>x</mi><mo>.</mo><mi>x</mi> <mo>+</mo><mn>2</mn></math>, in which the variable <math><mi>x</mi></math> is bound by <math><mi>&#955;</mi></math>,
  corresponds to a binding object having as binder the &OM; symbol
  <systemitem>lambda</systemitem>:
  <math display="block"><mi mathvariant="bold">binding</mi><mo>(</mo><mi>lambda</mi><mo>,</mo> <mi>x</mi> <mo>,</mo>
  <mi mathvariant="bold">application</mi><mo>(</mo><mi>plus</mi><mo>,</mo>
  <mi>x</mi> <mo>,</mo> <mn>2</mn><mo>)</mo><mo>)</mo><mtext>.</mtext></math></para>
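  <para>A possible rendering of this binding object in the &exml; encoding
  (<xref linkend="cha_enco"/>) is sketched below. The Content Dictionary names
  <systemitem>fns1</systemitem> and <systemitem>arith1</systemitem> are assumptions
  for the example. The binder comes first, the bound variables are grouped
  together, and the body comes last.</para>
<programlisting><![CDATA[
<OMOBJ>
  <OMBIND>
    <OMS cd="fns1" name="lambda"/>       <!-- binder -->
    <OMBVAR>
      <OMV name="x"/>                    <!-- bound variable -->
    </OMBVAR>
    <OMA>                                <!-- body: application(plus, x, 2) -->
      <OMS cd="arith1" name="plus"/>
      <OMV name="x"/>
      <OMI>2</OMI>
    </OMA>
  </OMBIND>
</OMOBJ>
]]></programlisting>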
  
  
  <para>Binding of several variables as in:
  <math display="block"><mi mathvariant="bold">binding</mi> <mo>(</mo><mi>B</mi><mo>,</mo> <msub><mi>v</mi><mn>1</mn></msub><mo>,</mo> <mi>&#8230;</mi><mo>,</mo> <msub><mi>v</mi><mi>n</mi></msub><mo>,</mo> <mi>C</mi> <mo>)</mo></math>
  is semantically
  equivalent to composition of binding of a single variable, namely
  <math display="block"><mi mathvariant="bold">binding</mi> <mo>(</mo><mi>B</mi> <mo>,</mo> <msub><mi>v</mi><mn>1</mn></msub><mo>,</mo><mo>(</mo><mi mathvariant="bold">binding</mi> <mo>(</mo><mi>B</mi> <mo>,</mo> <msub><mi>v</mi><mn>2</mn></msub><mo>,</mo> <mo>(</mo><mi>&#8230;</mi><mo>,</mo>
  <mi mathvariant="bold">binding</mi> <mo>(</mo><mi>B</mi> <mo>,</mo> <msub><mi>v</mi><mi>n</mi></msub><mo>,</mo> <mi>C</mi><mo>)</mo> <mi>&#8230;</mi> <mo>)</mo><mtext>.</mtext></math></para>
  
  
<sidebar revision="1999/10/04" author="DPC"><para>Rephrase slightly</para></sidebar>
<para>Note that it follows from this that repeated occurrences
  of the same variable in a binding operator are allowed. For example
  the object
  <math display="block"><mi mathvariant="bold">binding</mi> <mo>(</mo><mi>lambda</mi><mo>,</mo> <mi>v</mi> <mo>,</mo> <mi>v</mi> <mo>,</mo><mi mathvariant="bold">application</mi> <mo>(</mo><mi>times</mi><mo>,</mo><mi>v</mi> <mo>,</mo><mi>v</mi><mo>)</mo> <mo>)</mo></math>
  is semantically equivalent
  to:
  <math display="block"><mi mathvariant="bold">binding</mi><mo>(</mo><mi>lambda</mi><mo>,</mo> <mi>v</mi><mo>,</mo> <mi mathvariant="bold">binding</mi> <mo>(</mo><mi>lambda</mi><mo>,</mo> <mi>v</mi><mo>,</mo><mi mathvariant="bold">application</mi><mo>(</mo><mi>times</mi><mo>,</mo><mi>v</mi><mo>,</mo><mi>v</mi><mo>)</mo><mo>)</mo><mo>)</mo></math>
  so that the outermost
  binding is actually a constant function (<math><mi>v</mi></math> does not occur free in
  its body <math><mi mathvariant="bold">binding</mi> <mo>(</mo><mi>lambda</mi><mo>,</mo> <mi>v</mi><mo>,</mo><mi mathvariant="bold">application</mi> <mo>(</mo><mi>times</mi><mo>,</mo><mi>v</mi> <mo>,</mo><mi>v</mi><mo>)</mo><mo>)</mo></math>).</para>

<para>Phrasebooks are allowed to use <math><mi>&#945;</mi></math> conversion in order to avoid
  clashes of variable names. Suppose an object <math><mi>&#937;</mi></math> contains an
  occurrence of the object <math><mi mathvariant="bold">binding</mi> <mo>(</mo><mi>B</mi> <mo>,</mo> <mi>v</mi> <mo>,</mo> <mi>C</mi> <mo>)</mo></math>.  This
  object <math><mi mathvariant="bold">binding</mi> <mo>(</mo><mi>B</mi> <mo>,</mo> <mi>v</mi> <mo>,</mo> <mi>C</mi> <mo>)</mo></math> can be replaced in <math><mi>&#937;</mi></math>
  by <math><mi mathvariant="bold">binding</mi> <mo>(</mo><mi>B</mi> <mo>,</mo> <mi>z</mi> <mo>,</mo> <mi>C'</mi><mo>)</mo></math> where <math><mi>z</mi></math> is a variable not
  occurring free in <math><mi>C</mi></math> and <math><mi>C'</mi></math> is obtained from <math><mi>C</mi></math> by replacing
  each free (i.e., not bound by any intermediate <varname>binding</varname>
  construct)
  occurrence of <math><mi>v</mi></math> by <math><mi>z</mi></math>.  This operation preserves the
  semantics of the object <math><mi>&#937;</mi></math>. In the above example, a phrasebook
  is thus allowed to transform the object to, e.g.
  <math display="block"><mi mathvariant="bold">binding</mi> <mo>(</mo><mi>lambda</mi><mo>,</mo> <mi>v</mi> <mo>,</mo> <mi mathvariant="bold">binding</mi> <mo>(</mo><mi>lambda</mi><mo>,</mo> <mi>z</mi> <mo>,</mo><mi mathvariant="bold">application</mi> <mo>(</mo><mi>times</mi><mo>,</mo><mi>z</mi> <mo>,</mo><mi>z</mi><mo>)</mo><mo>)</mo><mo>)</mo><mtext>.</mtext></math></para>
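<para>The effect of such a renaming can be sketched in the &exml; encoding
  (<xref linkend="cha_enco"/>). Only the inner binding and the occurrences of its
  variable change; the Content Dictionary names <systemitem>fns1</systemitem> and
  <systemitem>arith1</systemitem> are assumptions for the example.</para>
<programlisting><![CDATA[
<!-- before: both bindings use the name v -->
<OMBIND>
  <OMS cd="fns1" name="lambda"/>
  <OMBVAR><OMV name="v"/></OMBVAR>
  <OMBIND>
    <OMS cd="fns1" name="lambda"/>
    <OMBVAR><OMV name="v"/></OMBVAR>
    <OMA><OMS cd="arith1" name="times"/><OMV name="v"/><OMV name="v"/></OMA>
  </OMBIND>
</OMBIND>

<!-- after alpha conversion of the inner binding: v renamed to z -->
<OMBIND>
  <OMS cd="fns1" name="lambda"/>
  <OMBVAR><OMV name="v"/></OMBVAR>
  <OMBIND>
    <OMS cd="fns1" name="lambda"/>
    <OMBVAR><OMV name="z"/></OMBVAR>
    <OMA><OMS cd="arith1" name="times"/><OMV name="z"/><OMV name="z"/></OMA>
  </OMBIND>
</OMBIND>
]]></programlisting>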

  
</listitem>
</varlistentry>
<varlistentry><term>Attribution</term><listitem><para>decorates an object with a sequence of one or more
  pairs made up of an &OM; symbol, the <quote>attribute</quote>, and an
  associated &OM; object, the <quote>value of the attribute</quote>.  The value
  of the attribute can be an attribution object itself. As an example of
  this, consider the &OM; objects representing groups, automorphism
  groups, and group dimensions. It is then possible to attribute an
  &OM; object representing a group by its automorphism group, itself
  attributed by its dimension.</para>

<para>Composition of attributions, as in
  <math display="block">
<mi mathvariant="bold">attribution</mi><mo>(</mo><mi mathvariant="bold">attribution</mi><mo>(</mo><mi>A</mi><mo>,</mo> <msub><mi>S</mi><mn>1</mn></msub> <mspace width=".3em"/>
  <msub><mi>A</mi><mn>1</mn></msub><mo>,</mo><mi>&#8230;</mi><mo>,</mo><msub><mi>S</mi><mi>h</mi></msub> <mspace width=".3em"/> <msub><mi>A</mi><mi>h</mi></msub><mo>)</mo><mo>,</mo> <msub><mi>S</mi><mrow><mi>h</mi><mo>+</mo><mn>1</mn></mrow></msub> <mspace width=".3em"/> <msub><mi>A</mi><mrow><mi>h</mi><mo>+</mo><mn>1</mn></mrow></msub><mo>,</mo> <mi>&#8230;</mi><mo>,</mo> <msub><mi>S</mi><mi>n</mi></msub> <mspace width=".3em"/> <msub><mi>A</mi><mi>n</mi></msub><mo>)</mo></math>
  is
  semantically equivalent to a single attribution, that is
  <math display="block"><mi mathvariant="bold">attribution</mi><mo>(</mo><mi>A</mi><mo>,</mo> <msub><mi>S</mi><mn>1</mn></msub> <mspace width=".3em"/> <msub><mi>A</mi><mn>1</mn></msub><mo>,</mo> <mi>&#8230;</mi><mo>,</mo> <msub><mi>S</mi><mi>h</mi></msub> <mspace width=".3em"/> <msub><mi>A</mi><mi>h</mi></msub><mo>,</mo>
  <msub><mi>S</mi><mrow><mi>h</mi><mo>+</mo><mn>1</mn></mrow></msub>
  <mspace width=".3em"/>
  <msub><mi>A</mi><mrow><mi>h</mi><mo>+</mo><mn>1</mn></mrow></msub><mo>,</mo> <mi>&#8230;</mi><mo>,</mo> <msub><mi>S</mi><mi>n</mi></msub> <mspace width=".3em"/> <msub><mi>A</mi><mi>n</mi></msub><mo>)</mo><mtext>.</mtext></math>
  The operation that 
  produces an object with a single layer of attribution is called
 <systemitem>flattening</systemitem>.</para>
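<para>A small instance of flattening, with <math><mi>h</mi><mo>=</mo><mn>1</mn></math> and
  <math><mi>n</mi><mo>=</mo><mn>2</mn></math>, is sketched below in the &exml; encoding
  (<xref linkend="cha_enco"/>). The Content Dictionary
  <systemitem>attr_example</systemitem>, its symbols and the string values are
  hypothetical. Note that in this encoding the attribute pairs are written
  before the attributed object, whereas the abstract notation lists the
  object first.</para>
<programlisting><![CDATA[
<!-- nested: attribution(attribution(x, s1 "a1"), s2 "a2") -->
<OMATTR>
  <OMATP>
    <OMS cd="attr_example" name="s2"/> <OMSTR>a2</OMSTR>
  </OMATP>
  <OMATTR>
    <OMATP>
      <OMS cd="attr_example" name="s1"/> <OMSTR>a1</OMSTR>
    </OMATP>
    <OMV name="x"/>
  </OMATTR>
</OMATTR>

<!-- flattened: attribution(x, s1 "a1", s2 "a2") -->
<OMATTR>
  <OMATP>
    <OMS cd="attr_example" name="s1"/> <OMSTR>a1</OMSTR>
    <OMS cd="attr_example" name="s2"/> <OMSTR>a2</OMSTR>
  </OMATP>
  <OMV name="x"/>
</OMATTR>
]]></programlisting>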

<para>Multiple attributes with the same name are allowed.  While the order
  of the given attributes does not imply any notion of priority,
  potentially it could be significant. For instance, consider the case
  in which <math><msub><mi>S</mi><mi>h</mi></msub> <mo>=</mo>
  <msub><mi>S</mi><mi>n</mi></msub></math>
 (<math><mi>h</mi> <mo>&lt;</mo> <mi>n</mi></math>) in the example above. Then, the
  object is to be interpreted as if the value <math><msub><mi>A</mi><mi>n</mi></msub></math> overwrites the
  value <math><msub><mi>A</mi><mi>h</mi></msub></math>.  (&OM; however does not mandate that an application
  preserves the attributes or their order.)</para>

<sidebar revision="1999/08/24" author="OC">
<para>Removed reference to syntactic class of an attributed variable</para>
</sidebar>
<para>Objects can be decorated in a multitude of ways. In&#160;<citation>OM_D131b</citation>,
typing of &OM; objects is expressed by
using an attribution. The object <math><mi mathvariant="bold">attribution</mi><mo>(</mo><mi>A</mi><mo>,</mo>
<mi>type</mi> <mspace width=".3em"/> <mi>t</mi> <mo>)</mo></math> represents the judgment stating that object <math><mi>A</mi></math>
has type <math><mi>t</mi></math>. Note that both <math><mi>A</mi></math> and <math><mi>t</mi></math> are &OM; objects.</para>

<para>Attribution can act as either annotation, in the sense of adornment,
  or as modifier. In the former case, replacement of the adorned
  object by the object itself is probably not harmful (preserves the
  semantics). In the latter case however, it may very well be.
  Therefore, attribution in general should by default be treated as a
  construct rather than as adornment. Only when the CD definitions of
  the attributes make it clear that they are adornments, can the
  attributed object be viewed as semantically equivalent to the
  stripped object.</para>

  
</listitem>
</varlistentry>
<varlistentry><term>Error</term><listitem><para>is made up of an &OM; symbol and a sequence of zero or
  more &OM; objects. This object has no direct mathematical meaning.
  Errors occur as the result of some treatment on an &OM; object and
  are thus of real interest only when some sort of communication is
  taking place. Errors may occur inside other objects and also inside
  other errors.  Error objects might consist only of a symbol as in
  the object: <math><mi mathvariant="bold">error</mi> <mo>(</mo><mi>S</mi> <mo>)</mo></math>.</para>
<sidebar revision="1999/09/22" author="DPC"><para>Remove classification of suggested error types, does not fit current CD
  scheme</para></sidebar>
</listitem>
</varlistentry>
</variablelist> 
</section>

<section id="sec_summary">
<title>Summary</title>

<itemizedlist>
<listitem> <para>&OM; supports basic objects like integers, symbols,
  floating-point numbers, character strings, bytearrays, and
  variables.</para>
</listitem>
<listitem> <para>&OM; compound objects are of four kinds: applications, bindings,
  errors, and attributions.</para>
</listitem>
<listitem> <para>&OM; objects have the expressive power to cover all areas of
  computational mathematics.</para>
</listitem>
</itemizedlist>

<sidebar revision="1999/09/22" author="DPC"><para>Paragraph moved from previous section</para></sidebar>
<para>Observe that an &OM; application object is viewed as a <quote>tree</quote> by
software applications that do not understand Content Dictionaries,
whereas a Phrasebook that understands the semantics of the symbols, as
defined in the Content Dictionaries, should interpret the object as
functional application, constructor, or binding accordingly. Thus, for
example, for some applications, the &OM; object corresponding
to <math><mn>2</mn><mo>+</mo><mn>5</mn></math> may result in a command that writes <math><mn>7</mn></math>.</para>
</section>
</chapter>

<chapter id="cha_enco">
<title>&OM; Encodings</title>


<para>In this chapter, two encodings are defined that map between  &OM;
objects and byte streams.  These byte streams constitute a low level
representation that can be easily exchanged between processes (via
almost any communication method) or stored and retrieved from files.</para>


<para>The first encoding uses  ISO 646:1983 characters&#160;<citation>iso646_83</citation>
(also known as <acronym>ascii</acronym> characters) and is  an &exml;
application. Although the &exml; markup of the encoding uses only <acronym>ascii</acronym>
characters, OpenMath strings may use
arbitrary Unicode/ISO 10646:1988 
characters&#160;<citation>UNICODE</citation>. 
It can be used, for example, to send &OM; objects via
e-mail, news, cut-and-paste, etc. The texts produced by this encoding
can be part of &exml; documents.</para>

<para>The second encoding is a binary encoding that is meant to be used when 
the compactness of the encoding is important (interprocess communication 
over a network is an example).</para>

<para>Note that these two encodings are sufficiently different for
autodetection to be effective: an application reading the bytes can
very easily determine which encoding is used.</para>

<section id="sec_xml">
<title>The &exml; Encoding</title>

<para>This encoding has been designed with two main goals in mind:
<orderedlist>
<listitem><para>to provide an encoding that uses the most common character set
  (so that it can be easily included in most documents and transport
  protocols) and that is both readable and writable by a human.</para>
</listitem>
<listitem><para>to provide an encoding that can be included (embedded) in
  &exml; documents.</para>
</listitem>
</orderedlist> 
</para>

<section id="ssec_xml">
<title>A Grammar for the &exml; Encoding</title>


<sidebar revision="1999/09/09" author="DPC"><para>Modify description of XML encoding to make 
    <acronym>dtd</acronym>
     normative, and other changes to increase portability to &exml;
applications.</para></sidebar>



<para>The &exml; encoding of an OpenMath object is defined by the <acronym>dtd</acronym> given
in <xref linkend="fig_objdtd"/> below, with the following additional rules
not implied by the &exml; <acronym>dtd</acronym>.</para>
<itemizedlist>
<listitem>
<para>Comments are permitted only between
elements, not within element character data.</para>

</listitem>
<listitem> <para>Processing Instructions are only allowed before the <acronym>OMOBJ</acronym>
 element.</para>
 
</listitem>
<listitem>
<para>The content of an <acronym>OMB</acronym> element is valid base64-encoded text.</para>

</listitem>
<listitem> 
<para>The character data forming element content and attribute values
matches the regular expressions of <xref linkend="fig_xml"/>.</para>
</listitem>
</itemizedlist>
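<para>As an illustration of the first two rules, the following sketch places a
processing instruction before the <systemitem>OMOBJ</systemitem> element and a comment between
two elements. The processing instruction target <systemitem>my-app</systemitem> and its content
are hypothetical, chosen only for this example:
<systemitem><![CDATA[
<?my-app render="inline"?>
<OMOBJ>
  <OMA><!-- sine of x -->
    <OMS cd="transc1" name="sin"/>
    <OMV name="x"/>
  </OMA>
</OMOBJ>
]]></systemitem>
By contrast, a comment placed inside the character data of an element, for
instance between the digits of an <systemitem>OMI</systemitem> element, would not be permitted.</para>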


<figure id="fig_objdtd">
    <title>DTD for the &OM; &exml; encoding of objects.</title>
<literallayout><![CDATA[
<!-- DTD for OM Objects - sb 29.10.98 -->
<!-- sb 3.2.99 -->

<!--
     general list of embeddable elements
      : excludes OMATP as this is only embeddable in OMATTR
      : excludes OMBVAR as this is only embeddable in OMBIND
-->

<!ENTITY % omel "OMS | OMV | OMI | OMB | OMSTR
                                | OMF | OMA | OMBIND | OME
                                | OMATTR ">

<!-- things which can be variables -->

<!ENTITY % omvar "OMV | OMATTR" >

<!-- symbol -->
<!ELEMENT OMS EMPTY>
<!ATTLIST OMS name CDATA #REQUIRED
                          cd CDATA #REQUIRED >

<!-- variable -->
<!ELEMENT OMV EMPTY>
<!ATTLIST OMV name CDATA #REQUIRED >

<!-- integer -->
<!ELEMENT OMI (#PCDATA) >

<!-- byte array -->
<!ELEMENT OMB (#PCDATA) >

<!-- string -->
<!ELEMENT OMSTR (#PCDATA) >

<!-- floating point -->
<!ELEMENT OMF EMPTY>
<!ATTLIST OMF dec CDATA #IMPLIED
                          hex CDATA #IMPLIED>

<!-- apply constructor -->
<!ELEMENT OMA (%omel;)+ >

<!-- binding constructor & variable -->
<!ELEMENT OMBIND ((%omel;), OMBVAR, (%omel;)) >
<!ELEMENT OMBVAR (%omvar;)+ >

<!-- error -->
<!ELEMENT OME (OMS, (%omel;)* ) >

<!-- attribution constructor & attribute pair constructor -->
<!ELEMENT OMATTR (OMATP, (%omel;)) >
<!ELEMENT OMATP (OMS, (%omel;))+ >

<!-- OM object constructor -->
<!ELEMENT OMOBJ (%omel;) >]]>
</literallayout>
</figure>


<para>In addition, if the &exml; document encoding the &OM; object is
linearised into the &exml; concrete syntax, the following further
constraints apply, which ensure that the encoding may be read by &OM;
applications that may not include a full &exml; parser.</para>
<sidebar revision="1999/09/09" author="DPC"><para>Restrictions on not using foo='xxxx' dropped</para></sidebar>
<itemizedlist>
<listitem>
<para>The document should use <acronym>utf-8</acronym> encoding.</para>

</listitem>
<listitem>
<para>Entity and character references should not be used.</para>

</listitem>
<listitem>
<para>A <systemitem>&lt;!DOCTYPE</systemitem> declaration should not be used.</para>

</listitem>
<listitem>
<sidebar revision="1999/09/21" author="DPC"><para>Restrict empty element syntax</para></sidebar>
<para>The &exml; empty element form <systemitem>&lt;&#8230;/&gt;</systemitem> should always be
used to encode elements such as <acronym>omf</acronym> which are specified in the
<acronym>dtd</acronym> as
being <acronym>empty</acronym>. It should never be used for elements that may sometimes be
empty, such as <acronym>omstr</acronym>.</para>

</listitem>
</itemizedlist>
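<para>For example, under the last rule an empty string and a variable would be
written as in the first two lines of the following sketch, while the forms in
the last two lines, although equivalent for a full &exml; parser, should be
avoided:
<systemitem><![CDATA[
<OMSTR></OMSTR>          recommended: OMSTR is not declared EMPTY
<OMV name="x"/>          recommended: OMV is declared EMPTY
<OMSTR/>                 to be avoided
<OMV name="x"></OMV>     to be avoided
]]></systemitem></para>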

<para>Such a linearisation of an &exml; encoded &OM; Object would
match the character based grammar given in <xref linkend="fig_xml"/>.</para>

<para>The notation used in this section and in <xref linkend="fig_xml"/> is the
usual one for regular expressions (+ meaning <quote>one or more</quote>, * meaning
<quote>zero or more</quote>, ? meaning <quote>zero or one</quote>, and <math><mi>|</mi></math> meaning <quote>or</quote>). The start symbol of the grammar is
<quote>start</quote>, <quote>space</quote> stands for the space character, <quote>cr</quote> for the
carriage return character, <quote>nl</quote> for the line feed character and
<quote>tab</quote> for the horizontal tabulation character.</para>


<figure id="fig_xml">
    <title>Grammar for the &exml; encoding of &OM; objects.</title>

<sidebar revision="1999/07/16" author="DPC"><para>White space allowed in integer strings</para></sidebar>

<informaltable>
<tgroup cols="3">
<tbody>
<row>
<entry>
S           </entry><entry> <math>&longrightarrow;</math> </entry><entry> (space<math><mi>|</mi></math>tab<math><mi>|</mi></math>cr<math><mi>|</mi></math>nl)+  
</entry>
</row>
<row>
<entry>


integer     </entry><entry> <math>&longrightarrow;</math> </entry><entry> 
        (<systemitem>-</systemitem> S?)? [&digits;]+ (S [&digits;]+)*  <math><mi>|</mi></math> 
        (<systemitem>-</systemitem> S?)? <systemitem>x</systemitem> S? [&exadigits;]+ (S [&exadigits;]+)* 
</entry>
</row>
<row>
<entry>
 

cdname      </entry><entry> <math>&longrightarrow;</math> </entry><entry>  [&lcalpha;][&lcalpha;&digits;<systemitem>_</systemitem>]*
</entry>
</row>
<row>
<entry>


symbname    </entry><entry> <math>&longrightarrow;</math> </entry><entry> [&ucalpha;&lcalpha;][&ucalpha;&lcalpha;&digits;<systemitem>_</systemitem>]*
</entry>
</row>
<row>
<entry>


fpdec       </entry><entry> <math>&longrightarrow;</math> </entry><entry>  
    (<systemitem>-</systemitem>?)([&digits;]+)?(<systemitem>.</systemitem>[&digits;]+)?(<systemitem>e</systemitem>(&sign;?)[&digits;]+)?
</entry>
</row>
<row>
<entry>


fphex       </entry><entry> <math>&longrightarrow;</math> </entry><entry>  [&digits;ABCDEF]+ 
</entry>
</row>
<row>
<entry>


varname        </entry><entry> <math>&longrightarrow;</math> </entry><entry> ([&ucalpha;&lcalpha;&digits;&varnamechar;])+ 
</entry>
</row>
<row>
<entry>


base64      </entry><entry> <math>&longrightarrow;</math> </entry><entry> ([&ucalpha;&lcalpha;&digits; +/=] <math><mi>|</mi></math> S)+ 
</entry>
</row>
<row>
<entry>

char  </entry><entry> <math>&longrightarrow;</math> </entry><entry> <emphasis>XML Character Data</emphasis>
</entry>
</row>
</tbody>
</tgroup>
</informaltable>
<sidebar revision="1999/09/09" author="DPC"><para>removed ' from varname</para></sidebar>



<informaltable>
<tgroup cols="3">
<tbody>
<row>
<entry>
symbnameatt</entry><entry> <math>&longrightarrow;</math></entry><entry> 
    <systemitem>name</systemitem> S? = S? (<systemitem>"</systemitem> symbname <systemitem>"</systemitem> <systemitem>|</systemitem> <systemitem>'</systemitem> symbname <systemitem>'</systemitem>) 
</entry>
</row>
<row>
<entry>

cdnameatt</entry><entry> <math>&longrightarrow;</math></entry><entry>
 <systemitem>cd</systemitem> S? = S? (<systemitem>"</systemitem> cdname <systemitem>"</systemitem> <systemitem>|</systemitem> <systemitem>'</systemitem> cdname <systemitem>'</systemitem>) 
</entry>
</row>
<row>
<entry>

varnameatt</entry><entry> <math>&longrightarrow;</math></entry><entry>
 <systemitem>name</systemitem> S? = S? (<systemitem>"</systemitem> varname <systemitem>"</systemitem> <systemitem>|</systemitem> <systemitem>'</systemitem> varname <systemitem>'</systemitem>) 
</entry>
</row>
<row>
<entry>

fpdecatt</entry><entry> <math>&longrightarrow;</math></entry><entry>
 <systemitem>dec</systemitem> S? = S? (<systemitem>"</systemitem> fpdec <systemitem>"</systemitem> <systemitem>|</systemitem> <systemitem>'</systemitem> fpdec <systemitem>'</systemitem>) 
</entry>
</row>
<row>
<entry>

fphexatt</entry><entry> <math>&longrightarrow;</math></entry><entry>
 <systemitem>hex</systemitem> S? = S? (<systemitem>"</systemitem> fphex <systemitem>"</systemitem> <systemitem>|</systemitem> <systemitem>'</systemitem> fphex <systemitem>'</systemitem>) 
</entry>
</row>
<row>
<entry>

PI </entry><entry> <math>&longrightarrow;</math></entry><entry> &lt;<systemitem>?</systemitem> char <systemitem>?</systemitem><systemitem>></systemitem>
</entry>
</row>
<row>
<entry>

comment</entry><entry> <math>&longrightarrow;</math></entry><entry> &lt;<systemitem>!-&zsp;-</systemitem> char <systemitem>-&zsp;-</systemitem><systemitem>></systemitem>
</entry>
</row>
<row>
<entry>


SC</entry><entry><math>&longrightarrow;</math></entry><entry> S+ <systemitem>|</systemitem> (comment S)+
</entry>
</row>
<row>
<entry>


start  </entry><entry> <math>&longrightarrow;</math> </entry><entry> 
 (SC <systemitem>|</systemitem> PI)* <systemitem>&lt;OMOBJ</systemitem> S?<systemitem>&gt;</systemitem> S? object S? <systemitem>&lt;/OMOBJ</systemitem> S?<systemitem>&gt;</systemitem> 
</entry>
</row>
<row>
<entry>


symbol   </entry><entry> <math>&longrightarrow;</math> </entry><entry> 
  <systemitem>&lt;OMS</systemitem> S  symbnameatt S cdnameatt  S? <systemitem>/&gt;</systemitem>
</entry>
</row>
<row>
<entry>

       </entry><entry> <math><mi>|</mi></math>               </entry><entry>
  <systemitem>&lt;OMS</systemitem> S cdnameatt S symbnameatt S? <systemitem>/&gt;</systemitem>
</entry>
</row>
<row>
<entry>


variable </entry><entry> <math>&longrightarrow;</math> </entry><entry>
   <systemitem>&lt;OMV</systemitem> S varnameatt S? <systemitem>/&gt;</systemitem>
</entry>
</row>
<row>
<entry>

         </entry><entry> <math><mi>|</mi></math> 
         </entry><entry> <systemitem>&lt;OMATTR</systemitem> S?<systemitem>&gt;</systemitem> SC? omatp SC? variable SC? <systemitem>&lt;/OMATTR</systemitem> S?<systemitem>&gt;</systemitem>
</entry>
</row>
<row>
<entry>

omatp </entry><entry> <math>&longrightarrow;</math> </entry><entry>
     <systemitem>&lt;OMATP</systemitem> S?<systemitem>&gt;</systemitem> SC? attrs SC? <systemitem>&lt;/OMATP</systemitem> S?<systemitem>&gt;</systemitem> 
</entry>
</row>
<row>
<entry>




object </entry><entry> <math>&longrightarrow;</math> </entry><entry> symbol 
</entry>
</row>
<row>
<entry>

       </entry><entry> <math><mi>|</mi></math>           </entry><entry> variable 
</entry>
</row>
<row>
<entry>

       </entry><entry> <math><mi>|</mi></math>           </entry><entry> <systemitem>&lt;OMI</systemitem> S?<systemitem>&gt;</systemitem> S? integer S? <systemitem>&lt;/OMI</systemitem> S?<systemitem>&gt;</systemitem>
</entry>
</row>
<row>
<entry>

       </entry><entry> <math><mi>|</mi></math>           </entry><entry> <systemitem>&lt;OMF</systemitem> S fpdecatt S? <systemitem>/&gt;</systemitem>
</entry>
</row>
<row>
<entry>
  
       </entry><entry> <math><mi>|</mi></math>           </entry><entry> <systemitem>&lt;OMF</systemitem> S fphexatt S? <systemitem>/&gt;</systemitem>
</entry>
</row>
<row>
<entry>
  
       </entry><entry> <math><mi>|</mi></math>           </entry><entry> <systemitem>&lt;OMSTR</systemitem> S?<systemitem>&gt;</systemitem> char <systemitem>&lt;/OMSTR</systemitem> S?<systemitem>&gt;</systemitem> 
</entry>
</row>
<row>
<entry>
  
       </entry><entry> <math><mi>|</mi></math>           </entry><entry> <systemitem>&lt;OMB</systemitem> S?<systemitem>&gt;</systemitem> base64  <systemitem>&lt;/OMB</systemitem> S?<systemitem>&gt;</systemitem> 
</entry>
</row>
<row>
<entry>
   
       </entry><entry> <math><mi>|</mi></math>           </entry><entry> <systemitem>&lt;OMA</systemitem> S?<systemitem>&gt;</systemitem> SC? object SC? objects SC? <systemitem>&lt;/OMA</systemitem> S?<systemitem>&gt;</systemitem>
</entry>
</row>
<row>
<entry>

       </entry><entry> <math><mi>|</mi></math>           </entry><entry> <systemitem>&lt;OMBIND</systemitem> S?<systemitem>&gt;</systemitem> SC? object SC? 
</entry>
</row>
<row>
<entry>

       </entry><entry></entry><entry> <systemitem>&lt;OMBVAR</systemitem> S?<systemitem>&gt;</systemitem> SC? variables SC? <systemitem>&lt;/OMBVAR</systemitem> S?<systemitem>&gt;</systemitem> 
</entry>
</row>
<row>
<entry>

       </entry><entry></entry><entry> SC? object SC? <systemitem>&lt;/OMBIND</systemitem> S?<systemitem>&gt;</systemitem>
</entry>
</row>
<row>
<entry>
 
       </entry><entry> <math><mi>|</mi></math>           </entry><entry> <systemitem>&lt;OME</systemitem> S?<systemitem>&gt;</systemitem> SC? symbol SC? objects SC? <systemitem>&lt;/OME</systemitem> S?<systemitem>&gt;</systemitem> 
</entry>
</row>
<row>
<entry>
  
       </entry><entry> <math><mi>|</mi></math>           </entry><entry> <systemitem>&lt;OMATTR</systemitem> S?<systemitem>&gt;</systemitem> SC?  <systemitem>&lt;OMATP</systemitem> S?<systemitem>&gt;</systemitem> SC? attrs SC? <systemitem>&lt;/OMATP</systemitem> S?<systemitem>&gt;</systemitem>   
</entry>
</row>
<row>
<entry>

       </entry><entry></entry><entry>SC? object SC? <systemitem>&lt;/OMATTR</systemitem> S?<systemitem>&gt;</systemitem>  
</entry>
</row>
<row>
<entry>
  
 

attrs  </entry><entry> <math>&longrightarrow;</math> </entry><entry> symbol S? object   
</entry>
</row>
<row>
<entry>
 
       </entry><entry> <math><mi>|</mi></math>               </entry><entry> symbol S? object S? attrs 
</entry>
</row>
<row>
<entry>



objects </entry><entry> <math>&longrightarrow;</math> </entry><entry> SC?     
</entry>
</row>
<row>
<entry>

        </entry><entry> <math><mi>|</mi></math>               </entry><entry> object SC? objects  
</entry>
</row>
<row>
<entry>


variables </entry><entry> <math>&longrightarrow;</math> </entry><entry> SC?   
</entry>
</row>
<row>
<entry>
 
        </entry><entry> <math><mi>|</mi></math>               </entry><entry> variable SC? variables  
</entry>
</row>
<row>
<entry>

</entry>
</row>
</tbody>
</tgroup>
</informaltable>


</figure>
</section>

<section id="sec_xml-desc">
<title>Description of the Grammar</title>



<para>An encoded &OM; object is placed inside an <systemitem>OMOBJ</systemitem> element.  This 
element can contain the elements (and integers) as described above.</para>

<para>We briefly discuss the &exml; encoding for each type of &OM; object
starting from the basic objects.</para>

<variablelist>
<varlistentry><term>Integers</term>
<listitem>
<sidebar revision="1999/09/22" author="DPC"><para>White space allowed in integer strings</para></sidebar>
<para>are encoded using the <systemitem>OMI</systemitem> element around the
  sequence of their digits in base 10 or 16 (most significant digit
  first).  White space may be inserted between the characters of the
  integer representation; it is ignored.
  After ignoring white
  space, integers written in base 10 match the regular expression
  <systemitem>-?[0-9]+</systemitem>.  Integers written in base 16 match
  <systemitem>-?x[0-9A-F]+</systemitem>.
The integer 10 can thus be encoded as
  <systemitem>&lt;OMI> 10 &lt;/OMI> </systemitem> or as <systemitem>&lt;OMI> xA &lt;/OMI> </systemitem>
  but neither <systemitem>&lt;OMI> +10 &lt;/OMI></systemitem> nor <systemitem>&lt;OMI> +xA &lt;/OMI></systemitem>
  can be used.</para>

<para>The negative integer <math><mn>-120</mn></math> can be encoded either as decimal
       <systemitem>&lt;OMI> -120 &lt;/OMI></systemitem> or  as hexadecimal <systemitem>&lt;OMI> -x78 &lt;/OMI></systemitem>.</para>
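<para>As a sketch of the white space rule, and assuming the grammar of
<xref linkend="fig_xml"/>, the integer one million and the integer
<math><mn>-120</mn></math> could also be written with their digits grouped:
<systemitem><![CDATA[
<OMI> 1 000 000 </OMI>
<OMI> - x 7 8 </OMI>
]]></systemitem>
Once white space is ignored, these read <systemitem>1000000</systemitem> and <systemitem>-x78</systemitem>.</para>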

  
</listitem>
</varlistentry>
<varlistentry><term>Symbols</term><listitem><para>are encoded using the <systemitem>OMS</systemitem> element. This element
  has two &exml;-attributes <systemitem>cd</systemitem> and <systemitem>name</systemitem>. The value
  of <systemitem>cd</systemitem> is the name of the Content Dictionary in which the
  symbol is defined and the value of <systemitem>name</systemitem> is the name of the
  symbol.  The name of the Content Dictionary is compulsory, but a
  future revision of the &OM; standard might introduce a defaulting
  mechanism.  For example, <systemitem>&lt;OMS cd="transc" name="sin"/></systemitem> is the
  encoding of the symbol named <systemitem>sin</systemitem> in the Content Dictionary
  named <systemitem>transc</systemitem>.</para>
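<para>According to the grammar of <xref linkend="fig_xml"/>, the two
&exml;-attributes may appear in either order and their values may be delimited
by either double or single quotes, so the same symbol could equally be written
in any of the following ways:
<systemitem><![CDATA[
<OMS cd="transc" name="sin"/>
<OMS name="sin" cd="transc"/>
<OMS cd='transc' name='sin'/>
]]></systemitem></para>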
  
</listitem>
</varlistentry>
<varlistentry><term>Variables</term><listitem><para>are encoded using the <systemitem>OMV</systemitem> element, with only
  one &exml;-attribute, <systemitem>name</systemitem>, whose value is the variable
  name.  A variable name may use only a subset of the printable <acronym>ascii</acronym>
 characters.
  In particular, neither spaces nor double-quote <systemitem>&quot;</systemitem> are allowed
  in variable names.  For instance, the encoding of the object
  representing the variable <math><mi>x</mi></math> is:
    <systemitem>&lt;OMV  name="x"/></systemitem></para>

  
</listitem>
</varlistentry>
<varlistentry><term>Floating-point numbers</term><listitem><para>are encoded using the <systemitem>OMF</systemitem> element
  that has either the &exml;-attribute <systemitem>dec</systemitem> or the
  &exml;-attribute <systemitem>hex</systemitem>. The two &exml;-attributes
  cannot be present simultaneously. The value of <systemitem>dec</systemitem> is the
  floating-point number expressed in base 10, using the common syntax:</para>
  
  <blockquote><para>
  <systemitem>(-?)([0-9]+)?("."[0-9]+)?(e(-?)[0-9]+)?</systemitem>.
  </para></blockquote>
  
  <para>The value of <systemitem>hex</systemitem> is the digits of the floating-point number
  expressed in base 16, with digits <systemitem>0</systemitem>-<systemitem>9</systemitem>, <systemitem>A</systemitem>-<systemitem>F</systemitem>
  (mantissa, exponent, and sign from lowest to highest bits) using a
  least significant byte ordering.  For example, <systemitem>&lt;OMF
    dec="1.0e-10"/></systemitem> is a valid floating-point number.</para>
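<para>A few further values that match the decimal syntax above are sketched here;
the particular numbers are arbitrary and chosen only to show an omitted integer
part, a sign, and an exponent:
<systemitem><![CDATA[
<OMF dec="-0.5"/>
<OMF dec=".25"/>
<OMF dec="6.02e23"/>
]]></systemitem></para>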
  

</listitem>
</varlistentry>
<varlistentry><term>Character strings</term><listitem><para>are encoded using the <systemitem>OMSTR</systemitem> element.
  Its content is Unicode text. (The default encoding 
  is <acronym>utf-8</acronym>&#160;<citation>utf8</citation>, although &exml; encoded OpenMath may be embedded
 in a containing &exml; document that specifies an alternative encoding in
  the &exml; declaration.) Note that, as always in &exml;, the
  characters <systemitem>&lt;</systemitem> and <systemitem>&amp;</systemitem>  need to be represented by the
  entity references <systemitem>&amp;lt;</systemitem> and <systemitem>&amp;amp;</systemitem> respectively.</para>
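<para>For instance, a string consisting of the text <quote>a &lt; b &amp; b &lt; c</quote>
would be encoded as follows (a minimal sketch):
<systemitem><![CDATA[
<OMSTR>a &lt; b &amp; b &lt; c</OMSTR>
]]></systemitem></para>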
  
</listitem>
</varlistentry>
<varlistentry><term>Bytearrays</term><listitem><para>are encoded using the <systemitem>OMB</systemitem> element. Its content
  is a sequence of characters that is a base64 encoding of the data.
  The base64 encoding is defined in <acronym>rfc</acronym> 1521 <citation>rfc1521</citation>.
  Basically, it represents an arbitrary sequence of octets using 64
  <quote>digits</quote> (<systemitem>A</systemitem> through <systemitem>Z</systemitem>, <systemitem>a</systemitem> through <systemitem>z</systemitem>, <systemitem>0</systemitem> through <systemitem>9</systemitem>, <systemitem>+</systemitem> and <systemitem>/</systemitem>, in order of increasing
  value). Three octets are represented as four digits (the <systemitem>=</systemitem>
  character is used for padding at the end of the data). All line
  breaks and carriage return, space, form feed and horizontal
  tabulation characters are ignored. The reader is referred to
  <citation>rfc1521</citation> for more detailed information.</para>
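<para>As a small worked example, the three octets with hexadecimal values
<systemitem>4D</systemitem>, <systemitem>61</systemitem> and <systemitem>6E</systemitem> (the <acronym>ascii</acronym> characters <systemitem>Man</systemitem>)
form the 24 bits <systemitem>010011 010110 000101 101110</systemitem>, that is the digit values 19, 22, 5
and 46, which are written <systemitem>TWFu</systemitem>. If only the first two octets are present,
the last group is padded with zero bits and one <systemitem>=</systemitem> character is appended:
<systemitem><![CDATA[
<OMB>TWFu</OMB>
<OMB>TWE=</OMB>
]]></systemitem></para>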

</listitem>
</varlistentry>
</variablelist>
 
<para>The encodings of the compound &OM; objects are described in detail below.</para>

<variablelist>
<varlistentry><term>Applications</term><listitem><para>are encoded using the <systemitem>OMA</systemitem> element. The
  application whose root is the &OM; object <math><msub><mi>e</mi><mn>0</mn></msub></math> and whose arguments
  are the &OM; objects <math><msub><mi>e</mi><mn>1</mn></msub></math>, &#8230;, <math><msub><mi>e</mi><mi>n</mi></msub></math> is encoded as <systemitem>&lt;OMA></systemitem>
  <math><msub><mi>C</mi><mn>0</mn></msub></math> <math><msub><mi>C</mi><mn>1</mn></msub></math>&#8230; <math><msub><mi>C</mi><mi>n</mi></msub></math> <systemitem>&lt;/OMA></systemitem> where <math><msub><mi>C</mi><mi>i</mi></msub></math> is the encoding of
  <math><msub><mi>e</mi><mi>i</mi></msub></math>.</para>

<para>For example, <math><mi mathvariant="bold">application</mi><mo>(</mo><mi>sin</mi><mo>,</mo><mi>x</mi> <mo>)</mo></math> is encoded as:
<systemitem><![CDATA[
            <OMA>  
            <OMS cd="transc1" name="sin"/> 
            <OMV name="x"/>  
            </OMA> 
]]></systemitem> 
  provided that the symbol <systemitem>sin</systemitem> is defined to be a function
  symbol in a Content Dictionary named <systemitem>transc1</systemitem>.</para>
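<para>Arguments may themselves be applications. As a sketch, and assuming that a
Content Dictionary named <systemitem>arith1</systemitem> defines a symbol <systemitem>plus</systemitem>
(an assumption made for this example),
<math><mi mathvariant="bold">application</mi><mo>(</mo><mi>sin</mi><mo>,</mo><mi mathvariant="bold">application</mi><mo>(</mo><mi>plus</mi><mo>,</mo><mi>x</mi><mo>,</mo><mn>2</mn><mo>)</mo><mo>)</mo></math>
would be encoded as:
<systemitem><![CDATA[
            <OMA>
              <OMS cd="transc1" name="sin"/>
              <OMA>
                <OMS cd="arith1" name="plus"/>
                <OMV name="x"/>
                <OMI> 2 </OMI>
              </OMA>
            </OMA>
]]></systemitem></para>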

  
</listitem>
</varlistentry>
<varlistentry><term>Binding</term><listitem><para>is encoded using the <systemitem>OMBIND</systemitem> element.  The binding
  by the &OM; object <math><mi>b</mi></math> of the &OM; variables <math><msub><mi>x</mi><mn>1</mn></msub></math>, <math><msub><mi>x</mi><mn>2</mn></msub></math>,
  <math><mi>&#8230;</mi></math>, <math><msub><mi>x</mi><mi>n</mi></msub></math> in the object <math><mi>c</mi></math> is encoded as <systemitem>&lt;OMBIND></systemitem> <math><mi>B</mi></math>
  <systemitem>&lt;OMBVAR></systemitem> <math><msub><mi>X</mi><mn>1</mn></msub></math> <math><mi>&#8230;</mi></math> <math><msub><mi>X</mi><mi>n</mi></msub></math> <systemitem>&lt;/OMBVAR></systemitem> <math><mi>C</mi></math> <systemitem>&lt;/OMBIND></systemitem> where <math><mi>B</mi></math>, <math><mi>C</mi></math>, and <math><msub><mi>X</mi><mi>i</mi></msub></math> are the encodings of <math><mi>b</mi></math>, <math><mi>c</mi></math>
  and <math><msub><mi>x</mi><mi>i</mi></msub></math>, respectively.</para>

<para>For instance the encoding of
  <math><mi mathvariant="bold">binding</mi>
       <mo>(</mo><mi>lambda</mi><mo>,</mo>
  <mi>x</mi><mo>,</mo><mi mathvariant="bold">application</mi>
     <mo>(</mo><mi>sin</mi><mo>,</mo> <mi>x</mi><mo>)</mo><mo>)</mo></math> is:
<systemitem><![CDATA[ 
      <OMBIND>
        <OMS cd="fns1" name="lambda"/>  
        <OMBVAR>
          <OMV name="x"/>
        </OMBVAR>  
        <OMA>
          <OMS cd="transc1" name="sin"/> 
          <OMV name="x"/>  
        </OMA>
      </OMBIND>
]]></systemitem> </para>
  
<para>Binders are defined in Content Dictionaries; in particular,
  the symbol <systemitem>lambda</systemitem> is defined in the Content Dictionary
  <systemitem>fns1</systemitem> for functions over functions.</para>
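<para>Several variables can be bound at once by listing them inside the
<systemitem>OMBVAR</systemitem> element. As a sketch, assuming a symbol <systemitem>plus</systemitem> in a
Content Dictionary named <systemitem>arith1</systemitem> (an assumption made for this example),
<math><mi mathvariant="bold">binding</mi><mo>(</mo><mi>lambda</mi><mo>,</mo><mi>x</mi><mo>,</mo><mi>y</mi><mo>,</mo><mi mathvariant="bold">application</mi><mo>(</mo><mi>plus</mi><mo>,</mo><mi>x</mi><mo>,</mo><mi>y</mi><mo>)</mo><mo>)</mo></math>
would be encoded as:
<systemitem><![CDATA[
      <OMBIND>
        <OMS cd="fns1" name="lambda"/>
        <OMBVAR>
          <OMV name="x"/>
          <OMV name="y"/>
        </OMBVAR>
        <OMA>
          <OMS cd="arith1" name="plus"/>
          <OMV name="x"/>
          <OMV name="y"/>
        </OMA>
      </OMBIND>
]]></systemitem></para>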
  
</listitem>
</varlistentry>
<varlistentry><term>Attributions</term><listitem><para>are encoded using the <systemitem>OMATTR</systemitem> element.  If
  the &OM; object <math><mi>e</mi></math> is attributed with (<math><msub><mi>s</mi><mn>1</mn></msub></math>, <math><msub><mi>e</mi><mn>1</mn></msub></math>), &#8230;, 
  (<math><msub><mi>s</mi><mi>n</mi></msub></math>, <math><msub><mi>e</mi><mi>n</mi></msub></math>) pairs (where <math><msub><mi>s</mi><mi>i</mi></msub></math> are the attributes), it is encoded
  as <systemitem>&lt;OMATTR></systemitem> <systemitem>&lt;OMATP></systemitem> <math><msub><mi>S</mi><mn>1</mn></msub></math> <math><msub><mi>C</mi><mn>1</mn></msub></math> &#8230; <math><msub><mi>S</mi><mi>n</mi></msub></math> <math><msub><mi>C</mi><mi>n</mi></msub></math> <systemitem>&lt;/OMATP></systemitem> <math><mi>E</mi></math> <systemitem>&lt;/OMATTR></systemitem> where <math><msub><mi>S</mi><mi>i</mi></msub></math> is the encoding of the
  symbol <math><msub><mi>s</mi><mi>i</mi></msub></math>, <math><msub><mi>C</mi><mi>i</mi></msub></math> of the object <math><msub><mi>e</mi><mi>i</mi></msub></math> and <math><mi>E</mi></math> is the encoding of
  <math><mi>e</mi></math>.</para>

<para>Examples are the use of attribution to decorate a group by its
  automorphism group:
<systemitem><![CDATA[
          <OMATTR>    
             <OMATP>
                  <OMS cd="groups" name="automorphism_group" />  
                  [..group-encoding..] 
             </OMATP>  
             [..group-encoding..] 
          </OMATTR>  
]]></systemitem> 
or to express the type of a variable:
<systemitem><![CDATA[ 
          <OMATTR>    
              <OMATP>
                   <OMS cd="ecc" name="type" /> 
                   <OMS cd="ecc" name="real" />
              </OMATP> 
              <OMV name="x" />
          </OMATTR>
]]></systemitem> </para>
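<para>To illustrate flattening in this encoding, consider a variable carrying two
layers of attribution. The symbol <systemitem>colour</systemitem> and the Content Dictionary
<systemitem>display</systemitem> are hypothetical names used only for this sketch:
<systemitem><![CDATA[
          <OMATTR>
              <OMATP>
                   <OMS cd="display" name="colour" />
                   <OMSTR>red</OMSTR>
              </OMATP>
              <OMATTR>
                  <OMATP>
                       <OMS cd="ecc" name="type" />
                       <OMS cd="ecc" name="real" />
                  </OMATP>
                  <OMV name="x" />
              </OMATTR>
          </OMATTR>
]]></systemitem>
Flattening yields the semantically equivalent single attribution, with the
pair from the inner layer listed first:
<systemitem><![CDATA[
          <OMATTR>
              <OMATP>
                   <OMS cd="ecc" name="type" />
                   <OMS cd="ecc" name="real" />
                   <OMS cd="display" name="colour" />
                   <OMSTR>red</OMSTR>
              </OMATP>
              <OMV name="x" />
          </OMATTR>
]]></systemitem></para>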

  
</listitem>
</varlistentry>
<varlistentry><term>Errors</term><listitem><para>are encoded using the <systemitem>OME</systemitem> element. The error whose
  symbol is <math><mi>s</mi></math> and whose arguments are the &OM; objects <math><msub><mi>e</mi><mn>1</mn></msub></math>,
  &#8230;, <math><msub><mi>e</mi><mi>n</mi></msub></math> is encoded as <systemitem>&lt;OME></systemitem> <math><msub><mi>C</mi><mi>s</mi></msub></math> <math><msub><mi>C</mi><mn>1</mn></msub></math>&#8230; <math><msub><mi>C</mi><mi>n</mi></msub></math> <systemitem>&lt;/OME></systemitem> where <math><msub><mi>C</mi><mi>s</mi></msub></math> is the encoding of <math><mi>s</mi></math> and <math><msub><mi>C</mi><mi>i</mi></msub></math> the encoding
  of <math><msub><mi>e</mi><mi>i</mi></msub></math>.</para>

<para>If an <systemitem>aritherror</systemitem> Content Dictionary contained a
  <systemitem>DivisionByZero</systemitem> symbol, then the object
  <math><mi mathvariant="bold">error</mi><mo>(</mo><mi>DivisionByZero</mi><mo>,</mo> <mi mathvariant="bold">application</mi>
  <mo>(</mo><mi>divide</mi><mo>,</mo> 
  <mi>x</mi><mo>,</mo> <mn>0</mn><mo>)</mo><mo>)</mo></math> would be encoded as follows:

<systemitem><![CDATA[ 
            <OME>
            <OMS cd="aritherror" name="DivisionByZero"/>  
            <OMA>
                 <OMS cd="arith1" name="divide" />
                 <OMV name="x"/>  
                 <OMI> 0 </OMI>
            </OMA> 
            </OME>
]]></systemitem></para>   
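<para>An error that consists only of a symbol has no argument elements. With the
same hypothetical <systemitem>aritherror</systemitem> Content Dictionary, a sketch of
<math><mi mathvariant="bold">error</mi><mo>(</mo><mi>DivisionByZero</mi><mo>)</mo></math> is:
<systemitem><![CDATA[
            <OME>
            <OMS cd="aritherror" name="DivisionByZero"/>
            </OME>
]]></systemitem></para>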
  
</listitem>
</varlistentry>
</variablelist>



</section>

<section id="xmldoc">
<title>Embedding OpenMath in XML Documents</title>

<sidebar revision="1999/09/21" author="DPC"><para>New section on embedding OM in XML documents</para></sidebar>     
<para>The above description of the &exml; encoding of &OM; specifies the grammar to be
used in files that encode a single &OM; object, and specifies the
character streams that a conforming &OM; application should be able
to accept or produce.</para>

<para>When embedding &exml; encoded &OM; objects into a larger XML document
one may wish, or need, to use other XML features. Examples are
extra &exml; attributes that specify &exml; Namespaces&#160;<citation>xmlns</citation>,
or xml:lang attributes that specify the language used in strings&#160;<citation>xml</citation>.
Also, the encoding used in the larger document may not be <acronym>utf-8</acronym>.</para>

<sidebar revision="2000/03/20" author="DPC"><para>Namespace URI, as discussed on OM Soc list</para></sidebar> 
<para>In particular, if &OM; is used with applications that use the XML
Namespace Recommendation&#160;<citation>xmlns</citation> then they should ensure
that &OM; elements are in the namespace
<phrase role="tt">http://www.openmath.org/OpenMath</phrase>.
This is most conveniently achieved by adding the namespace declaration
<literallayout>
xmlns="http://www.openmath.org/OpenMath"
</literallayout>
as an attribute to each <systemitem>OMOBJ</systemitem> element in the document.</para>

<para>If such &exml; features are used then the &exml; application controlling the
document must, if passing the &OM; fragment to an &OM; application,
remove any such extra attributes and must ensure that the
fragment is encoded according to the grammar specified above.</para>
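<para>The clean-up step described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not part of the standard: the function name <systemitem>strip_for_openmath</systemitem> is hypothetical, and the set of attributes that is kept (<systemitem>cd</systemitem>, <systemitem>name</systemitem>, <systemitem>dec</systemitem>, <systemitem>hex</systemitem>) is assumed to be the set defined by the &exml; grammar given earlier.</para>
<literallayout><![CDATA[
```python
import xml.etree.ElementTree as ET

OM_NS = "http://www.openmath.org/OpenMath"
# Assumption: the only attributes defined by the OpenMath XML grammar.
OM_ATTRIBUTES = {"cd", "name", "dec", "hex"}


def strip_for_openmath(omobj):
    """Serialise an OMOBJ element as utf-8 bytes, dropping namespace
    information and every attribute the OpenMath grammar does not define."""

    def clean(elem):
        # ElementTree spells a namespaced tag as "{uri}local".
        ns, _, local = elem.tag.rpartition("}")
        if ns and ns != "{" + OM_NS:
            raise ValueError("not an OpenMath element: " + elem.tag)
        copy = ET.Element(local)
        for key, value in elem.attrib.items():
            if key in OM_ATTRIBUTES:  # drops xml:lang and any other extras
                copy.set(key, value)
        copy.text = elem.text
        copy.tail = elem.tail
        for child in elem:
            copy.append(clean(child))
        return copy

    result = clean(omobj)
    result.tail = None  # text following the object belongs to the host document
    return ET.tostring(result, encoding="utf-8")
```
]]></literallayout>
<para>The host application would call such a function on each <systemitem>OMOBJ</systemitem> element it extracts, and pass the resulting <acronym>utf-8</acronym> byte string to the &OM; application.</para>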
</section>
</section>

<section id="sec_binary">
<title>The Binary Encoding</title>



<para>The binary encoding was designed to be more compact than
the &exml; encoding, so that it is more efficient when large
amounts of data are involved. The current encoding tries to
balance compactness, speed of encoding and
decoding, and simplicity (to allow a simple specification and easy
implementations).</para>

<section id="sec_binary_grammar">
<title>A Grammar for the Binary Encoding</title>



<sidebar revision="1999/06/24" author="DPC"><para>New attrvar production</para></sidebar>     

<figure id="fig_bin-enc">
    <title>Grammar of the binary encoding of &OM; objects.</title>
    
<informaltable>
<tgroup cols="3">
<tbody>
<row>
<entry>
start  </entry><entry> <math>&longrightarrow;</math> </entry><entry> [24] object [25] 
</entry>
</row>
<row>
<entry>

object </entry><entry> <math>&longrightarrow;</math> </entry><entry> integer 
</entry>
</row>
<row>
<entry>

       </entry><entry> <math><mi>|</mi></math>               </entry><entry> float 
</entry>
</row>
<row>
<entry>

       </entry><entry> <math><mi>|</mi></math>               </entry><entry> variable 
</entry>
</row>
<row>
<entry>

       </entry><entry> <math><mi>|</mi></math>               </entry><entry> symbol 
</entry>
</row>
<row>
<entry>

       </entry><entry> <math><mi>|</mi></math>               </entry><entry> string 
</entry>
</row>
<row>
<entry>

       </entry><entry> <math><mi>|</mi></math>               </entry><entry> bytearray 
</entry>
</row>
<row>
<entry>

       </entry><entry> <math><mi>|</mi></math>               </entry><entry> construct 
</entry>
</row>
<row>
<entry>

integer </entry><entry><math>&longrightarrow;</math> </entry><entry> [1] [_] 
</entry>
</row>
<row>
<entry>

        </entry><entry><math><mi>|</mi></math>               </entry><entry> [1 + 128] {_} 
</entry>
</row>
<row>
<entry>

        </entry><entry><math><mi>|</mi></math>               </entry><entry> [2] [n] [_] digits:n 
</entry>
</row>
<row>
<entry>

        </entry><entry><math><mi>|</mi></math>               </entry><entry> [2 + 128] {n} [_] digits:n 
</entry>
</row>
<row>
<entry>

float  </entry><entry> <math>&longrightarrow;</math> </entry><entry> [3] {_} {_} 
</entry>
</row>
<row>
<entry>

variable </entry><entry> <math>&longrightarrow;</math> </entry><entry> [5] [n] varname:n 
</entry>
</row>
<row>
<entry>

         </entry><entry> <math><mi>|</mi></math>               </entry><entry> [5 + 128] {n} varname:n 
</entry>
</row>
<row>
<entry>

         </entry><entry> <math><mi>|</mi></math>               </entry><entry> [5 + 64] [n] 
</entry>
</row>
<row>
<entry>

symbol </entry><entry>  <math>&longrightarrow;</math> </entry><entry> [8] [n] [m] cdname:n symbname:m 
</entry>
</row>
<row>
<entry>

       </entry><entry> <math><mi>|</mi></math>                </entry><entry> [8 + 128] {n} {m} cdname:n symbname:m 
</entry>
</row>
<row>
<entry>

       </entry><entry> <math><mi>|</mi></math>                </entry><entry> [8 + 64] [n] 
</entry>
</row>
<row>
<entry>

string </entry><entry> <math>&longrightarrow;</math> </entry><entry> [6] [n] chars:n 
</entry>
</row>
<row>
<entry>

       </entry><entry> <math><mi>|</mi></math>               </entry><entry> [6 + 128] {n} chars:n 
</entry>
</row>
<row>
<entry>

       </entry><entry> <math><mi>|</mi></math>               </entry><entry> [7] [n] chars:2n 
</entry>
</row>
<row>
<entry>

       </entry><entry> <math><mi>|</mi></math>               </entry><entry> [7 + 128] {n} chars:2n 
</entry>
</row>
<row>
<entry>

       </entry><entry> <math><mi>|</mi></math>               </entry><entry> [7 + 64] [n] 
</entry>
</row>
<row>
<entry>

bytearray </entry><entry> <math>&longrightarrow;</math> </entry><entry> [4] [n] bytes:n 
</entry>
</row>
<row>
<entry>

          </entry><entry> <math><mi>|</mi></math>               </entry><entry> [4 + 128] {n} bytes:n 
</entry>
</row>
<row>
<entry>

construct </entry><entry> <math>&longrightarrow;</math> </entry><entry> [16] object objects [17] 
</entry>
</row>
<row>
<entry>

          </entry><entry> <math><mi>|</mi></math>               </entry><entry> [22] symbol objects [23] 
</entry>
</row>
<row>
<entry>

          </entry><entry> <math><mi>|</mi></math>               </entry><entry> [18] attrpairs object [19] 
</entry>
</row>
<row>
<entry>

          </entry><entry> <math><mi>|</mi></math>               </entry><entry> [26] object bvars object [27] 
</entry>
</row>
<row>
<entry>

attrpairs </entry><entry> <math>&longrightarrow;</math> </entry><entry> [20] pairs [21]
</entry>
</row>
<row>
<entry>

pairs     </entry><entry>  <math>&longrightarrow;</math> </entry><entry> symbol object 
</entry>
</row>
<row>
<entry>

          </entry><entry> <math><mi>|</mi></math>                </entry><entry> symbol object pairs
</entry>
</row>
<row>
<entry>

bvars    </entry><entry> <math>&longrightarrow;</math> </entry><entry> [28] vars [29] 
</entry>
</row>
<row>
<entry>

vars     </entry><entry> <math>&longrightarrow;</math> </entry><entry> attrvar 
</entry>
</row>
<row>
<entry>

         </entry><entry>  <math><mi>|</mi></math>              </entry><entry> attrvar vars 
</entry>
</row>
<row>
<entry>

attrvar  </entry><entry> <math>&longrightarrow;</math> </entry><entry> variable 
</entry>
</row>
<row>
<entry>

         </entry><entry>  <math><mi>|</mi></math>              </entry><entry> [18] attrpairs attrvar [19] 
</entry>
</row>
<row>
<entry>

objects  </entry><entry> <math>&longrightarrow;</math> </entry><entry> 
</entry>
</row>
<row>
<entry>

         </entry><entry> <math><mi>|</mi></math>              </entry><entry> object objects 
</entry>
</row>
</tbody>
</tgroup>
</informaltable>


</figure>







<para><xref linkend="fig_bin-enc"/> gives a grammar for the binary encoding.  The
following conventions are used in this section: [<math><mi>n</mi></math>] denotes a byte
whose value is the integer <math><mi>n</mi></math> (<math><mi>n</mi></math> can range from 0 to 255), {<math><mi>m</mi></math>}
denotes four bytes representing the (unsigned) integer <math><mi>m</mi></math> in network
byte order, [_] denotes an arbitrary byte, {_} denotes an
arbitrary sequence of four bytes.  <emphasis>name</emphasis>:<math><mi>n</mi></math> denotes a sequence
of <math><mi>n</mi></math> bytes named <emphasis>name</emphasis>.  <emphasis>name</emphasis>:2<math><mi>n</mi></math> denotes a sequence of
<math><mn>2</mn><mi>n</mi></math> bytes.  <quote>start</quote> is the start symbol of the grammar.</para>

</section>

<section id="sec_bin-desc">
<title>Description of the Grammar</title>



<para>An &OM; object is encoded as a sequence of bytes starting with the
begin object tag (value&#160;24) and ending with the end object tag
(value&#160;25). These are similar to the <systemitem>&lt;OMOBJ></systemitem> and <systemitem>&lt;/OMOBJ></systemitem> tags of
the &exml; encoding.</para>

<para>The encoding of each kind of &OM; object begins with a tag that is a
single byte, holding a <phrase role="sl">token identifier</phrase> and two flags, the <phrase role="sl">long</phrase> flag and the <phrase role="sl">shared</phrase> flag. The identifier is stored in
the six low-order bits (bits 1 to 6). The long flag is the eighth bit (value&#160;128) and the
shared flag is the seventh bit (value&#160;64); these are the values added to the tags in <xref linkend="fig_bin-enc"/>.</para>
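<para>As a small illustration of this tag layout, the following Python sketch splits a tag byte into its three parts and builds one again. The names <systemitem>split_tag</systemitem> and <systemitem>make_tag</systemitem> are hypothetical and are not defined by this standard.</para>
<literallayout><![CDATA[
```python
LONG_FLAG = 0x80    # eighth bit
SHARED_FLAG = 0x40  # seventh bit
ID_MASK = 0x3F      # bits 1 to 6: the token identifier


def split_tag(tag):
    """Return (token identifier, long flag, shared flag) for one tag byte."""
    if not 0 <= tag <= 255:
        raise ValueError("a tag is a single byte")
    return tag & ID_MASK, bool(tag & LONG_FLAG), bool(tag & SHARED_FLAG)


def make_tag(identifier, long=False, shared=False):
    """Build a tag byte from a token identifier and the two flags."""
    if not 0 <= identifier <= ID_MASK:
        raise ValueError("the token identifier must fit in six bits")
    return identifier | (LONG_FLAG if long else 0) | (SHARED_FLAG if shared else 0)
```
]]></literallayout>
<para>For instance, the byte <systemitem>0x81</systemitem> splits into the small integer identifier 1 with the long flag set, and a shared variable reference uses the tag <systemitem>make_tag(5, shared=True)</systemitem>.</para>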

<para>Here is a description of the binary encodings of every kind of &OM; 
object:


<variablelist>
<varlistentry><term>Integers</term><listitem><para>are encoded depending on how large they are. There are
  four possible formats.  Integers between -128 and 127 are encoded as
  the small integer tag (1) followed by a single byte that is the
  value of the integer (interpreted as a signed character). For
  example 16 is encoded as <systemitem>0x01 0x10</systemitem>.  Integers between
  <math><msup><mn>-2</mn><mn>31</mn></msup></math> (<math><mn>-2147483648</mn></math>) and <math><msup><mn>2</mn><mn>31</mn></msup> <mo>-</mo> <mn>1</mn></math> (<math><mn>2147483647</mn></math>) are encoded as
  the small integer tag with the long flag set followed by the integer
  encoded in big endian format on four bytes (network byte order:
  the most significant byte comes first). For example, 128 is encoded
  as <systemitem>0x81</systemitem> <systemitem>0x00000080</systemitem>.  The most general encoding begins
  with the big integer tag (token identifier 2) with the long flag set
  if the number of bytes in the encoding of the digits is greater than or
  equal to 256. It is followed by the length (in bytes) of the
  sequence of digits, encoded on one byte (0 to 255, if the long flag
  was not set) or four bytes (network byte order, if the long flag was
  set).  It is then followed by a byte describing the sign and the
  base.  This <quote>sign/base</quote> byte is <systemitem>+</systemitem> (0x2B) or <systemitem>-</systemitem> (0x2D)
  for the sign, combined by bitwise OR with the base mask bits, which are 0 for base 10
  or 0x40 for base 16.  It is followed by the string of digits (as
  characters) in their natural order (as in the &exml;
  encoding).  For example, 8589934592 (<math><msup><mn>2</mn><mn>33</mn></msup></math>) is encoded <systemitem>0x02
    0x0A 0x2B 0x38353839393334353932</systemitem> and xf&zsp;f&zsp;f&zsp;f&zsp;f&zsp;f&zsp;f1 is
  encoded as <systemitem>0x02 0x08 0x6b 0x6666666666666631</systemitem>.  Note that it is
  permitted to encode a <quote>small</quote> integer in any <quote>bigger</quote> format.</para>
  
</listitem>
</varlistentry>
<varlistentry><term>Symbols</term><listitem><para>are encoded as the symbol tag (8) with the long flag
  set if the maximum of the length of the Content Dictionary name and
  the symbol name is greater than or equal to 256 (note that this
  should never be the case if the rules on symbols and Content
  Dictionary names are applied), then followed by the length of the
  Content Dictionary name as a byte (if the long flag was not set)
  or a four byte integer (in network byte order) followed by the
  length of the symbol name as a byte (if the long flag was not set)
  or a four byte integer (in network byte order), followed by the
  characters of the Content Dictionary name, followed by the
  characters of the symbol name.</para>

  
</listitem>
</varlistentry>
<varlistentry><term>Variables</term><listitem><para>are encoded using the variable tag (5) with the long
  flag set if the number of bytes (characters) in the variable name is
  greater than or equal to 256 (this should never happen if the rules
  on variables are followed).  Then, there is the number of characters
  as a byte (if the long flag was not set) or a four byte integer
  (in network byte order), followed by the characters of the name of
  the variable. For example, the variable x is encoded as <systemitem>0x05
    0x01 0x78</systemitem>.</para>

  
</listitem>
</varlistentry>
<varlistentry><term>Floating-point numbers</term><listitem><para>are encoded using the floating-point
  number tag (3) followed by eight bytes that are the IEEE 754
  representation&#160;<citation>ieee754_85</citation>, most significant bytes first. For
  example, 0.1 is encoded as <systemitem>0x03 0x3fb999999999999a</systemitem>.</para>

  
</listitem>
</varlistentry>
<varlistentry><term>Character strings</term><listitem><para>are encoded in two ways depending on whether
  every character of the string fits in 8 bits or not. If the
  string contains only 8 bit characters, it is encoded as the one
  byte character string tag (6) with the long flag set if the number
  of bytes (characters) in the string is greater than or equal to 256.
  Then, there is the number of characters as a byte (if the long
  flag was not set) or a four byte integer (in network byte order),
  followed by the characters in the string. If the string contains two
  byte characters, it is encoded as the two byte character string
  tag (7) with the long flag set if the number of characters in the
  string is greater than or equal to 256. Then, there is the number of
  characters as a byte (if the long flag was not set) or a four byte
  integer (in network byte order), followed by the characters
  (<acronym>utf-16</acronym> encoded Unicode).</para>
 
  
</listitem>
</varlistentry>
<varlistentry><term>Bytearrays</term><listitem><para>are encoded using the bytearray tag (4) with the
  long flag set if the number of elements is
  greater than or equal to 256. Then, there is the number of elements,
  as a byte (if the long flag was not set) or a four byte integer
  (in network byte order), followed by the elements of the array in
  their normal order.</para>
  
  
</listitem>
</varlistentry>
<varlistentry><term>Applications</term><listitem><para>are encoded using the application tag (16). More
  precisely, the application of <math><msub><mi>E</mi><mn>0</mn></msub></math> to <math><msub><mi>E</mi><mn>1</mn></msub></math>&#8230; <math><msub><mi>E</mi><mi>n</mi></msub></math> is encoded
  using the application tag (16), the sequence of the encodings of
  <math><msub><mi>E</mi><mn>0</mn></msub></math> to <math><msub><mi>E</mi><mi>n</mi></msub></math> and the end application tag (17).</para>
    
</listitem>
</varlistentry>
<varlistentry><term>Bindings</term><listitem><para>are encoded using the binding tag (26). More
  precisely, the binding by <math><mi>B</mi></math> of variables <math><msub><mi>V</mi><mn>1</mn></msub></math>&#8230; <math><msub><mi>V</mi><mi>n</mi></msub></math> in
  <math><mi>C</mi></math> is encoded as the binding tag (26), followed by the encoding of
  <math><mi>B</mi></math>, followed by the binding variables tag (28), followed by the
  encodings of the variables <math><msub><mi>V</mi><mn>1</mn></msub></math> &#8230; <math><msub><mi>V</mi><mi>n</mi></msub></math>, followed by the end
  binding variables tag (29), followed by the encoding of <math><mi>C</mi></math>,
  followed by the end binding tag (27).</para>

  
</listitem>
</varlistentry>
<varlistentry><term>Attributions</term><listitem><para>are encoded using the attribution tag (18). More
  precisely, attribution of the object <math><mi>E</mi></math> with (<math><msub><mi>S</mi><mn>1</mn></msub></math>, <math><msub><mi>E</mi><mn>1</mn></msub></math>),
  <math><mi>&#8230;</mi></math> (<math><msub><mi>S</mi><mi>n</mi></msub></math>, <math><msub><mi>E</mi><mi>n</mi></msub></math>) pairs (where <math><msub><mi>S</mi><mi>i</mi></msub></math> are the attributes) is
  encoded as the attributed object tag (18), followed by the encoding
  of the attribute pairs as the attribute pairs tag (20), followed by
  the encoding of each symbol and value, followed by the end attribute
  pairs tag (21), followed by the encoding of <math><mi>E</mi></math>, followed by the end
  attributed object tag (19).</para>

    
</listitem>
</varlistentry>
<varlistentry><term>Errors</term><listitem><para>are encoded using the error tag (22). More precisely,
  the error whose symbol is <math><msub><mi>S</mi><mn>0</mn></msub></math> and whose arguments are <math><msub><mi>E</mi><mn>1</mn></msub></math>&#8230; <math><msub><mi>E</mi><mi>n</mi></msub></math> is encoded as the error tag (22),
  the encoding of <math><msub><mi>S</mi><mn>0</mn></msub></math>, the sequence of the encodings of <math><msub><mi>E</mi><mn>1</mn></msub></math> to
  <math><msub><mi>E</mi><mi>n</mi></msub></math> and the end error tag (23).</para>

</listitem>
</varlistentry>
</variablelist> 
</para>
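<para>The integer and floating-point formats described above can be sketched in Python as follows. This is an illustration only: the function names are hypothetical, the sketch always picks the most compact integer format and only writes base 10 digits, and it assumes that negative small integers use the usual two's complement representation (the text only says <quote>signed</quote>). Floats are written most significant byte first, as stated in the description.</para>
<literallayout><![CDATA[
```python
import struct


def encode_integer(value):
    """Encode an int in the most compact of the four integer formats."""
    if -128 <= value <= 127:
        # small integer tag (1), one signed byte
        return bytes([0x01]) + struct.pack(">b", value)
    if -2**31 <= value <= 2**31 - 1:
        # small integer tag with the long flag, four bytes, network byte order
        return bytes([0x01 | 0x80]) + struct.pack(">i", value)
    digits = str(abs(value)).encode("ascii")
    sign_base = 0x2D if value < 0 else 0x2B  # '-' or '+'; base 10 mask is 0
    if len(digits) < 256:
        # big integer tag (2), one byte length
        return bytes([0x02, len(digits), sign_base]) + digits
    # big integer tag with the long flag, four byte length
    return (bytes([0x02 | 0x80]) + struct.pack(">I", len(digits))
            + bytes([sign_base]) + digits)


def encode_float(value):
    """Encode a float as tag 3 followed by the 8-byte IEEE 754 double."""
    return bytes([0x03]) + struct.pack(">d", value)
```
]]></literallayout>
<para>With these definitions, <systemitem>encode_integer(16)</systemitem> yields the two bytes <systemitem>0x01 0x10</systemitem> and <systemitem>encode_integer(128)</systemitem> yields <systemitem>0x81 0x00000080</systemitem>, in agreement with the examples in the description of integers.</para>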
<section id="sec_sharing">
<title>Sharing</title>
 
<para>This binary encoding supports the sharing of symbols, variables and
strings (up to a certain length for strings) within one object; sharing
between objects is not supported.  A reference to a shared
symbol, variable or string is encoded as the corresponding tag with
the long flag not set and the shared flag set, followed by a non-negative
integer <math><mi>n</mi></math> coded on one byte (0 to 255). This integer references the
<math><mo>(</mo><mi>n</mi> <mo>+</mo> <mn>1</mn><mo>)</mo></math>-th such sharable sub-object (symbol, variable or string up to
255 characters) in the current &OM; object (counted in the order they
are generated by the encoding).  For example, <systemitem>0x48 0x01</systemitem>
references a symbol that is identical to the second symbol that was
found in the current object.  Strings with 8 bit characters and
strings with 16 bit characters are two different kinds of objects for
this sharing. Only strings containing fewer than 256 characters can be
shared.</para>
</section>
</section>

<section id="sec_impl_note">
<title>Implementation Note</title>
<para>A typical implementation of the binary encoding uses four tables, each
of 256 entries, for symbols, variables, 8 bit character strings whose
lengths are less than 256 characters and 16 bit character strings
whose lengths are less than 256 characters.  When an object is read,
all the tables are first flushed. Each time a sharable sub-object is
read, it is entered in the corresponding table if it is not full. When
a reference to the shared i-th object of a given type is read, it
stands for the i-th entry in the corresponding table. It is an
encoding error if the i-th position in the table has not already been
assigned (i.e. forward references are not allowed).  Sharing is not
mandatory, so there may be duplicate entries in the tables (if the
application that wrote the object chose not to share optimally).</para>

<para>Writing an object is simple. The tables are first flushed. Each time a
sharable sub-object is encountered (in the natural order of output
given by the encoding), it is either entered in the corresponding
table (if it is not full) and output in the normal way or replaced by
the right reference if it is already present in the table.</para>
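<para>The writing procedure can be sketched in Python as follows. The sketch is deliberately partial and rests on assumptions that are not part of this standard: objects are represented by hypothetical tuples <systemitem>("sym", cd, name)</systemitem>, <systemitem>("var", name)</systemitem> and <systemitem>("app", [objects])</systemitem>, only symbols, variables and applications are handled, and names of 256 bytes or more are rejected rather than written with the long flag.</para>
<literallayout><![CDATA[
```python
SHARED_FLAG = 0x40


class Writer:
    """Minimal writer with sharing tables for symbols and variables."""

    def __init__(self):
        self.out = bytearray()
        self.tables = {"sym": [], "var": []}  # flushed for every object

    def _reference(self, kind, tag, key):
        """Emit a shared reference if key is known, else record it."""
        table = self.tables[kind]
        if key in table:
            self.out += bytes([tag | SHARED_FLAG, table.index(key)])
            return True
        if len(table) < 256:  # entries are only added while the table is not full
            table.append(key)
        return False

    def write(self, obj):
        kind = obj[0]
        if kind == "var":
            name = obj[1].encode("ascii")
            if len(name) > 255:
                raise ValueError("long names are not handled in this sketch")
            if not self._reference("var", 0x05, name):
                self.out += bytes([0x05, len(name)]) + name
        elif kind == "sym":
            cd, name = obj[1].encode("ascii"), obj[2].encode("ascii")
            if max(len(cd), len(name)) > 255:
                raise ValueError("long names are not handled in this sketch")
            if not self._reference("sym", 0x08, (cd, name)):
                self.out += bytes([0x08, len(cd), len(name)]) + cd + name
        elif kind == "app":
            self.out.append(0x10)  # begin application tag (16)
            for child in obj[1]:
                self.write(child)
            self.out.append(0x11)  # end application tag (17)
        else:
            raise ValueError("unsupported kind: %r" % (kind,))


def encode_object(obj):
    """Wrap the encoding of obj in the begin and end object tags (24, 25)."""
    writer = Writer()
    writer.out.append(0x18)
    writer.write(obj)
    writer.out.append(0x19)
    return bytes(writer.out)
```
]]></literallayout>
<para>Because a fresh <systemitem>Writer</systemitem> is created for every object, the tables start empty each time, which corresponds to the flushing step described above.</para>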
</section>

<section id="sec_bin_example">
<title>Example of Binary Encoding</title>

<para>As an example of this binary encoding, we can consider the &OM; object
whose &exml; encoding is
<literallayout><![CDATA[ 
<OMOBJ>
  <OMA>
    <OMS name="times" cd="arith1"/>
    <OMA>
      <OMS name="plus" cd="arith1"/>
      <OMV name="x"/>
      <OMV name="y"/>
    </OMA>
    <OMA>
      <OMS name="plus" cd="arith1"/>
      <OMV name="x"/>
      <OMV name="z"/>
    </OMA>
  </OMA>
</OMOBJ>
]]></literallayout> 
It is binary encoded as the sequence of bytes given by  the following table.</para>

<informaltable>
<tgroup cols="4">
<thead>
<row>
<entry>
Hex </entry><entry>    Meaning </entry><entry> Hex </entry><entry>    Meaning 
</entry>
</row>
</thead>
<tbody>
<row>
<entry>
18  </entry><entry>    begin object tag   </entry><entry>   
68  </entry><entry>   h  .
</entry>
</row>
<row>
<entry>
 

10  </entry><entry>    begin application tag</entry><entry>
31  </entry><entry>   1  .)
</entry>
</row>
<row>
<entry>


08  </entry><entry>   symbol tag </entry><entry>
70  </entry><entry>   p (symbol name begin
</entry>
</row>
<row>
<entry>


06  </entry><entry>    cd length </entry><entry>
6c  </entry><entry>   l  .
</entry>
</row>
<row>
<entry>


05  </entry><entry>   name length</entry><entry>
75  </entry><entry>   u  . 
</entry>
</row>
<row>
<entry>


61  </entry><entry>    a (cd name begin</entry><entry>
73  </entry><entry>   s  .) 
</entry>
</row>
<row>
<entry>


72  </entry><entry>    r  .</entry><entry>
05  </entry><entry>   variable tag 
</entry>
</row>
<row>
<entry>


69  </entry><entry>   i  .</entry><entry>
01  </entry><entry>   name length 
</entry>
</row>
<row>
<entry>


74  </entry><entry>   t  .</entry><entry>
78  </entry><entry>   x (name) 
</entry>
</row>
<row>
<entry>


68  </entry><entry>   h  .</entry><entry>
05  </entry><entry>   variable tag 
</entry>
</row>
<row>
<entry>


31  </entry><entry>   1  .)</entry><entry>
01  </entry><entry>   name length 
</entry>
</row>
<row>
<entry>


74  </entry><entry>   t (symbol name begin</entry><entry>
79  </entry><entry>   y (variable name) 
</entry>
</row>
<row>
<entry>


69  </entry><entry>   i  .</entry><entry>
11  </entry><entry>   end application tag 
</entry>
</row>
<row>
<entry>


6d  </entry><entry>   m  .</entry><entry>
10  </entry><entry>   begin application tag 
</entry>
</row>
<row>
<entry>


65  </entry><entry>   e  .</entry><entry>
48  </entry><entry>   symbol tag (with share bit on) 
</entry>
</row>
<row>
<entry>


73  </entry><entry>   s  .)</entry><entry>
01  </entry><entry>   reference to second symbol seen (arith1:plus) 
</entry>
</row>
<row>
<entry>


10  </entry><entry>   begin application tag</entry><entry>
45  </entry><entry>   variable tag (with share bit on) 
</entry>
</row>
<row>
<entry>


08  </entry><entry>   symbol tag</entry><entry>
00  </entry><entry>   reference to first variable seen (x) 
</entry>
</row>
<row>
<entry>


06  </entry><entry>   cd length</entry><entry>
05  </entry><entry>   variable tag 
</entry>
</row>
<row>
<entry>


04  </entry><entry>   name length</entry><entry>
01  </entry><entry>   name length 
</entry>
</row>
<row>
<entry>


61  </entry><entry>   a (cd name begin</entry><entry>
7a  </entry><entry>   z (variable name) 
</entry>
</row>
<row>
<entry>


72  </entry><entry>   r  .</entry><entry>
11  </entry><entry>   end application tag 
</entry>
</row>
<row>
<entry>


69  </entry><entry>   i  .</entry><entry>
11  </entry><entry>   end application tag 
</entry>
</row>
<row>
<entry>


74  </entry><entry>   t  .</entry><entry>
19  </entry><entry>   end object tag 
</entry>
</row>
</tbody>
</tgroup>
</informaltable> 
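<para>Reading such a byte sequence back can be sketched in Python as follows. As before this is only an illustration under assumptions that are not part of this standard: the function name <systemitem>decode_object</systemitem> and the tuple representation of the result are hypothetical, and only symbols, variables and applications are handled.</para>
<literallayout><![CDATA[
```python
def decode_object(data):
    """Decode one binary encoded object into nested tuples."""
    tables = {5: [], 8: []}  # sharing tables for variables (5) and symbols (8)
    pos = 0

    def byte():
        nonlocal pos
        value = data[pos]
        pos += 1
        return value

    def length(long):
        # one byte, or four bytes in network byte order if the long flag is set
        nonlocal pos
        size = 4 if long else 1
        value = int.from_bytes(data[pos:pos + size], "big")
        pos += size
        return value

    def take(n):
        nonlocal pos
        chunk = data[pos:pos + n]
        if len(chunk) != n:
            raise ValueError("truncated input")
        pos += n
        return chunk.decode("ascii")

    def read():
        tag = byte()
        ident, long, shared = tag & 0x3F, bool(tag & 0x80), bool(tag & 0x40)
        if ident in tables and shared:
            index = byte()
            if index >= len(tables[ident]):
                raise ValueError("forward references are not allowed")
            return tables[ident][index]
        if ident == 5:
            obj = ("var", take(length(long)))
        elif ident == 8:
            n, m = length(long), length(long)
            obj = ("sym", take(n), take(m))
        elif ident == 16:
            children = []
            while data[pos] != 17:  # until the end application tag
                children.append(read())
            byte()
            return ("app", children)
        else:
            raise ValueError("unsupported tag: 0x%02x" % tag)
        if len(tables[ident]) < 256:  # enter sharable sub-objects as they are read
            tables[ident].append(obj)
        return obj

    if byte() != 24:
        raise ValueError("missing begin object tag")
    result = read()
    if byte() != 25:
        raise ValueError("missing end object tag")
    return result
```
]]></literallayout>
<para>Applied to the bytes of the table above, such a reader resolves the references <systemitem>0x48 0x01</systemitem> and <systemitem>0x45 0x00</systemitem> to the symbol <systemitem>plus</systemitem> of <systemitem>arith1</systemitem> and to the variable x respectively.</para>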

</section>
</section>

<section id="sec_enc_summary">
<title>Summary</title>

<para>The key points of this chapter are:
<itemizedlist>
<listitem><para>The &exml; encoding for &OM; objects uses most common
  character sets.</para>
</listitem>
<listitem><para>The &exml; encoding is readable, writable and can be
  embedded in most documents and transport protocols.</para>
</listitem>
<listitem><para>The binary encoding for &OM; objects should be used when
  efficiency is a key issue. It is compact yet simple enough to allow
  fast encoding and decoding of objects.</para>
</listitem>
</itemizedlist>
</para>
</section>
</chapter>

<chapter id="cha_cd">
<title>Content Dictionaries</title>


<para>In this chapter we give a brief overview of Content Dictionaries
before explicitly stating their functionality and encoding.</para>
<section id="sec_cd_summary">
<title>Introduction</title>

<para>Content Dictionaries (CDs) are central to the &OM; philosophy of
transmitting mathematical information. It is the &OM; Content
Dictionaries which actually hold the meanings of the objects being
transmitted.</para>

<para>For example if application <math><mi>A</mi></math> is talking to application <math><mi>B</mi></math>, and
sends, say, an equation involving multiplication of matrices, then <math><mi>A</mi></math>
and <math><mi>B</mi></math> must agree on what a matrix is, and on what matrix
multiplication is, and even on what constitutes an equation. All this
information is held within some Content Dictionaries which both
applications agree upon.</para>

<para>A <emphasis> Content Dictionary</emphasis> holds the meanings of
(various) mathematical <quote>words</quote>. These words are &OM; basic objects
referred to as <emphasis>symbols</emphasis> in <xref linkend="sec_omabs"/>.</para>

<para>With a set of symbol definitions (perhaps from several Content
Dictionaries), <math><mi>A</mi></math> and <math><mi>B</mi></math> can now talk in a common <quote>language</quote>.</para>

<para>It is important to stress that it is not Content Dictionaries
themselves which are being passed, but some <quote>mathematics</quote> whose
definitions are held within the Content Dictionaries. This means that
the applications must have already agreed on a set of Content
Dictionaries which they <quote>understand</quote> (i.e., can cope with to some
degree).</para>

<sidebar revision="1999/10/04" author="DPC"><para>Rephrase slightly</para></sidebar>
<para>In many cases, the Content
Dictionaries that an application understands will be constant, and be
intrinsic to the application's mathematical use. However the above
approach can also be used for applications which can handle
every Content Dictionary (such as an &OM; parser, or perhaps a
typesetting system), or alternatively for applications which
understand a changeable number of Content Dictionaries (perhaps after
being sent Content Dictionaries in some way).</para>

<para>The primary use of Content Dictionaries is thought to be for designers
of Phrasebooks, the programs which translate between the
&OM; mathematical object and the corresponding (often internal)
structure of the particular application in question. For such a use
the Content Dictionaries have themselves been designed to be as
readable and precise as possible.</para>

<para>Another possible use for &OM; Content Dictionaries could rely on their
automatic comprehension by a machine (e.g., when given definitions of
objects defined in terms of previously understood ones), in which case
Content Dictionaries may have to be passed as data. Towards this end,
a Content Dictionary has been written which contains a set of symbols
sufficient to represent any other Content Dictionary. This means that
Content Dictionaries may be passed in the same way as other (&OM;)
mathematical data.</para>

<sidebar revision="1999/08/24" author="OC"><para>More motivation on design of CDs</para></sidebar>
<para>Finally, the syntax of the Content Dictionaries has been designed to
be relatively easy to learn and to write, and also free from the need
for any specialist software. This is because it is acknowledged that
there is an enormous amount of mathematical information to represent,
and so most of the Content Dictionaries will be written by
<quote>ordinary</quote> mathematicians, encoding their particular fields of
expertise. 
A further reason is that the mathematics conveyed by a
specific Content Dictionary should be understandable independently of
any application.</para>

<para>The key points from this section are:

<itemizedlist>
<listitem><para>Content Dictionaries should be readable and precise to help
  Phrasebook designers,</para>
</listitem>
<listitem><para>Content Dictionaries should be readily write-able to encourage
  widespread use,</para>
</listitem>
<listitem><para>It ought to be possible for a machine to understand a Content
  Dictionary to some degree.</para>
</listitem>
</itemizedlist>
</para>
</section>

<section id="sect_func">
<title>Content Dictionaries</title>

<para>In this section we define the overall structure of Content
Dictionaries.</para>

<sidebar revision="1999/08/24" author="OC"><para>New paragraph to reflect recent changes</para></sidebar>
<para>Other than Content Dictionary comments (which have no real semantics),
Content Dictionaries have been designed to hold two types of
information: that which is pertinent to the whole Content Dictionary,
and that which is restricted to a particular symbol definition.
Specific information pertaining to the symbols like the signature and
the defining mathematical properties is conveyed in additional files
associated to Content Dictionaries.</para>

<para>Information that is pertinent to the whole Content Dictionary
includes:
<itemizedlist>
<listitem><para>The name of the Content Dictionary.</para>
</listitem>
<listitem><para>A description of the Content Dictionary.</para>
</listitem>
<listitem><para>A date when the Content Dictionary is next planned to be reviewed.</para>
</listitem>
<listitem><para>A date on which the Content Dictionary was last edited.</para>
</listitem>
<listitem><para>The current version and revision  numbers of the Content Dictionary.</para>
</listitem>
<listitem><para>The status of the Content Dictionary.</para>
</listitem>
<listitem><para>An optional URL for this Content Dictionary.</para>
</listitem>
<listitem><para>An optional list of Content Dictionaries on which this Content
  Dictionary depends. That is, those named in Examples and FMP
  in this Content Dictionary.</para>
</listitem>
<listitem><para>An optional comment, possibly containing the author's name.</para>
</listitem>
</itemizedlist>
</para>

<para>Information that is restricted to a particular symbol includes:
<itemizedlist>
<listitem><para>The name of the symbol.</para>
</listitem>
<listitem><para>A description of this symbol.</para>
</listitem>
<listitem><para>An optional comment.</para>
</listitem>
<listitem><para>Optional properties that this symbol should obey.</para>
</listitem>
<listitem><para>Optional examples of the use of this symbol.</para>
 </listitem>
</itemizedlist>
</para>
<sidebar revision="1999/08/24" author="OC"><para>removed refs to old changes</para></sidebar>
<sidebar revision="1999/06/22" author="OC"><para>new paragraph</para></sidebar>
<sidebar revision="1999/08/24" author="OC"><para>Defmp added</para></sidebar>
<sidebar revision="1999/10/04" author="DPC"><para>Rephrase slightly</para></sidebar>

<para>As mentioned earlier, certain kinds of data pertaining to symbols may
be conveyed in files other than a Content Dictionary.  In particular,
information on signatures according to a type system
may be described in <emphasis>Signature Files</emphasis> whose format is given in
<xref linkend="sigfiles"/>. Other information, such as
presentation forms or extra defining mathematical properties, may be
associated with Content Dictionaries using files whose format is not
specified by this standard. It is expected that a common method
of defining the presentation for &OM; symbols will be via
<acronym>xsl</acronym>&#160;<citation>XSL_99</citation> stylesheets giving transformations to MathML.</para>

<sidebar revision="2000/04/10" author="DPC">
<para>MathML 2</para>
</sidebar>
<para>Content Dictionaries may be grouped into <emphasis>CD Groups</emphasis>. These
groups allow applications to easily refer to collections of Content
Dictionaries. One particular CDGroup of interest is the <quote>MathML
CDGroup</quote>. This group expresses the collection of the core Content
Dictionaries that is designed to have the same semantic scope as the
content elements of MathML&#160;2&#160;<citation>MathML_2000</citation>.
 &OM; objects built from
symbols that come from Content Dictionaries in this CDGroup may be
expected to be easily transformed between &OM; and MathML encodings.
The detailed structure of a CDGroup is described in
section&#160;<xref linkend="ssec_cdgroups"/> below.</para>

</section>

<section id="sec_xml_cd">
<title>The XML Encoding for Content Dictionaries</title>




<para>Content Dictionaries are XML documents.  A valid Content Dictionary
document should 
<itemizedlist>
<listitem><para>be valid according to  the DTD given in <xref linkend="fig_cd-dtd"/>,</para>
</listitem>
<listitem><para>adhere to the extra conditions on the content of the elements
  given in <xref linkend="sect_pcdata"/>.</para>
</listitem>
</itemizedlist>
</para>



<para>An example of a complete Content Dictionary is given in
Appendix&#160;<xref linkend="app_cdcd"/>, which is the <systemitem>Meta</systemitem> Content Dictionary
for describing Content Dictionaries themselves. A more typical Content
Dictionary is given in Appendix&#160;<xref linkend="arith1.ocd"/>, the <systemitem>arith1</systemitem>
Content Dictionary for basic arithmetic functions.</para>


<section id="sec_dtd_cd">
<title>The DTD Specification of  Content Dictionaries</title>


<figure id="fig_cd-dtd">
    <title>DTD Specification of  Content Dictionaries</title>
<literallayout><![CDATA[
<!-- DTD for OpenMath object -sb-29.10.1998 -->

<!--  general list of embeddable elements
      : excludes OMATP as this is only embeddable in OMATTR
      : excludes OMBVAR as this is only embeddable in OMBIND
-->
<!ENTITY % omel "OMS | OMV | OMI | OMB | OMSTR
                     | OMF | OMA | OMBIND | OME  | OMATTR  ">

<!-- things which can be variables -->
<!ENTITY % omvar        "OMV | OMATTR" >

<!-- symbol -->
<!ELEMENT OMS EMPTY>
<!ATTLIST OMS name CDATA #REQUIRED  cd CDATA #REQUIRED >

<!-- variable -->
<!ELEMENT OMV EMPTY>
<!ATTLIST OMV name CDATA #REQUIRED >

<!-- integer -->
<!ELEMENT OMI (#PCDATA) >

<!-- byte array -->
<!ELEMENT OMB (#PCDATA) >

<!-- string -->
<!ELEMENT OMSTR (#PCDATA) >

<!-- floating point -->
<!ELEMENT OMF EMPTY>
<!ATTLIST OMF dec CDATA #IMPLIED  hex CDATA #IMPLIED>

<!-- apply constructor -->
<!ELEMENT OMA (%omel;)+ >

<!-- binding constructor & variable -->
<!ELEMENT OMBIND ((%omel;), OMBVAR, (%omel;)) >
<!ELEMENT OMBVAR (%omvar;)+ >

<!-- error -->
<!ELEMENT OME    (OMS, (%omel;)* ) >

<!-- attribution constructor & attribute pair constructor -->
<!ELEMENT OMATTR (OMATP, (%omel;)) >
<!ELEMENT OMATP (OMS, (%omel;))+ >

<!-- OM object constructor -->
<!ELEMENT OMOBJ (%omel;) >

<!-- end of DTD for OM object -->

]]>
</literallayout>


</figure>


<para>The XML DTD for Content Dictionaries is given in
<xref linkend="fig_cd-dtd"/>. The allowed elements are further
described in the following section.</para>
</section>

<section id="sect_pcdata">
<title>Further Requirements of an &OM; Content Dictionary</title>


<para>The notion of being a valid Content Dictionary is stronger than merely
validating successfully against the DTD. This is because the content of
the elements, referred to in <xref linkend="fig_cd-dtd"/> as PCDATA and
CDATA, must actually make sense to, say, a Phrasebook designer. In
this section we define exactly the format of the elements used in
Content Dictionaries.</para>


<sidebar revision="1999/06/20" author="OC">
<para>now we have this numbering mechanism, should it be documented?</para>
</sidebar>
<variablelist>
<varlistentry><term><systemitem>CDName</systemitem></term><listitem><para>The text occurring in the <systemitem>CDName</systemitem> element
  corresponds to the name of the Content Dictionary, and is of the form
  specified in <xref linkend="cha_enco"/>.</para>

</listitem>
</varlistentry>
<varlistentry>
<term><systemitem>Description</systemitem></term>
<listitem><para>The text occurring in the <systemitem>Description</systemitem>
  element is used to give a description of the enclosing element, which
  could be a symbol or the entire Content Dictionary. The content of
  this element can be any XML text.</para>
  
</listitem>
</varlistentry>
<varlistentry><term><systemitem>CDReviewDate</systemitem></term><listitem><para>The text occurring in the <systemitem>CDReviewDate</systemitem>
  element corresponds to the earliest possible revision date of the
  Content Dictionary.  The date formats should be ISO-compliant in the
  form YYYY-MM-DD, e.g. 1953-09-26.</para>
  
</listitem>
</varlistentry>
<varlistentry><term><systemitem>CDDate</systemitem></term><listitem><para>The text occurring in the <systemitem>CDDate</systemitem> element
  corresponds to the date of this version of the Content Dictionary.
  The date formats should be ISO-compliant in the form YYYY-MM-DD,
  e.g. 1953-09-26.</para>
  
</listitem>
</varlistentry>
<varlistentry><term><systemitem>CDVersion</systemitem></term><listitem>
   <sidebar revision="1999/06/23" author="DPC"><para>new paragraph</para></sidebar>
   <sidebar revision="1999/11/24" author="DPC"><para>Now just an integer</para></sidebar>
   <para>The text occurring in the <systemitem>CDVersion</systemitem> element corresponds to
   the version number of the current version of a Content Dictionary.
   It should be a non-negative integer.</para>

<para>In CDs that do not have status <emphasis>experimental</emphasis>, CD version
  numbering should adhere to the following. The version number should
  be a positive integer.</para>

<para>No changes can be
  introduced that invalidate objects built with previous versions.
  Any change that influences phrasebook compliance, like adding a new
  symbol to a Content Dictionary, is considered a major change
  and should be reflected by an increase in this version number. Other
  changes, like adding an example or correcting a description, are
  considered minor changes. For minor changes the version number is not
  changed, but the revision number should be increased, as
  described below. A change such as removing a symbol should
  not be made; instead, a new CD with a different name should be
  produced, so as not to invalidate existing objects.</para>

<para>As detailed in chapter&#160;<xref linkend="cha_comp"/>, &OM; compliant applications
  state which versions of which CDs they support.

  <emphasis>Experimental</emphasis> CDs may be expected to undergo changes, such as
  the addition or removal of symbols, as they are developed, without requiring the name
  of the CD to be changed.</para>

  
</listitem>
</varlistentry>
<varlistentry><term><systemitem>CDRevision</systemitem></term><listitem>
   <sidebar revision="1999/11/24" author="DPC"><para>New field, formerly `.y' of version number</para></sidebar>
   <para>The text occurring in the <systemitem>CDRevision</systemitem> element corresponds to
   the revision, or `minor version number' of the current version of a
   Content Dictionary.  It should be a non-negative integer.</para>

<para>Minor changes to a CD that do not warrant the release of a CD with
   an increased version number should be marked by increasing the
   revision number specified in this field. When the CD version number
   is increased, the revision number is normally reset to zero.</para>
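
<para>As an illustration of the numbering rules above, the following header
fragments sketch how the two fields might evolve over the life of a Content
Dictionary. The particular numbers are hypothetical and are chosen only for
the example.</para>
<literallayout><![CDATA[
<!-- a released CD at version 2, revision 3 -->
<CDVersion> 2 </CDVersion>
<CDRevision> 3 </CDRevision>

<!-- after a minor change (e.g. a corrected description):
     the version is unchanged, the revision is increased -->
<CDVersion> 2 </CDVersion>
<CDRevision> 4 </CDRevision>

<!-- after a major change (e.g. a new symbol):
     the version is increased, the revision is normally reset -->
<CDVersion> 3 </CDVersion>
<CDRevision> 0 </CDRevision>
]]></literallayout>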

  
</listitem>
</varlistentry>
<varlistentry><term><systemitem>CDStatus</systemitem></term><listitem><para>The text occurring in the <systemitem>CDStatus</systemitem> element
  corresponds to the status of the Content Dictionary, and can be either
  <systemitem>official</systemitem> (approved by the &OM; Society according to the
  procedure outlined in <xref linkend="cdapprove"/>), <systemitem>experimental</systemitem>
  (currently being tested), <systemitem>private</systemitem> (used by a private group of
  &OM; users) or <systemitem>obsolete</systemitem> (an obsolete Content Dictionary kept
  only for archival purposes).</para>
  
</listitem>
</varlistentry>
<varlistentry><term><systemitem>CDURL</systemitem></term><listitem><para>The text occurring in the <systemitem>CDURL</systemitem> element
  should be a valid URL where the source file for the Content
  Dictionary encoding can be found (if it exists). The filename should
  conform to ISO 9660&#160;<citation>ISO9660</citation>.</para>

</listitem>
</varlistentry>
<varlistentry><term><systemitem>CDUses</systemitem></term><listitem>
   <sidebar revision="1999/06/23" author="DPC"><para>new wording</para></sidebar>
   <para>The content of this element should be a series of <systemitem>CDName</systemitem>
   elements, each naming a Content Dictionary used in the
   <systemitem>Example</systemitem> and <systemitem>FMP</systemitem>s of the current Content
   Dictionary. </para>
   
 </listitem>
</varlistentry>
<varlistentry><term><systemitem>CDComment</systemitem></term><listitem><para>The content of this element should be text
   that does not convey any crucial information concerning the current
   Content Dictionary. It can be used in the Content Dictionary header
   to report the author of the Content Dictionary and to log change
   information. In the body of the Content Dictionary, it can be used
   to attach extra remarks to certain symbols.</para>
   <sidebar revision="1999/10/01" author="OC"><para>Due to lack of inspiration, I added only these few lines</para></sidebar>
   
 </listitem>
</varlistentry>
<varlistentry><term><systemitem>Example</systemitem></term>
<listitem>
<sidebar revision="1999/06/23" author="OC"><para>new description</para></sidebar>
<para>The text occurring in the <systemitem>Example</systemitem> element is used to give
  examples of the enclosing symbol, and can be any XML text. In
  addition to text the element may contain examples as &exml; encoded
  &OM;, inside <systemitem>OMOBJ</systemitem> elements.  Note that <systemitem>Examples</systemitem> must
  be with respect to some symbol and cannot be <quote>loose</quote> in the
  Content Dictionary.</para>
  
</listitem>
</varlistentry>
<varlistentry><term><systemitem>Name</systemitem></term><listitem><para>The text occurring in the <systemitem>Name</systemitem> element
  corresponds to the name of the symbol, and is specified as in
  <xref linkend="cha_enco"/>.</para>  
  
</listitem>
</varlistentry>
<varlistentry><term><systemitem>CMP</systemitem></term>
<listitem><para>The text occurring in the <systemitem>CMP</systemitem> element
  corresponds to a property of the symbol. An application which says
  it understands a Content Dictionary symbol need not understand a
  commented property of the symbol.</para>
  
</listitem>
</varlistentry>
<varlistentry><term><systemitem>FMP</systemitem></term><listitem><para>The content of the <systemitem>FMP</systemitem> element also corresponds
  to a property<footnote id="ftn_theory"><para>It corresponds to a theorem of a theory in
    some formal system.</para></footnote> of the symbol; however, the content of this
  element must be a valid &OM; object in the XML encoding.  An
  application which says it understands a Content Dictionary symbol
  need not understand a formal property of the symbol.</para>
</listitem>
</varlistentry>
</variablelist>
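
<para>To make the requirements above concrete, the following is a minimal
sketch of a Content Dictionary that uses each of the elements described in
this section. It is illustrative only: the name <systemitem>example1</systemitem>,
the URL, the dates and the symbol are hypothetical, and the
<systemitem>CDDefinition</systemitem> wrapper around the symbol, the order of the
header elements and the use of <systemitem>eq</systemitem> from a
<systemitem>relation1</systemitem> Content Dictionary are assumptions that should be
checked against the DTD.</para>
<literallayout><![CDATA[
<CD>
<CDName> example1 </CDName>
<Description> A hypothetical Content Dictionary with one symbol. </Description>
<CDReviewDate> 2001-06-30 </CDReviewDate>
<CDDate> 2000-06-30 </CDDate>
<CDVersion> 0 </CDVersion>
<CDRevision> 1 </CDRevision>
<CDStatus> experimental </CDStatus>
<CDURL> http://www.example.org/cd/example1.ocd </CDURL>
<!-- relation1 is listed because the FMP below uses one of its symbols -->
<CDUses>
  <CDName> relation1 </CDName>
</CDUses>
<CDComment> Written as an illustration; not an official dictionary. </CDComment>

<CDDefinition>
<Name> plus </Name>
<Description> A binary, commutative addition. </Description>
<CMP> a + b = b + a </CMP>
<FMP>
  <OMOBJ>
    <OMA>
      <OMS cd="relation1" name="eq"/>
      <OMA>
        <OMS cd="example1" name="plus"/>
        <OMV name="a"/>
        <OMV name="b"/>
      </OMA>
      <OMA>
        <OMS cd="example1" name="plus"/>
        <OMV name="b"/>
        <OMV name="a"/>
      </OMA>
    </OMA>
  </OMOBJ>
</FMP>
<Example>
  The sum of x and 1:
  <OMOBJ>
    <OMA>
      <OMS cd="example1" name="plus"/>
      <OMV name="x"/>
      <OMI> 1 </OMI>
    </OMA>
  </OMOBJ>
</Example>
</CDDefinition>
</CD>
]]></literallayout>
<para>Note that the <systemitem>Example</systemitem> is attached to a symbol rather than
placed loose in the Content Dictionary, and that the status is
<emphasis>experimental</emphasis>, which is why a version number of zero is used
in this sketch.</para>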
</section>
</section>


<section id="addfiles">
<title>Additional Information</title>


<sidebar revision="1999/08/25" author="OC"><para>Introduction to splitting-up in files</para></sidebar>
<sidebar revision="1999/10/04" author="DPC"><para>Rephrase slightly</para></sidebar>
<para>Content Dictionaries contain just one part of the information that can
be associated with a symbol in order to define, step by step, its meaning and
its functionality. &OM; Signature files, CDGroups, and possibly
files of extra mathematical properties, are used to convey the different
aspects that as a whole make up a mathematical definition.</para>

<section id="sigfiles">
<title>Signature Files</title>


<sidebar revision="1999/08/25" author="OC"><para>Introduced Signature Files. Early drafts of the
  &OM; standard specified that Content Dictionaries had a Signature
  element in which the <emphasis>signature</emphasis> of the symbol was defined. The
  disadvantage of this approach is that the signature would need to
  reference a specific type system. Signature Files allow for more
  generality.</para></sidebar>

<para>&OM; may be used with any type system. One just needs to produce a
Content Dictionary which gives the constructors of the type system,
and then one may build &OM; objects representing types in the given
type system. These are typically associated with &OM; objects via the
&OM; <varname>attribution</varname> constructor.</para>
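
<para>As a sketch of this mechanism, the object below attaches a type to the
variable <systemitem>x</systemitem> by attribution. The Content Dictionary
<systemitem>mytypes</systemitem> and its symbols <systemitem>type</systemitem> and
<systemitem>integer</systemitem> are hypothetical; they stand for whatever Content
Dictionary supplies the constructors of the chosen type system.</para>
<literallayout><![CDATA[
<OMOBJ>
  <OMATTR>
    <OMATP>
      <!-- attribute key: a (hypothetical) symbol meaning "has type" -->
      <OMS cd="mytypes" name="type"/>
      <!-- attribute value: an OpenMath object representing the type -->
      <OMS cd="mytypes" name="integer"/>
    </OMATP>
    <!-- the object being attributed -->
    <OMV name="x"/>
  </OMATTR>
</OMOBJ>
]]></literallayout>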

<para>A Small Type System, called STS, has been designed to give semi-formal
signatures to &OM; symbols and is documented in&#160;<citation>OM_D132c</citation>.  The
signature file given in <xref linkend="arith1.sts"/> is based on this
formalism. Using the same mechanism, <citation>OMD132b</citation> shows
how pure type systems can also be employed to assign types to &OM; 
symbols.</para>


<section id="sec_dtd_sig">
<title>The  DTD Specification of Signature Files</title>

<para>Signature Files are &exml; documents, hence  a valid Signature File
 should 
<itemizedlist>
<listitem><para>be valid according to the <acronym>dtd</acronym> given in
  <xref linkend="fig_omcdsig.dtd"/>,</para>
</listitem>
<listitem><para>adhere to the extra conditions on the content of the elements
  given in <xref linkend="sect_sigpcdata"/>.</para>
</listitem>
</itemizedlist></para>


<para>Signature files have a header which specifies the Content Dictionary
(or CDGroup) that determines the type system being used, and the Content
Dictionary which contains the symbols for which the signatures are
being given. Each signature takes the form of an &exml; encoded &OM;
object.</para>


<figure id="fig_omcdsig.dtd">
    <title>DTD Specification of Signature Files</title>
<literallayout><![CDATA[
<!-- omcdsig.dtd -->
<!-- ********************************************* -->
<!--                                               -->
<!-- DTD for OpenMath CD Signatures                -->
<!-- (c) EP24969 the ESPRIT OpenMath Consortium    -->
<!-- David Carlisle 1999-04-13                     -->
<!-- David Carlisle 1999-05-21                     -->
<!-- David Carlisle 1999-06-22                     -->
<!--                                               -->
<!--                                               -->
<!-- ********************************************* -->

<!-- include dtd for OM objects -->
<!ENTITY  % omobjectdtd SYSTEM "omobj.dtd" >
%omobjectdtd;

<!ELEMENT CDSComment      (#PCDATA) >
<!ELEMENT CDSReviewDate    (#PCDATA) >
<!ELEMENT CDSStatus    (#PCDATA) >

<!ELEMENT CDSignatures   (CDComment |CDSComment | CDSReviewDate |
                         CDSStatus | Signature )* >

<!ATTLIST CDSignatures cd CDATA #REQUIRED
                       type CDATA #REQUIRED >

<!ELEMENT Signature      (OMOBJ?) >

<!ATTLIST  Signature  name CDATA #REQUIRED >

<!-- end of DTD for OM CD Signatures -->]]>
</literallayout>
</figure>
</section>

<section id="sect_sigpcdata">
<title>Further Requirements of a Signature File</title>

<sidebar revision="1999/08/26" author="OC"><para>Added PCDATA for Additional Files</para></sidebar>

<para>The notion of being a valid Signature File is stronger than merely
validating successfully against the <acronym>dtd</acronym> in <xref linkend="fig_omcdsig.dtd"/>.
In this section we define exactly the format of the elements used in
Signature Files. Several of the requirements are the same as those on
elements of Content Dictionaries.</para>


<variablelist>
<varlistentry><term><systemitem>CDSignatures</systemitem></term><listitem><para>The outermost element of the Signature File
  is characterized by two required attributes that identify the type
  system and the Content Dictionary whose signatures are defined. The
  value of the &exml; attribute <systemitem>type</systemitem> is the name of the Content
  Dictionary or of the CDGroup (cf. <xref linkend="ssec_cdgroups"/>) that
  represents the type system. The value of the XML attribute
  <systemitem>cd</systemitem> is the name of the Content Dictionary whose symbols are
  assigned signatures in this Signature File. Both values are of the
  form specified in <xref linkend="cha_enco"/>.</para>
  
</listitem>
</varlistentry>
<varlistentry><term><systemitem>CDSComment</systemitem></term><listitem><para>See <systemitem>CDComment</systemitem> in
  <xref linkend="sect_pcdata"/>.</para>

</listitem>
</varlistentry>
<varlistentry><term><systemitem>CDSReviewDate</systemitem></term><listitem><para>The text occurring in the <systemitem>CDSReviewDate</systemitem> element corresponds to the earliest possible
  revision date of the Signature File.  The date formats should be
  ISO-compliant in the form YYYY-MM-DD, e.g. 2000-02-29.</para>
  
</listitem>
</varlistentry>
<varlistentry><term><systemitem>CDSStatus</systemitem></term><listitem><para>The text occurring in the <systemitem>CDSStatus</systemitem>
  element corresponds to the status of the Signature File, and can be
  either <systemitem>official</systemitem> (approved by the &OM; Society according to the
  procedure outlined in <xref linkend="cdapprove"/>), <systemitem>experimental</systemitem>
  (currently being tested), <systemitem>private</systemitem> (used by a private group of
  &OM; users) or <systemitem>obsolete</systemitem> (an obsolete Signature File kept only
  for archival purposes).</para>
  
</listitem>
</varlistentry>
<varlistentry><term><systemitem>Signature</systemitem></term>
<listitem>
<sidebar revision="1999/08/01" author="OC">
<para>This notion might be too strict, it also need CDUses
possibly</para>
</sidebar>
<para>The content of the <systemitem>Signature</systemitem> element
  has to be a valid &OM; object in &exml; encoding as specified in
  <xref linkend="cha_enco"/>. Additionally, the object must represent a
  valid type in the type system identified by the
  XML attribute <systemitem>type</systemitem> of the <systemitem>CDSignatures</systemitem> element. See
  <xref linkend="sect_sigex"/> for examples.</para>

</listitem>
</varlistentry>
</variablelist>
</section>


<section id="sect_sigex">
<title>Examples</title>
<sidebar revision="1999/08/01" author="OC">
<para>arith1.sts is not valid wrt DTD</para>
</sidebar>
<para>An example of a signature file for the type system STS and the
<systemitem>arith1</systemitem> Content Dictionary is given in
<xref linkend="arith1.sts"/>. Each
signature entry is similar to the
following one for the &OM; symbol 
<systemitem>&lt;OMS cd="arith1" name="plus"/></systemitem>:
<literallayout><![CDATA[
<Signature name="plus">
<OMOBJ>
 <OMA>
  <OMS name="mapsto" cd="sts"/>
  <OMA>
   <OMS name="nassoc" cd="sts"/> 
   <OMV name="AbelianSemiGroup"/>
  </OMA>
  <OMV name="AbelianSemiGroup"/>
 </OMA>
</OMOBJ>
</Signature>
]]></literallayout>
</para>
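
<para>For orientation, the sketch below places such an entry inside a complete,
if minimal, Signature File, following the DTD in
<xref linkend="fig_omcdsig.dtd"/>. The comment text, review date and status
are hypothetical values chosen for the illustration, and the file is
abbreviated to a single signature.</para>
<literallayout><![CDATA[
<!-- type names the type system, cd names the CD whose symbols are typed -->
<CDSignatures type="sts" cd="arith1">
<CDSComment> Abbreviated, illustrative signature file </CDSComment>
<CDSReviewDate> 2001-06-30 </CDSReviewDate>
<CDSStatus> experimental </CDSStatus>

<Signature name="plus">
<OMOBJ>
 <OMA>
  <OMS name="mapsto" cd="sts"/>
  <OMA>
   <OMS name="nassoc" cd="sts"/>
   <OMV name="AbelianSemiGroup"/>
  </OMA>
  <OMV name="AbelianSemiGroup"/>
 </OMA>
</OMOBJ>
</Signature>

</CDSignatures>
]]></literallayout>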
</section>
</section>

<section id="ssec_cdgroups">
<title>CDGroups</title>



<sidebar revision="1999/06/20" author="OC"><para>All new, partly taken from SB paper</para></sidebar>
<sidebar revision="1999/10/04" author="DPC"><para>Rephrase slightly</para></sidebar>
<para>The CD Group mechanism is a convenience mechanism for identifying
collections of CDs.  A CD Group file is an &exml; document used in the
(static or dynamic) negotiation phase where communicating applications
declare and agree on the Content Dictionaries which they process.  It
is a complement, or an alternative, to the individual declaration of
Content Dictionaries understood by an application.  Note that
CD Groups do <emphasis>not</emphasis> affect the &OM; objects themselves.
Symbols in an object always refer to content dictionaries, not groups.</para>

<sidebar revision="1999/06/20" author="OC"><para>Does this go to compliancy?</para></sidebar>
<para>For an application to declare that it <quote>understands CDGroup G</quote> is
exactly equivalent to, and interchangeable with, the declaration that it 
<quote>understands Content Dictionaries <math><msub><mi>x</mi><mn>1</mn></msub></math>, <math><msub><mi>x</mi><mn>2</mn></msub></math>, &#8230; <math><msub><mi>x</mi><mi>n</mi></msub></math></quote>, where
<math><msub><mi>x</mi><mn>1</mn></msub></math>, &#8230; <math><msub><mi>x</mi><mi>n</mi></msub></math> are the members of CDGroup G.</para>


<section id="sec_dtd_cdg">
<title>The DTD Specification of CDGroups</title>


<para>CDGroups are XML documents, hence  a valid  CDGroup
 should 
<itemizedlist>
<listitem><para>be valid according to the DTD given in
  <xref linkend="fig_cdgroup.dtd"/>,</para>
</listitem>
<listitem><para>adhere to the extra conditions on the content of the elements
  given in <xref linkend="sect_cdgpcdata"/>.</para>
</listitem>
</itemizedlist>
</para>

<para>Apart from some header information such as <systemitem>CDGroupName</systemitem> and
<systemitem>CDGroupVersion</systemitem>, a CDGroup is simply an unordered list of
CDs, identified by name and optionally version number and URL.</para>


<figure id="fig_cdgroup.dtd">
    <title>DTD Specification of CDGroups</title>
<literallayout><![CDATA[
<!-- CDgroup.dtd -->
<!-- ********************************************* -->
<!--                                               -->
<!-- DTD for OpenMath CD group                     -->
<!-- (c) EP24969 the ESPRIT OpenMath Consortium    -->
<!-- date = 18.Feb.1999                            -->
<!-- author = s.buswell sb@stilo.demon.co.uk       -->
<!--                                               -->
<!--                                               -->
<!-- available at                                  -->
<!-- http://www.nag.co.uk/~something here David~  -->
<!--                                               -->
<!-- ********************************************* -->

<!-- info on the CD group itself -->

<!ELEMENT CDGroupName      (#PCDATA) >
<!ELEMENT CDGroupVersion     (#PCDATA) >
<!ELEMENT CDGroupRevision     (#PCDATA) >
<!ELEMENT CDGroupURL          (#PCDATA) >
<!ELEMENT CDGroupDescription  (#PCDATA) >

<!-- info on the CDs in the group  -->

<!ELEMENT CDComment     (#PCDATA) >
<!ELEMENT CDGroupMember (CDComment?,CDName, CDVersion?, CDURL?) >
<!ELEMENT CDName     (#PCDATA) >
<!ELEMENT CDVersion     (#PCDATA) >
<!ELEMENT CDURL         (#PCDATA) >

<!-- structure of the group -->
<!ELEMENT CDGroup 
   (CDGroupName, CDGroupVersion, CDGroupRevision?,
    CDGroupURL, CDGroupDescription,
     (CDGroupMember  | CDComment )* ) >

<!-- end of DTD for OM CDGroup -->]]>
</literallayout>

</figure>
</section>

<section id="sect_cdgpcdata">
<title>Further Requirements of a CDGroup</title>


<sidebar revision="1999/08/26" author="OC"><para>Added PCDATA for CDGroup</para></sidebar>

<para>The notion of being a valid CDGroup implies that the following
requirements on the content of the elements described by the DTD in
<xref linkend="fig_cdgroup.dtd"/> are also met.</para>


<variablelist>
<varlistentry><term><systemitem>CDGroup</systemitem></term>
<listitem><para>The XML element <systemitem>CDGroup</systemitem> is the outermost
  element in a CDGroup document.</para>
  
</listitem>
</varlistentry>
<varlistentry><term><systemitem>CDGroupName</systemitem></term>
<listitem>
<sidebar revision="1999/08/01" author="OC">
 <para>For consistency, CDGName would be better</para>
</sidebar>
<para>The text occurring in the <systemitem>CDGroupName</systemitem>
  element corresponds to the name of the CDGroup. For the syntactical
  requirements, see <systemitem>CDName</systemitem> in <xref linkend="sect_pcdata"/>.</para>
  
</listitem>
</varlistentry>
<varlistentry><term><systemitem>CDGroupURL</systemitem></term>
<listitem><para>The text occurring in the <systemitem>CDGroupURL</systemitem>
  element identifies the location of the CDGroup file, not necessarily
  of the member Content Dictionaries. For the syntactical
  requirements, see <systemitem>CDURL</systemitem> in <xref linkend="sect_pcdata"/>.</para>
  
</listitem>
</varlistentry>
<varlistentry><term><systemitem>CDGroupDescription</systemitem></term>
<listitem><para>The text occurring in the <systemitem>CDGroupDescription</systemitem> element describes the mathematical area of the
  CDGroup.</para>
  
</listitem>
</varlistentry>
<varlistentry><term><systemitem>CDGroupMember</systemitem></term>
<listitem><para>The XML element <systemitem>CDGroupMember</systemitem>
  encloses the data identifying each member of the CDGroup.</para>
  
</listitem>
</varlistentry>
<varlistentry><term><systemitem>CDName</systemitem></term><listitem><para>The text occurring in the <systemitem>CDName</systemitem> element
  corresponds to the name of a Content Dictionary in the CDGroup. For
  the syntactical requirements, see <systemitem>CDName</systemitem> in
  <xref linkend="sect_pcdata"/>.</para>
  
</listitem>
</varlistentry>
<varlistentry><term><systemitem>CDVersion</systemitem></term><listitem><para>The text occurring in the <systemitem>CDVersion</systemitem>
  element identifies which version of the Content Dictionary is to be
  taken as member of the CDGroup. This element is optional. In case it
  is missing, the latest version is the one included in the CDGroup.
  For the syntactical requirements, see <systemitem>CDVersion</systemitem> in
  <xref linkend="sect_pcdata"/>.</para>
  
</listitem>
</varlistentry>
<varlistentry><term><systemitem>CDURL</systemitem></term>
<listitem>
<sidebar revision="1999/08/01" author="OC">
<para>Or the official CD repository?</para>
</sidebar>
<para>The text occurring in the <systemitem>CDURL</systemitem> element
  identifies the location of the Content Dictionary to be taken as
  member of the CDGroup. This element is optional. In case it is
  missing, the location of the CDGroup identified by the element
  <systemitem>CDGroupURL</systemitem> is assumed.
  For the syntactical requirements, see <systemitem>CDURL</systemitem>
  in <xref linkend="sect_pcdata"/>.</para>
  
</listitem>
</varlistentry>
<varlistentry><term><systemitem>CDComment</systemitem></term><listitem><para>See <systemitem>CDComment</systemitem> in
  <xref linkend="sect_pcdata"/>.</para>



</listitem>
</varlistentry>
</variablelist>
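
<para>The following sketch shows what a small CDGroup document satisfying these
requirements might look like. The group name, the URLs, the description and
the second member (<systemitem>example1</systemitem>) are hypothetical; only the
element names and their order are taken from the DTD in
<xref linkend="fig_cdgroup.dtd"/>.</para>
<literallayout><![CDATA[
<CDGroup>
<CDGroupName> example_group </CDGroupName>
<CDGroupVersion> 1 </CDGroupVersion>
<CDGroupRevision> 0 </CDGroupRevision>
<CDGroupURL> http://www.example.org/cdgroups/example_group.cdg </CDGroupURL>
<CDGroupDescription> Basic arithmetic, for illustration only </CDGroupDescription>

<!-- a member with an explicit version; no CDURL is given, so the
     location of the CDGroup is assumed -->
<CDGroupMember>
  <CDName> arith1 </CDName>
  <CDVersion> 1 </CDVersion>
</CDGroupMember>

<!-- a member without CDVersion: the latest version is included -->
<CDGroupMember>
  <CDComment> A hypothetical Content Dictionary held elsewhere </CDComment>
  <CDName> example1 </CDName>
  <CDURL> http://www.example.org/cd/example1.ocd </CDURL>
</CDGroupMember>
</CDGroup>
]]></literallayout>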




<sidebar revision="1999/10/04" author="DPC">
<para>Delete subsec: Note on Symbols, CDs and CDGroups</para>
</sidebar>
<sidebar revision="2000/04/10" author="DPC">
<para>Delete examples (MathML CDGroup is in appendix, core CDGroup no longer exists)</para>
</sidebar>
<sidebar revision="1999/08/25" author="OC">
<para>This section to be added</para>
</sidebar>
<sidebar revision="1999/10/04" author="DPC">
<para>Delete subsec: DefMP Files and XSL</para>
</sidebar>
</section>
</section>
</section>

<section id="cdapprove">
<title>Content Dictionaries Reviewing Process</title>


<sidebar revision="1999/10/04" author="DPC"><para>Rephrase slightly</para></sidebar>
<para>The &OM; Society is responsible  for implementing a
review and referee process to assess the accuracy of the mathematical
content of Content Dictionaries.  The status (see <systemitem>CDStatus</systemitem>)
and/or the version number (see <systemitem>CDVersion</systemitem>) of a Content
Dictionary may change as a result of this review process.</para>
</section>
</chapter>





<chapter id="cha_comp">
<title>&OM; Compliance</title>

<sidebar revision="1999/11/24" author="DPC/OC">
<para>New chapter, after discussions at Esprit OpenMath meeting in Bath</para>
</sidebar>
<para>Applications that meet the requirements specified in this chapter may
label themselves as <emphasis>OpenMath compliant</emphasis>. &OM; compliance is
defined so as to maximize the potential for interoperability amongst
&OM; applications.</para>

<section id="sec_compl_encoding">
<title>Encoding</title>
<para>This standard defines two reference encodings for &OM;, the binary
encoding and XML encoding,  defined in chapter&#160;<xref linkend="cha_enco"/>.</para>

<para>As a minimum, an &OM; compliant application, which accepts or generates
&OM; objects, <emphasis>must</emphasis> be capable of doing so using  the XML encoding.
The ability to use other encodings is optional.</para>
</section>

<section id="sec_compl_cd">
<title>Content Dictionaries</title>

<para>An &OM; compliant application  <emphasis>must</emphasis> be able to support the error
Content Dictionary defined in <xref linkend="errorcd"/>.</para>

<para>A compliant application must declare the names and version numbers of
the Content Dictionaries that it supports. Equivalently it may declare
the Content Dictionary Group (or groups) and major version number (not
revision number), rather than listing individual Content Dictionaries.
Applications that support all Content Dictionaries (e.g. renderers)
should refer to the implicit CD Group <systemitem>all</systemitem>.</para>

<para>If a compliant application supports a Content Dictionary then it must
explicitly declare any symbols in the Content Dictionaries that are not
supported. Phrasebooks are encouraged to support every symbol in the 
Content Dictionaries.</para>

<para>Symbols which are not listed as unsupported are <emphasis>supported</emphasis> by
the application. The meaning of <emphasis>supported</emphasis> will depend on the
application domain. For example an &OM; renderer should provide a
default display for any &OM; object that only references supported
symbols, whereas a Computer Algebra System will be expected to map
such an object to a suitable internal representation, in this system,
of this mathematical object. It is expected that the application's
<emphasis>phrasebooks</emphasis> for supported Content Dictionaries will be
constructed such that properties of the symbol expressed in the Content
Dictionary are respected as far as possible for the given application
domain. However &OM; compliance does <emphasis>not</emphasis> imply
any guarantee by the &OM; Society on the accuracy of these representations.</para>


<para>Content Dictionaries available from the official &OM; repository at
www.openmath.org need only be referenced by name; other Content
Dictionaries <emphasis>should</emphasis> be referenced by the URL declared in the
<systemitem>CDURL</systemitem> field of the Dictionary. This URL may be used to
retrieve the Content Dictionary.</para>

<para>When receiving an unsupported &OM; symbol, e.g. <math><mi>s</mi></math>, from a
 supported Content Dictionary, a compliant
application will act as if it had received the &OM; object
<math display="block"><mi mathvariant="bold">error</mi><mo>(</mo><mi>Unhandled_Symbol</mi><mo>,</mo><mi>s</mi><mo>)</mo></math>
where <systemitem>Unhandled_Symbol</systemitem> is the symbol from the error Content
Dictionary.</para>


<para>Similarly if it receives a symbol, e.g. <math><mi>s</mi></math>, from an unsupported Content
Dictionary,
it  will act as if it had received the &OM; object
<math display="block"><mi mathvariant="bold">error</mi><mo>(</mo><mi>Unsupported_CD</mi><mo>,</mo><mi>s</mi><mo>)</mo></math></para>

<para>Finally, if the compliant application receives a symbol, e.g. <math><mi>s</mi></math>, from a supported
Content Dictionary but with an unknown name, then this must either be
an incorrect object, or possibly the object has been built using a
later version of the Content Dictionary. In either case, the
application will act as if it had received the &OM; object
<math display="block"><mi mathvariant="bold">error</mi><mo>(</mo><mi>Unexpected_Symbol</mi><mo>,</mo><mi>s</mi><mo>)</mo></math></para>
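<para>As an illustration only, the three error objects described in this section
might be written as follows in the <acronym>xml</acronym> encoding described
earlier in this standard. This is a sketch under stated assumptions: the error
Content Dictionary is assumed to be named <filename>error</filename>, and the
Content Dictionary names <filename>arith1</filename> and
<filename>somecd</filename>, together with the symbol names
<systemitem>plus</systemitem>, <systemitem>somesymbol</systemitem> and
<systemitem>unknown_name</systemitem>, are hypothetical placeholders standing in
for the offending symbol <math><mi>s</mi></math>.</para>

<literallayout><![CDATA[
<!-- Case 1: unsupported symbol from a supported Content Dictionary.
     cd="arith1" and name="plus" are hypothetical placeholders. -->
<OMOBJ>
  <OME>
    <OMS cd="error" name="Unhandled_Symbol"/>
    <OMS cd="arith1" name="plus"/>
  </OME>
</OMOBJ>

<!-- Case 2: symbol from an unsupported Content Dictionary.
     cd="somecd" and name="somesymbol" are hypothetical placeholders. -->
<OMOBJ>
  <OME>
    <OMS cd="error" name="Unsupported_CD"/>
    <OMS cd="somecd" name="somesymbol"/>
  </OME>
</OMOBJ>

<!-- Case 3: unknown symbol name in a supported Content Dictionary.
     name="unknown_name" is a hypothetical placeholder. -->
<OMOBJ>
  <OME>
    <OMS cd="error" name="Unexpected_Symbol"/>
    <OMS cd="arith1" name="unknown_name"/>
  </OME>
</OMOBJ>
]]></literallayout>

<para>In each case the first child of the error element is the symbol from the
error Content Dictionary and the second child is the symbol that triggered
it, so a receiving application can report which symbol was rejected and why.</para>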
</section>

<section id="sec_comp_lex">
<title>Lexical Errors</title>

<para>The previous section defines the behaviour of a compliant application
upon receiving well-formed &OM; objects containing unexpected symbols.
This standard does not specify any behaviour for an application upon
receiving ill-formed objects.</para>
</section>
</chapter>

<chapter id="cha_conc">
<title>Conclusion</title>


<para>The goal of this document is to define the &OM; standard. The matters
addressed by the &OM; standard are:
<itemizedlist>
<listitem><para>Informal and formal definition of the &OM; objects.</para>
  
</listitem>
<listitem><para>Informal and formal definition of the notion of Content
  Dictionaries.</para>
</listitem>
</itemizedlist>
To do this, &OM; objects are precisely defined and two encodings are
described to represent these objects using <acronym>xml</acronym> and binary
code. Furthermore, the Document Type Definition for validating Content
Dictionaries and &OM; objects is given.</para>


</chapter>


<appendix id="app_cdfiles">
<title>CD Files</title>

<section id="app_cdcd">
<title>The <filename>meta</filename> Content Dictionary</title>

<literallayout><![CDATA[
<CD>

<CDName> meta </CDName>

<Description> 

This is a content dictionary to represent content dictionaries, so
that they may be passed between OpenMath compliant applications in a
similar way to mathematical objects.  It is acknowledged that this is
not the only way to do this, but it seems a natural way.

This can be viewed as updating the previous Meta-CD.

The information written here is taken from "The OpenMath Standard".
This document is a slightly stronger statement than "the following
symbols are defined as in the DTD at
http://www.nag.co.uk/projects/OpenMath/omstd/dtds/cd.dtd", since the
DTD often only says that the information inside the elements is PCDATA
without saying what this actually corresponds to. However this is the
only way this document is better than the DTD, and thus this is the
only extra information we give here.

Author: N. Howgrave-Graham
</Description>

<CDReviewDate> 1998-10-01 </CDReviewDate>
<CDStatus> experimental </CDSt