While searching for the best way to specify the APC AA syndication mechanism, the focus was on doing it in a way that conforms to existing standards. There is one obvious standard for content syndication, namely RSS, and it soon became clear that RSS would be the basis for the APC AA inter-node networking.
The first version of this document specified a new, specific version of RSS derived from RSS 0.91. RSS 0.91, however, does not itself conform to many standards. It was developed by Netscape Corp. on the basis of RSS 0.9, but many of the best features of RSS 0.9 were stripped away, first and foremost RDF compliance.
Version 1.0 of RSS is RDF compliant again.
RSS means RDF Site Summary.
In contrast to Netscape's RSS 0.91, RSS 1.0 is, as was RSS 0.9, an application of RDF (Resource Description Framework / http://www.w3.org/RDF/). RDF defines an abstract way to make statements about resources. An RDF resource can be almost anything that can have a Uniform Resource Identifier (URI). This does not necessarily mean that the resource must be accessible through the Internet; it only means that it must have a unique identifier. The statements define values for properties of the resources. RDF tries to cover many different situations and aspects, most of which are not relevant for our purposes. RDF also defines a way to write down the statements using XML. The XML representation of RDF data can, however, be written in more than one way for different purposes.
RSS defines a class of resources called channel. A channel is much the same as what we call a feed or slice. A channel has, according to RSS, a title, a description and so on, and it has items. The items, in turn, have a title, a description and a link. RSS also defines XML as the transport method for RSS data, and it places some constraints on RDF regarding the possible XML representations of RSS data. The RSS specification is far less verbose than RDF; the included examples and de facto usage make it much clearer.
Extensibility was one of the main goals in developing the RSS 1.0 specification ( http://purl.org/rss/1.0/). RSS 1.0 has a feature called RSS modules which allows anyone to extend RSS 1.0 with new fields for their own purposes. The modularization is described in the RSS 1.0 module specification ( http://purl.org/rss/1.0/modules/).
RSS 1.0 modules use an XML feature called namespaces ( http://www.w3.org/TR/REC-xml-names/) to create separate, non-conflicting namespaces for modules, so it is possible to "pull in" any RSS 1.0 module into an RSS 1.0 document. A parser can recognize which namespace an element belongs to and can thus ignore unknown elements.
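How a namespace-aware parser can skip elements from modules it does not know can be sketched in a few lines of Python. This is only an illustration: the sample document, the http://example.org/unknown-module/ namespace and the helper name known_children are assumptions made for this example, not part of RSS 1.0 or of this specification.

```python
import xml.etree.ElementTree as ET

RSS_NS = "http://purl.org/rss/1.0/"

# Hypothetical document: the "x" module is unknown to the reader below.
SAMPLE = """<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:x="http://example.org/unknown-module/"
  xmlns="http://purl.org/rss/1.0/">
  <item rdf:about="http://example.org/item/1">
    <title>First item</title>
    <link>http://example.org/item/1</link>
    <x:mood>curious</x:mood>
  </item>
</rdf:RDF>"""


def known_children(element, known_namespaces):
    """Return (local name, text) pairs for children in a known namespace."""
    result = []
    for child in element:
        # ElementTree spells qualified names as "{namespace-uri}local-name".
        if not child.tag.startswith("{"):
            continue
        namespace, local = child.tag[1:].split("}", 1)
        if namespace in known_namespaces:
            result.append((local, child.text))
    return result


root = ET.fromstring(SAMPLE)
item = root.find(f"{{{RSS_NS}}}item")
fields = known_children(item, {RSS_NS})
print(fields)
```

The x:mood element is silently dropped because its namespace is not in the set of known namespaces; the core RSS elements are unaffected.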
There are three standard modules which "ship in the box" with RSS 1.0:
The data model we need must be able to transport all data we want to exchange between nodes. This data describes
The term feed or slice is used in this section to make clear that the data model is not limited to transporting information about what we call a slice in the Action Applications; rather, it is prepared to carry information on a feed which, in the first implementation, will indeed come from a single slice of the Action Applications.
In this version of this specification, not all fields of any slice or feed are transported by the mechanism.
The data that describes the feed or slice we need to be able to carry is:
The data that describes the actual current items needs to include:
The RSS 1.0 standard ( http://purl.org/rss/1.0/) which we are about to extend defines a channel entity which consists of
Each item consists of three fields:
The Dublin Core module adds 15 new elements to both channels and items; these are:
The syndication module adds these three fields:
This means that we can use some of the fields of RSS 1.0 and the "out of the box" modules, but we need to extend them so we can carry all of our information. This is particularly true because we have a requirement to carry formatting information, and possibly other information, with the individual field data content. For example, the APC AA can use HTML formatted text in every field. The DC module can carry plain text only, so we have to use our own fields to carry formatted versions of these fields.
The RSS content module, http://purl.org/rss/1.0/modules/content/, supports a <content:items> element which allows transportation of the actual content using any format and encoding. This is useful for us to transport the item's text. We narrow this spec by stating which of the features in this module we want to use so the implementation does not get too complex.
This section defines the new RSS 1.0 module. It is hereby named "APC AA RSS module" or for short, "AA module".
Documents that use the AA module must declare it using the XML namespace syntax. The URI to refer to the AA module is http://www.apc.org/rss/aa-module.html.
According to this specification, documents using the AA module can contain data for two different modes:
Within an RDF document, there are several spots where a resource - such as a channel, a category, a field or a news item - must be referred to using a unique identifier (URI). These URIs do not necessarily refer to locations in the WWW. Whenever a URI is needed within this specification, the URI should consist of a valid domain name the sending host uses, followed by a unique number. This number will most likely be the AA internal id number everything has.
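As a rough illustration of this rule, a sending node could build its identifiers with a small helper like the following. The specification only asks for a domain name followed by a unique number; the http:// scheme, the "/" separator, the function name make_resource_uri and the sample values are assumptions made for this sketch.

```python
def make_resource_uri(domain, internal_id):
    """Build an identifier from the sender's domain name and an AA id.

    Assumption: "http://" prefix and "/" separator; the specification
    does not prescribe the exact layout.
    """
    if not domain or any(ch.isspace() or ch == "/" for ch in domain):
        raise ValueError(f"not a plain domain name: {domain!r}")
    return f"http://{domain}/{internal_id}"


# Hypothetical node name and id, for illustration only.
uri = make_resource_uri("node.example.org", 4711)
print(uri)
```

Because the URI is only an identifier, nothing needs to be served at that address.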
This specification specifies the following elements as the APC AA RSS module:
When using the AA module, its XML namespace declaration must be added to the rdf:RDF element. The rdf:RDF element must also carry the namespace declarations for all other modules used.
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:aa="http://www.apc.org/rss/aa-module.html"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:content="http://purl.org/rss/1.0/modules/content/"
xmlns="http://purl.org/rss/1.0/">
Exactly as shown, aside from any additional namespace declarations
(channel+, image?, item+, textinput?)
Note that this is identical to RSS 1.0 with the exception that the channel element can appear multiple times. This is allowed only in feed establishing mode, in which the document is not RSS 1.0 compatible anyway.
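A receiving node may want to check that a document actually declares the namespaces it relies on. The following is a minimal sketch of such a check in Python; the function name declared_namespaces and the choice of which namespaces count as required are assumptions for this example. Note that the check looks at declarations anywhere in the document, not only on the rdf:RDF element.

```python
import io
import xml.etree.ElementTree as ET

# Namespace URIs taken from the declaration shown above.
REQUIRED_NAMESPACES = {
    "http://www.w3.org/1999/02/22-rdf-syntax-ns#",  # RDF
    "http://purl.org/rss/1.0/",                      # RSS 1.0
    "http://www.apc.org/rss/aa-module.html",         # AA module
}

SAMPLE = """<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:aa="http://www.apc.org/rss/aa-module.html"
  xmlns:dc="http://purl.org/dc/elements/1.1/"
  xmlns:content="http://purl.org/rss/1.0/modules/content/"
  xmlns="http://purl.org/rss/1.0/">
</rdf:RDF>"""


def declared_namespaces(xml_text):
    """Map each declared prefix to its namespace URI ('' is the default)."""
    source = io.BytesIO(xml_text.encode("utf-8"))
    declared = {}
    # "start-ns" events carry (prefix, uri) for every xmlns declaration,
    # wherever in the document it occurs.
    for _event, (prefix, uri) in ET.iterparse(source, events=("start-ns",)):
        declared[prefix] = uri
    return declared


declared = declared_namespaces(SAMPLE)
missing = REQUIRED_NAMESPACES - set(declared.values())
print(sorted(missing))
```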
The obvious elements within the top level, all-enclosing <rdf:RDF> element are <channel> and <item> which are defined by RSS. The APC AA RSS module defines the following elements at that level:
The element carries a single category description. Each category used in the feed must go into its own <aa:category> element. The URI used in the element ({category_uri}) must be unique within the document. The URIs are referred to in the channel's <aa:categories> listing element ( aa:categories in channel ) and in the item's <aa:categories> listing element ( aa:categories in item ). The category element contains three sub elements which indicate the category id, the parent category id (if any) and the literal category name.
<aa:category rdf:about="{category_uri}">
<aa:id>{category_id}</aa:id>
<aa:name>{category name}</aa:name>
<aa:catparent>{parent category id}</aa:catparent>
</aa:category>
required, one for each category used in the channel
<aa:category> ('<aa:category rdf:about="' #PCDATA '">'
'<aa:id>' #PCDATA '</aa:id>'
'<aa:name>' #PCDATA '</aa:name>'
catparent_spec?
'</aa:category>')
catparent_spec ('<aa:catparent>' #PCDATA
'</aa:catparent>')
<rdf:RDF> element
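The category model above can be sketched with Python's ElementTree as follows. The helper name build_category and the sample values are hypothetical; the sub elements follow the order of the model, with <aa:catparent> emitted only when a parent exists.

```python
import xml.etree.ElementTree as ET

AA = "http://www.apc.org/rss/aa-module.html"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def build_category(uri, category_id, name, parent_id=None):
    """Create one <aa:category> element for the rdf:RDF top level."""
    category = ET.Element(f"{{{AA}}}category", {f"{{{RDF}}}about": uri})
    ET.SubElement(category, f"{{{AA}}}id").text = str(category_id)
    ET.SubElement(category, f"{{{AA}}}name").text = name
    if parent_id is not None:
        # catparent is optional in the model (catparent_spec?).
        ET.SubElement(category, f"{{{AA}}}catparent").text = str(parent_id)
    return category


# Hypothetical values, for illustration only.
top = build_category("http://node.example.org/11", 11, "News")
child = build_category("http://node.example.org/12", 12, "Local news", parent_id=11)
print(ET.tostring(child, encoding="unicode"))
```

ElementTree picks its own namespace prefixes when serializing unless they are registered; the prefixes themselves carry no meaning, only the namespace URIs do.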
The element carries a single field description. Each field used in the feed must go into its own <aa:field> element. The URI used in the element ({field_uri}) must be unique within the document. The URIs are referred to in the channel's <aa:fields> listing ( aa:fields ) as well as in the item's various data fields in <aa:field> references. The <aa:field> element contains sub elements which indicate the id and the literal name of the field.
<aa:field rdf:about="{field_uri}">
<aa:id>{field_id}</aa:id>
<aa:name>{field name}</aa:name>
</aa:field>
required, one for each field used in the channel
<aa:field> ('<aa:field rdf:about="' #PCDATA '">'
'<aa:id>' #PCDATA '</aa:id>'
'<aa:name>' #PCDATA '</aa:name>'
'</aa:field>')
<rdf:RDF> element
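On the receiving side, the field definitions can be collected into a lookup table keyed by URI, which also makes it easy to enforce the uniqueness requirement. This is a sketch only; the function name read_field_definitions and the sample document are assumptions for this example.

```python
import xml.etree.ElementTree as ET

AA = "http://www.apc.org/rss/aa-module.html"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Hypothetical document with two field definitions.
SAMPLE = """<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:aa="http://www.apc.org/rss/aa-module.html"
  xmlns="http://purl.org/rss/1.0/">
  <aa:field rdf:about="http://node.example.org/101">
    <aa:id>101</aa:id>
    <aa:name>Headline</aa:name>
  </aa:field>
  <aa:field rdf:about="http://node.example.org/102">
    <aa:id>102</aa:id>
    <aa:name>Full text</aa:name>
  </aa:field>
</rdf:RDF>"""


def read_field_definitions(root):
    """Map each field URI to its (id, name) pair; reject duplicate URIs."""
    fields = {}
    for field in root.findall(f"{{{AA}}}field"):
        uri = field.get(f"{{{RDF}}}about")
        if uri in fields:
            raise ValueError(f"duplicate field URI: {uri}")
        fields[uri] = (
            field.findtext(f"{{{AA}}}id"),
            field.findtext(f"{{{AA}}}name"),
        )
    return fields


fields = read_field_definitions(ET.fromstring(SAMPLE))
print(fields)
```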
These elements appear within the <channel> element:
The element carries the timestamp of the newest item in the channel. The time must be given in UTC.
<aa:NewestItemTimestamp> YYYY-MM-DDThh:mm:ssZ </aa:NewestItemTimestamp>
required
<aa:NewestItemTimestamp> (#PCDATA) as in W3C note datetime ( http://www.w3.org/TR/NOTE-datetime)
channel element
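A timestamp value can be produced as sketched below. The sketch assumes that the complete date-and-time form of the referenced W3C note is intended (colons between hours, minutes and seconds, and "Z" as the UTC designator); the function name format_newest_item_timestamp is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def format_newest_item_timestamp(moment):
    """Render an aware datetime as a UTC timestamp, e.g. 2001-03-14T13:09:26Z."""
    if moment.tzinfo is None:
        # A naive datetime has no known offset, so it cannot be
        # converted to UTC reliably.
        raise ValueError("naive datetime: the time zone is unknown")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


# Hypothetical item time, recorded two hours ahead of UTC.
local_time = datetime(2001, 3, 14, 15, 9, 26, tzinfo=timezone(timedelta(hours=2)))
stamp = format_newest_item_timestamp(local_time)
print(stamp)
```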
This element contains references to the categories used in a channel. It contains an RDF bag. Each member of the bag refers to one category. The categories themselves must be defined elsewhere in the document ( aa:category in rdf:RDF ).
<aa:categories><rdf:Bag>
<rdf:li rdf:resource="{category_uri}"/>
<rdf:li rdf:resource="{category_uri}"/>
...
</rdf:Bag></aa:categories>
optional
<aa:categories> ('<rdf:Bag>' category_reference+ '</rdf:Bag>')
category_reference ('<rdf:li rdf:resource="' #PCDATA '"/>')
channel element
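Since a listing element is simply an RDF bag of references, it can be generated by one small helper. This sketch uses Python's ElementTree; the helper name build_reference_bag and the sample URIs are assumptions for this example.

```python
import xml.etree.ElementTree as ET

AA = "http://www.apc.org/rss/aa-module.html"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def build_reference_bag(container_tag, uris):
    """Wrap references to the given URIs in <container><rdf:Bag>...</rdf:Bag>."""
    container = ET.Element(container_tag)
    bag = ET.SubElement(container, f"{{{RDF}}}Bag")
    for uri in uris:
        # Each member only points at a resource defined elsewhere.
        ET.SubElement(bag, f"{{{RDF}}}li", {f"{{{RDF}}}resource": uri})
    return container


# Hypothetical category URIs, for illustration only.
categories = build_reference_bag(
    f"{{{AA}}}categories",
    ["http://node.example.org/11", "http://node.example.org/12"],
)
references = [li.get(f"{{{RDF}}}resource") for li in categories.iter(f"{{{RDF}}}li")]
print(references)
```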
This element contains references to the fields used in the slice or feed (channel). The fields must all be enumerated here so items can refer to them. All fields mentioned here must actually be defined in the same document (see aa:field in rdf:RDF ). For this purpose, each field must get its own unique URI which must at least be valid throughout this document. The URI should consist of the slice id and the field id which exist anyway in the AA. The fields are
<aa:fields><rdf:Bag>
<rdf:li rdf:resource="{field_uri}"/>
<rdf:li rdf:resource="{field_uri}"/>
...
</rdf:Bag></aa:fields>
required
<aa:fields> ('<rdf:Bag>' field_reference+ '</rdf:Bag>')
field_reference ('<rdf:li rdf:resource="' #PCDATA '"/>')
channel element
These elements appear within <item> elements. They mainly represent the content of the item while referring to categories and fields outside the item element.
The element indicates the categories this item belongs to. It refers to one or more of the categories defined in <aa:category> elements (see aa:category in rdf:RDF ).
The element works the same way as in <channel> elements. For the syntax, requirement, and model description see there: aa:categories in channel .
The element serves as a general purpose field data container. All field data is carried in this element. All field data that can be represented using Dublin Core elements must appear in a Dublin Core element, too, which means that some of the field data is duplicated. The container is an RDF bag, which means it has no order; it is just a collection of things.
The container contains <aa:fielddata> elements which in turn contain the value of one field for this item each. The field is referenced by a <aa:field> element. An optional <aa:fieldflags> field can contain flags such as FLAG_FREEZE or others as defined elsewhere by the AA.
An optional <aa:format> element specifies the format used by the field content. It works the same way as the content module format element, except that it is optional. The empty element with an rdf:resource attribute references the format for the content. Possible formats are the same as for the content ( content:items ). It is possible to extend this in the future. The absence of this element means that the content is plain text (the same as <aa:format rdf:resource="http://www.isi.edu/in-notes/iana/assignments/media-types/text/plain"/>).
The actual content of the field is stored in the <rdf:value> element. The format of the data must be as indicated by the aa:format element. If it is plain text, it must use the entity escape mechanism for the characters '<', '>', '&' and '"'. For data that is not well-formed XML, such as HTML, the XML CDATA mechanism should be used.
<aa:fielddatacont><rdf:Bag>
<rdf:li><aa:fielddata>
<aa:field rdf:resource="{field_uri}"/>
<aa:fieldflags>{field flags}</aa:fieldflags>
<aa:format rdf:resource="{format nature}"/>
<rdf:value>{field content}</rdf:value>
</aa:fielddata></rdf:li>
<rdf:li><aa:fielddata>
<aa:field rdf:resource="{field_uri}"/>
<aa:fieldflags>{field flags}</aa:fieldflags>
<aa:format rdf:resource="{format nature}"/>
<rdf:value>{field content}</rdf:value>
</aa:fielddata></rdf:li>
...
</rdf:Bag></aa:fielddatacont>
required for each field
<aa:fielddatacont> ('<aa:fielddatacont><rdf:Bag>'
fielddata_item+
'</rdf:Bag></aa:fielddatacont>')
fielddata_item ('<rdf:li><aa:fielddata>'
field_reference fielddata_flags?
fielddata_format?
fielddata_value
'</aa:fielddata></rdf:li>')
field_reference ('<aa:field rdf:resource="' #PCDATA
'"/>')
fielddata_flags ('<aa:fieldflags>' #PCDATA
'</aa:fieldflags>')
fielddata_format ('<aa:format rdf:resource="' #PCDATA
'"/>')
fielddata_value ('<rdf:value>' #PCDATA
'</rdf:value>')
item element
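Reading the container back could look like the following sketch, which also applies the rule that a missing <aa:format> means plain text. The function name read_field_data and the sample item are assumptions for this example; the HTML format URI is copied as it is written in this document's format list.

```python
import xml.etree.ElementTree as ET

NS = {
    "rss": "http://purl.org/rss/1.0/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "aa": "http://www.apc.org/rss/aa-module.html",
}
RESOURCE = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}resource"
PLAIN = "http://www.isi.edu/in-notes/iana/assignments/media-types/text/plain"
HTML = "http://www.isi.edu/in-notes/iana/assignments/media-types/text/html/"

# Hypothetical item: one HTML field in CDATA, one plain text field.
SAMPLE = f"""<rdf:RDF xmlns:rdf="{NS['rdf']}" xmlns:aa="{NS['aa']}" xmlns="{NS['rss']}">
  <item rdf:about="http://node.example.org/9001">
    <aa:fielddatacont><rdf:Bag>
      <rdf:li><aa:fielddata>
        <aa:field rdf:resource="http://node.example.org/102"/>
        <aa:format rdf:resource="{HTML}"/>
        <rdf:value><![CDATA[<b>Hello</b>]]></rdf:value>
      </aa:fielddata></rdf:li>
      <rdf:li><aa:fielddata>
        <aa:field rdf:resource="http://node.example.org/101"/>
        <rdf:value>Fish &amp; chips</rdf:value>
      </aa:fielddata></rdf:li>
    </rdf:Bag></aa:fielddatacont>
  </item>
</rdf:RDF>"""


def read_field_data(item):
    """Return one dict per <aa:fielddata>, defaulting the format to plain text."""
    values = []
    for data in item.findall("aa:fielddatacont/rdf:Bag/rdf:li/aa:fielddata", NS):
        fmt = data.find("aa:format", NS)
        values.append({
            "field": data.find("aa:field", NS).get(RESOURCE),
            "flags": data.findtext("aa:fieldflags", default=None, namespaces=NS),
            "format": fmt.get(RESOURCE) if fmt is not None else PLAIN,
            # The parser has already undone CDATA and entity escaping.
            "value": data.findtext("rdf:value", default="", namespaces=NS),
        })
    return values


item = ET.fromstring(SAMPLE).find("rss:item", NS)
values = read_field_data(item)
for entry in values:
    print(entry)
```

A bag has no order, so a consumer should match entries by their field reference rather than by position.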
The data that describes the feed or slice contains three RDF containers. One of them is defined in RSS 1.0; it contains the items. The other two are defined by the AA module; they contain the slice's categories and fields, respectively.
For containers, RDF offers two concepts to represent items in the containers: as referenced items and as inline items. This specification only allows RDF containers containing referenced items within channels. This means that the containers only contain references to the actual items. The references are made up using rdf:resource attributes, and the actual items must appear elsewhere in the document.
The slice or feed data is carried as follows:
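Because channel containers hold references only, a receiver can verify that every reference has a matching definition elsewhere in the document. The following is a minimal sketch of such a check; the function name unresolved_references and the sample document (which deliberately leaves one category undefined) are assumptions for this example.

```python
import xml.etree.ElementTree as ET

NS = {"rss": "http://purl.org/rss/1.0/"}
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
ABOUT = f"{{{RDF}}}about"
RESOURCE = f"{{{RDF}}}resource"

SAMPLE = """<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:aa="http://www.apc.org/rss/aa-module.html"
  xmlns="http://purl.org/rss/1.0/">
  <channel rdf:about="http://node.example.org/1">
    <aa:categories><rdf:Bag>
      <rdf:li rdf:resource="http://node.example.org/11"/>
      <rdf:li rdf:resource="http://node.example.org/12"/>
    </rdf:Bag></aa:categories>
  </channel>
  <aa:category rdf:about="http://node.example.org/11">
    <aa:id>11</aa:id>
    <aa:name>News</aa:name>
  </aa:category>
</rdf:RDF>"""


def unresolved_references(root):
    """List rdf:resource targets in channel bags that lack an rdf:about match."""
    defined = {el.get(ABOUT) for el in root.iter() if el.get(ABOUT) is not None}
    missing = []
    for channel in root.findall("rss:channel", NS):
        for member in channel.iter(f"{{{RDF}}}li"):
            target = member.get(RESOURCE)
            if target is not None and target not in defined:
                missing.append(target)
    return missing


missing = unresolved_references(ET.fromstring(SAMPLE))
print(missing)
```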
The categories referenced in the aa:categories in channel container must appear in the document data as follows:
The fields referenced in the aa:fields container must appear in the document data as follows:
The data that describes the actual items is carried as follows:
http://www.isi.edu/in-notes/iana/assignments/media-types/text/html/. The actual HTML data in the rdf:value element must be quoted using the <![CDATA[...]]> mechanism;
http://www.isi.edu/in-notes/iana/assignments/media-types/text/plain. The actual text data in the rdf:value element must use the entity escape mechanism for the characters '<', '>', '&', ''' and '"' (&lt; and so on).
Recommended best practice for the values of the Language element is defined by RFC 1766 [RFC1766] which includes a two-letter Language Code (taken from the ISO 639 standard [ISO639]), followed optionally, by a two-letter Country Code (taken from the ISO 3166 standard [ISO3166]). For example, 'en' for English, 'fr' for French, or 'en-uk' for English used in the United Kingdom.
The encoding in use for a slice within apcrss messages will match the encoding of the slice data.
Within the text fields, the characters '<', '>' and '&' must be encoded using the existing XML mechanisms ('&lt;', '&gt;', '&amp;').
Date and time values are encoded using the W3C note http://www.w3.org/TR/NOTE-datetime.
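Both encoding mechanisms can be sketched with Python's standard library. The helper names encode_plain_text and wrap_cdata are hypothetical; escaping the two quote characters in addition to '<', '>' and '&' is harmless in element content and matches the longer character list given for plain text values.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def encode_plain_text(text):
    """Entity-escape plain text for use inside an XML element."""
    # escape() always handles &, < and >; the quotes are added explicitly.
    return escape(text, {'"': "&quot;", "'": "&apos;"})


def wrap_cdata(markup):
    """Wrap markup such as HTML in a CDATA section."""
    # "]]>" cannot appear inside a CDATA section, so it is split
    # across two adjacent sections.
    return "<![CDATA[" + markup.replace("]]>", "]]]]><![CDATA[>") + "]]>"


plain = encode_plain_text('Fish & "chips" <cheap>')
html = wrap_cdata("<p>a ]]> b</p>")
print(plain)
print(html)

# Round trip: an XML parser recovers the original strings.
parsed_plain = ET.fromstring(f"<v>{plain}</v>").text
parsed_html = ET.fromstring(f"<v>{html}</v>").text
```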
Examples of RSS 1.0 / AA module data files are