Sunday, May 11, 2003
By Paul Ford
Notes from the hypertext sweatlodge.
I am rewriting the code for Ftrain.com, so that it is generalized and easy to set up elsewhere. This is a rough start at defining that rewrite.
What concepts define the Ftrain.com SiteKit?
There are 4 concepts: the Arb, the Link, the Path, and the View.
- An Arb is an arbitrary block of content that can contain other Arbs. The collection of Arbs is the Default Hierarchy.
- A Link connects one Arb to another.
- An Arb that contains more than one link forms a Path.
- A View is a way of looking at a set of Arbs that excludes 0 or more Arbs and all of their descendants from the set.
The default hierarchy of Arbs forms a taxonomy, which can be exported to the OWL ontology definition language, which is in turn expressed in RDF, but that is not important right now. What's important right now is that you hug your kids.
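For the concretely minded, the four concepts above might be sketched in Python roughly like this. It is a minimal sketch, not the SiteKit's actual code: the class names, the field names, and the `view` and `path` helpers are all assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Link:
    """Connects one Arb to another; `has` is the optional role, e.g. "Author"."""
    ref: str
    has: Optional[str] = None


@dataclass
class Arb:
    """An arbitrary block of content that can contain other Arbs."""
    id: str
    title: str = ""
    links: list[Link] = field(default_factory=list)
    children: list[Arb] = field(default_factory=list)

    def path(self) -> list[str]:
        """An Arb with more than one link forms a Path: its ordered link targets."""
        return [link.ref for link in self.links] if len(self.links) > 1 else []


def view(root: Arb, excluded: set[str]) -> list[Arb]:
    """A View: every Arb under `root`, minus excluded Arbs and all their descendants."""
    if root.id in excluded:
        return []
    result = [root]
    for child in root.children:
        result.extend(view(child, excluded))
    return result


# Hypothetical sample hierarchy; only AfterDinner comes from the text.
poem = Arb("AfterDinner", "After Dinner",
           links=[Link("PaulFord", "Author"), Link("Poem", "Form")])
drafts = Arb("Drafts", children=[Arb("Unfinished")])
site = Arb("Ftrain", children=[poem, drafts])

print([arb.id for arb in view(site, {"Drafts"})])
print(poem.path())
```

Excluding `Drafts` also drops everything beneath it, which is the "and all of their descendants" part of the definition of a View.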
What does an Arb look like?
No one knows. It is, after all, arbitrary. But some have suggested one might look like this:
<f:arb display="Poem" id="AfterDinner">
  <f:title>After Dinner</f:title>
  <f:description>The beginnings of a poem by Paul Ford.</f:description>
  <f:added time="2003-05-11-22-00"/>
  <f:originally time="2003-05-11-22-00"/>
  <f:publish time="2003-05-12-00"/>
  <f:expire time="2003-05-12-00"/>
  <f:link has="Author" ref="PaulFord"/>
  <f:link has="Copyright" ref="FtrainCopyright"/>
  <f:link has="Form" ref="Poem"/>
  <f:content>
    <f:arb display="Stanza">
      <f:l>The <f:link ref="Machines">machines</f:link> in the air,</f:l>
      <f:l>They make them just like me,</f:l>
      <f:l>And drop us out</f:l>
      <f:l>without a parachute.</f:l>
    </f:arb>
    <f:space lines="10"/>
  </f:content>
</f:arb>
What the hell is going on there?
<f:arb display="Poem" id="AfterDinner">
This is an arbitrary unit of content which should be displayed in the way that the system formats poems. It has an id, AfterDinner, which is shorthand for http://www.ftrain.com/AfterDinner.html. With poems, the system creates a new page, numbers the lines of the poem, and takes other actions.
<f:title>After Dinner</f:title>
This is the title that will be displayed for the poem.
<f:description>The beginnings of a poem by Paul Ford.</f:description>
This is the description that will be displayed for the poem.
<f:added time="2003-05-11-22-00"/>
This is the time the poem was added to the set of Arbs.
<f:originally time="2003-05-11-22-00"/>
This is the time the poem was brought into the world (if this were a sonnet by Shakespeare, it would list a year in the 1500s).
<f:publish time="2003-05-12-00"/>
Whenever the publishing system publishes the collection of Arbs, it publishes this Arb if the current time is later than the time shown here. If the current time is earlier, the Arb is not published.
<f:expire time="2003-05-12-00"/>
Don't publish the Arb if the current time is later than the expiration time.
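The publish and expire rules amount to a time-window check, which could be sketched like so. The function names are hypothetical, and the time format is a guess: I am assuming the fields are year-month-day-hour-minute, with trailing fields optional, and that an Arb is published when the current time exactly equals either boundary (the rules above do not say).

```python
from datetime import datetime
from typing import Optional


def parse_time(value: str) -> datetime:
    """Parse a time such as "2003-05-11-22-00".

    Assumption: fields are year-month-day-hour-minute, and trailing
    fields may be omitted (so "2003-05-12-00" is midnight on May 12).
    """
    parts = [int(part) for part in value.split("-")]
    year, month, day = parts[:3]
    hour = parts[3] if len(parts) > 3 else 0
    minute = parts[4] if len(parts) > 4 else 0
    return datetime(year, month, day, hour, minute)


def should_publish(now: datetime, publish: Optional[str], expire: Optional[str]) -> bool:
    """True when `now` is not before the publish time and not after the expire time."""
    if publish is not None and now < parse_time(publish):
        return False  # too early: the publish time has not arrived yet
    if expire is not None and now > parse_time(expire):
        return False  # too late: the Arb has expired
    return True


print(should_publish(datetime(2003, 5, 11, 23, 0), "2003-05-12-00", None))
print(should_publish(datetime(2003, 5, 13, 0, 0), "2003-05-12-00", None))
```

A missing publish or expire time is treated here as "no restriction", which is also an assumption.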
<f:link has="Author" ref="PaulFord"/>
The author for this Arb is the Arb with the id equal to PaulFord. This assumes that there exists an Arb somewhere within the set of Arbs with the id PaulFord. The title of that Arb will be displayed according to the logic in the system that is defined for displaying links of type Author.
<f:link has="Copyright" ref="FtrainCopyright"/>
The copyright for this Arb is the Arb with the id equal to FtrainCopyright.
<f:link has="Form" ref="Poem"/>
The form for this Arb is the Arb with the id equal to Poem.
<f:content>
The content is a mixture of XHTML (probably XHTML 2.0) and Ftrain XML.
<f:arb display="Stanza">
This is an arbitrary unit of content which should be displayed in the way that the system formats a Stanza.
<f:l>The <f:link ref="Machines">machines</f:link> in the air,</f:l>
A line of text inside a stanza. This includes a link to the Machines arb, which will be displayed with the text machines. In the Machines Arb, the fact that this line/stanza/poem links to it will be registered.
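Registering that link on the Machines side means building a reverse index of links. Here is one way that might look, using Python's standard `xml.etree.ElementTree`. The namespace URI is made up for this example (the `f:` prefix has to be bound to something before a parser will accept it), and the `backlinks` helper is hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

NS = "urn:example:ftrain-sitekit"  # hypothetical namespace URI for the f: prefix

SOURCE = f"""
<f:arb xmlns:f="{NS}" display="Poem" id="AfterDinner">
  <f:link has="Author" ref="PaulFord"/>
  <f:content>
    <f:arb display="Stanza">
      <f:l>The <f:link ref="Machines">machines</f:link> in the air,</f:l>
    </f:arb>
  </f:content>
</f:arb>
"""


def backlinks(root):
    """Map each link target id to the (source id, role) pairs that point at it."""
    index = defaultdict(list)

    def walk(element, owner_id):
        # The nearest enclosing Arb that has an id is treated as the link's source.
        if element.tag == f"{{{NS}}}arb" and element.get("id"):
            owner_id = element.get("id")
        if element.tag == f"{{{NS}}}link":
            index[element.get("ref")].append((owner_id, element.get("has")))
        for child in element:
            walk(child, owner_id)

    walk(root, None)
    return dict(index)


index = backlinks(ET.fromstring(SOURCE))
print(index["Machines"])
```

The Stanza has no id of its own in the sample, so this sketch credits the link to the enclosing poem. Tracking the exact line or stanza would need some addressing scheme that the sample does not show.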
<f:l>They make them just like me,</f:l>
<f:l>And drop us out</f:l>
<f:l>without a parachute.</f:l>
</f:arb>
<f:space lines="10"/>
</f:content>
</f:arb>
And then we close it all down.
Are you serious? You expect me to care?
This is how it works: you build a taxonomy of Arbs, that is, Arbs-inside-Arbs, and within the Arbs you have Links, sometimes typed by their role (e.g. this is an Author), sometimes not. Everything on the site is represented by an Arb. By creating different “views” of the Arbs, using different sorting criteria, you generate calendars, timelines, tables of contents, indices of first lines, etc. These are themselves Arbs, and are drawn back into the Default Arb Hierarchy. When you have more than one link per Arb, as you do in the different Views, you're describing a Path: a way to navigate through the Arbs. The system will give you hints on how to navigate these paths, or most of them, just as it allows you to navigate by time, or by hierarchy, today.
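To make the "different sorting criteria" idea a little less abstract, here is a rough sketch of two such views and the Path each one implies. The `Entry` record, the function names, and the second sample entry are all invented for illustration. It also assumes timestamps are zero-padded and share one format, so that sorting them as strings sorts them by time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    id: str
    added: str       # zero-padded timestamp, e.g. "2003-05-11-22-00"
    first_line: str


def timeline(entries):
    """A view ordered by the time each Arb was added, oldest first."""
    return [e.id for e in sorted(entries, key=lambda e: e.added)]


def index_of_first_lines(entries):
    """A view ordered alphabetically by first line, ignoring case."""
    ordered = sorted(entries, key=lambda e: e.first_line.lower())
    return [(e.first_line, e.id) for e in ordered]


def as_path(view_ids):
    """Pair each id with its successor: the 'next' hints along the Path."""
    return list(zip(view_ids, view_ids[1:]))


# Sample data: the Machines entry is hypothetical.
entries = [
    Entry("AfterDinner", "2003-05-11-22-00", "The machines in the air,"),
    Entry("Machines", "2003-04-01-09-00", "We built them out of spare parts"),
]

print(timeline(entries))
print(index_of_first_lines(entries))
print(as_path(timeline(entries)))
```

Each view is just an ordered list of ids, so it can be stored as an Arb whose links point at its members, and that ordered run of links is what makes it a Path.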
Can you explain further?
Yes, but it's late and what a weekend.
Will it work?
Sure; it's just a highly abstract version of the tomfoolery that already lives below Ftrain, with the concept of a hierarchy of nodes replaced by an explicit taxonomy, one concept per node.
What's the point?
The point is to connect Web publishing with the concept of the ontology, so that content, added over time, forms a richly linked knowledge base which is addressable from anywhere else on the Web, logical to browse, available for processing by rule-based languages like Prolog, and easy to connect with other data, like the Open Directory, publicly available timelines, and other sites which export data in a format compatible with the Ftrain SiteKit.
Because the Default Arb Hierarchy can be exported as OWL, and OWL allows rdf:about statements to exist concerning resources, which serve to annotate them, it should be possible to provide resources for ontology-driven reasoning regarding the site and other sites which share information in a similar format. This will allow one to say, for instance, “show me all the poets on the site”, and since the system knows, or rather, can be told that individual authors of Arbs of the form Poem are Poets, it can list all the poets on the site without forcing the site creator to explicitly name an author as a Poet. This is useful in cases like Jorge Luis Borges, who is both poet and prose writer. Classifying him in the Taxonomy as one or the other would be mistaken; you can't classify him as both or else you have two Borges nodes. The right answer, then, is not to classify Borges in any more granular way than as a human, but to deduce his role from the sorts of things he writes.
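The "authors of Poems are Poets" rule is the kind of thing a reasoner would handle, but the gist can be sketched in a few lines of Python. Nothing here is the SiteKit's real data or API: the dictionary layout, the rule table, the `infer_roles` function, and the Borges Arb ids are placeholders invented for the example.

```python
def infer_roles(arbs, form_to_role):
    """Derive each author's roles from the forms of the Arbs they wrote."""
    roles = {}
    for arb in arbs:
        role = form_to_role.get(arb["form"])
        if role is not None:
            roles.setdefault(arb["author"], set()).add(role)
    return roles


# Hypothetical rule table: the form of an Arb implies a role for its author.
RULES = {"Poem": "Poet", "Story": "ProseWriter"}

# Placeholder Arbs; each records only its Author and Form links.
ARBS = [
    {"id": "AfterDinner", "author": "PaulFord", "form": "Poem"},
    {"id": "BorgesPoem", "author": "JorgeLuisBorges", "form": "Poem"},
    {"id": "BorgesStory", "author": "JorgeLuisBorges", "form": "Story"},
]

roles = infer_roles(ARBS, RULES)
poets = sorted(author for author, held in roles.items() if "Poet" in held)
print(poets)
```

Borges ends up holding both roles while still being a single node, which is the whole argument for deducing roles instead of filing authors under them.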
So the SiteKit handles the documents and connections between them. Other tools handle other aspects: web forms and text editors manage content creation; CVS manages version control.
When will it be done?
Proof of concept around June 1, with a documented rollout by July 1.
That's it? That made no sense.
I know, but you have to start somewhere.