<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>In my &#60;element/&#62;</title>
	<atom:link href="http://blogs.oucs.ox.ac.uk/jamesc/feed/" rel="self" type="application/rss+xml" />
	<link>http://blogs.oucs.ox.ac.uk/jamesc</link>
	<description>Work-Related Unkempt Thoughts</description>
	<lastBuildDate>Thu, 01 Dec 2011 21:27:56 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.5</generator>
		<item>
		<title>@rend and the war on text-bearing attributes</title>
		<link>http://blogs.oucs.ox.ac.uk/jamesc/2011/12/01/rend-and-the-war-on-text-bearing-attributes/</link>
		<comments>http://blogs.oucs.ox.ac.uk/jamesc/2011/12/01/rend-and-the-war-on-text-bearing-attributes/#comments</comments>
		<pubDate>Thu, 01 Dec 2011 21:25:44 +0000</pubDate>
		<dc:creator>James Cummings</dc:creator>
				<category><![CDATA[TEI]]></category>
		<category><![CDATA[XML]]></category>

		<guid isPermaLink="false">http://blogs.oucs.ox.ac.uk/jamesc/?p=280</guid>
		<description><![CDATA[The TEI attribute @rend from att.global, although it allows you to type just about anything in it, doesn&#8217;t actually allow anything more than a set of single tokens. I recently explained to John, Paul, George, or Ringo &#8230; <a href="http://blogs.oucs.ox.ac.uk/jamesc/2011/12/01/rend-and-the-war-on-text-bearing-attributes/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>The <a href="http://www.tei-c.org/">TEI</a> attribute @rend from <a href="http://www.tei-c.org/Vault/P5/current/doc/tei-p5-doc/en/html/ref-att.global.html">att.global</a>, although it allows you to type just about anything in it, doesn&#8217;t actually allow anything more than a set of single tokens. I recently explained to John, Paul, George, or Ringo (can&#8217;t remember which) that being able to type spaces really doesn&#8217;t mean that spaces are allowed inside a value, simply that whitespace is the delimiter between tokens in the attribute value.</p>
<p>The definition of @rend is &#8220;(rendition) indicates how the element in question was rendered or presented in the source text.&#8221; but it is very often used by some encoders to signal to processing how you want the <strong>output </strong>to appear.  In the remarks on the values allowed for the attribute it says:</p>
<blockquote><p>may contain any number of tokens, each of which may contain letters, punctuation marks, or symbols, but not word-separating characters.</p></blockquote>
<p>The point here is the &#8216;word-separating characters&#8217; part. So although you can say &lt;hi rend=&quot;It looks a bit like that other one&quot;&gt;text&lt;/hi&gt;, this actually has 8 tokens: &#8220;It&#8221;, &#8220;looks&#8221;, &#8220;a&#8221;, &#8220;bit&#8221;, &#8220;like&#8221;, &#8220;that&#8221;, &#8220;other&#8221;, &#8220;one&#8221;. Sometimes people put CSS or CSS-like rendition information into @rend and so have values like &#8220;text-align: right&#8221;. I would say this is wrong, or at least that it says there are two classifications applicable to the element&#8217;s rendition in the source material: one that it is &#8220;text-align:&#8221; and another that it is &#8220;right&#8221;.  Of course they could solve this just by deleting the space: &#8220;text-align:right&#8221; would be better, or even &#8220;text-align:right; font-size:large;&#8221; if you wanted to add another token.  However, even better would be to use @rendition to point to the @xml:id of at least one &lt;rendition&gt; element in the header.  This allows you to specify exactly what scheme you are using (e.g. CSS) and to give multiple statements for one classification.</p>
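<p>As a minimal sketch of the @rendition approach, a header might declare a couple of &lt;rendition&gt; elements and the text might then point to them. The identifiers &#8216;right&#8217; and &#8216;large&#8217; and the CSS statements are purely illustrative assumptions, not taken from any particular project:</p>
<pre class="brush: xml;">
&lt;!-- in the teiHeader, inside encodingDesc --&gt;
&lt;tagsDecl&gt;
  &lt;!-- scheme says how to read the content; the spaces here are element content, not attribute tokens --&gt;
  &lt;rendition xml:id=&quot;right&quot; scheme=&quot;css&quot;&gt;text-align: right;&lt;/rendition&gt;
  &lt;rendition xml:id=&quot;large&quot; scheme=&quot;css&quot;&gt;font-size: large;&lt;/rendition&gt;
&lt;/tagsDecl&gt;

&lt;!-- in the text: two pointers, each one a single token --&gt;
&lt;hi rendition=&quot;#right #large&quot;&gt;text&lt;/hi&gt;
</pre>
<p>The whitespace in the @rendition value still only separates tokens, but each token is now a pointer to a fully documented description rather than a magic word.</p>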
<p>Why does this matter you might ask? Well, of course, it doesn&#8217;t really &#8212; they are all magic tokens of one sort or the other to be interpreted (or not) by your processing for whatever reason you are undertaking the encoding. The &lt;rendition&gt; method is the most detailed in documenting precisely how you are interpreting the rendition in the original document.</p>
<p>However, the reason it matters to me is that there are NO attributes in the TEI which allow free-text.</p>
<p>By that I mean that all attributes are assigned to one datatype or another, and in none of them can you just type sentences of prose and have it be semantically meaningful.  This is a result of the long <strong>War on Text-Bearing Attributes </strong>that was undertaken in the run-up to the first release of TEI P5. One of its many principles was that, because <strong>any</strong> bit of free text <em>might</em> need to use a non-Unicode character, and the TEI&#8217;s method for documenting non-Unicode characters is its &lt;<a href="http://www.tei-c.org/Vault/P5/current/doc/tei-p5-doc/en/html/ref-g.html">g</a>&gt; element, you couldn&#8217;t have free-text attributes: you can&#8217;t use an element inside an attribute value. This is the reason for the creation of many new child elements like &lt;desc&gt; which are intended to contain free text concerning the nature of the element that contains them.</p>
<p>In the case of the @rend attribute, it allows one or more values (from one to infinity) of the <a href="http://www.tei-c.org/Vault/P5/current/doc/tei-p5-doc/en/html/ref-data.word.html">data.word</a> datatype.  This datatype, even in <a href="http://www.tei-c.org/Vault/P5/1.0.0/doc/tei-p5-doc/en/html/ref-data.word.html">P5 1.0.0</a>, &#8220;defines the range of attribute values expressed as a single word or token.&#8221;  Thus when people put space-separated characters into it, they are really putting in multiple tokens.  The war on text-bearing attributes attempted to limit the places where people were able to do this by the use of <a href="http://www.tei-c.org/Vault/P5/1.0.0/doc/tei-p5-doc/en/html/REF-MACROS.html">datatypes</a> and the removal of free text in attribute values.</p>
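<p>To make the tokenisation concrete, here is a rough XSLT 2.0 sketch of how processing might treat @rend as a whitespace-separated list. It assumes an XSLT 2.0 processor and documents in the usual TEI namespace; the HTML &lt;span&gt; output and the &#8216;rend-&#8217; class prefix are hypothetical choices for illustration only:</p>
<pre class="brush: xml;">
&lt;xsl:stylesheet version=&quot;2.0&quot;
    xmlns:xsl=&quot;http://www.w3.org/1999/XSL/Transform&quot;
    xmlns:tei=&quot;http://www.tei-c.org/ns/1.0&quot;
    exclude-result-prefixes=&quot;tei&quot;&gt;

  &lt;!-- Turn every token of @rend into its own (hypothetical) CSS class name --&gt;
  &lt;xsl:template match=&quot;tei:hi[@rend]&quot;&gt;
    &lt;!-- normalize-space() collapses runs of whitespace, so a single space is the delimiter --&gt;
    &lt;span class=&quot;{for $t in tokenize(normalize-space(@rend), ' ')
                  return concat('rend-', $t)}&quot;&gt;
      &lt;xsl:apply-templates/&gt;
    &lt;/span&gt;
  &lt;/xsl:template&gt;

&lt;/xsl:stylesheet&gt;
</pre>
<p>Given rend=&quot;text-align: right&quot;, a template like this sees two separate tokens, &#8220;text-align:&#8221; and &#8220;right&#8221;, and has no way of knowing that the encoder meant them as a single statement.</p>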
<p>This helps to highlight the difference between syntactic and semantic validity. Just because your document validates against a schema, does not mean that it is semantically valid.  You can put the text of a title inside an &lt;<a href="http://www.tei-c.org/Vault/P5/current/doc/tei-p5-doc/en/html/ref-author.html">author</a>&gt; element and vice-versa and there is no way your schema can know that you have done this.</p>
<p>So really, I&#8217;ve posted this post so I can point to it later when people ask me about spaces in @rend and similar datatype kerfuffles.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.oucs.ox.ac.uk/jamesc/2011/12/01/rend-and-the-war-on-text-bearing-attributes/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Is it Bill or Ben that is speaking of flowerpot men?</title>
		<link>http://blogs.oucs.ox.ac.uk/jamesc/2011/11/03/billorben/</link>
		<comments>http://blogs.oucs.ox.ac.uk/jamesc/2011/11/03/billorben/#comments</comments>
		<pubDate>Thu, 03 Nov 2011 23:02:42 +0000</pubDate>
		<dc:creator>James Cummings</dc:creator>
				<category><![CDATA[TEI]]></category>
		<category><![CDATA[XML]]></category>

		<guid isPermaLink="false">http://blogs.oucs.ox.ac.uk/jamesc/?p=251</guid>
		<description><![CDATA[A friend asked a question about how to encode a dramatic speech that possibly should be considered two speeches. Owing to a printing mistake, the second speaker&#8217;s name was omitted, so some consider it a single speech by the first &#8230; <a href="http://blogs.oucs.ox.ac.uk/jamesc/2011/11/03/billorben/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>A friend asked a question about how to encode a dramatic speech that possibly should be considered two speeches. Owing to a printing mistake, the second speaker&#8217;s name was omitted, so some consider it a single speech by the first speaker. However, a later hand has added the second speaker&#8217;s name in the margin after the fact, so some may wish to understand it as two speeches.  The question was how do you encode these two possibilities simultaneously.  Of course an entire stand-off solution is possible where you just mark the words and simultaneously mark word 1 to 20 as belonging to one speaker and the other speaker. But ignoring that more complicated solution here is some of the thinking I went through.</p>
<p>Let&#8217;s say we have some play, where Bill has two paragraphs. In the first he says &#8220;Bill and Ben, Bill and Ben,&#8221; and in the second he says &#8220;Bill and Ben, Bill and Ben, flowerpot men&#8221;. In <a href="http://www.tei-c.org/">TEI</a> we might encode this as:</p>
<pre class="brush: xml;">
&lt;!-- bill is speaker --&gt;
&lt;sp who=&quot;#bill&quot;&gt;
   &lt;!-- #bill points to more information about this speaker somewhere else in the document --&gt;
  &lt;speaker&gt;Bill&lt;/speaker&gt;
   &lt;p&gt;Bill and Ben, Bill and Ben,&lt;/p&gt;
   &lt;p&gt;Bill and Ben, Bill and Ben, flowerpot men&lt;/p&gt;
&lt;/sp&gt;
</pre>
<p>Now let&#8217;s say that the speaker marker &#8216;Bill&#8217; was there and it had been crossed out by a later hand and replaced by &#8216;Ben&#8217;.  We could indicate who we thought the real speaker was with the @who attribute whilst still retaining the orthographic distinction that a substitution had been made inside the &lt;speaker&gt;  element.</p>
<pre class="brush: xml;">
&lt;!-- Ben is speaker but a substitution noted--&gt;
&lt;sp who=&quot;#ben&quot;&gt;
 &lt;speaker&gt;
    &lt;subst&gt;
      &lt;del&gt;Bill&lt;/del&gt;
      &lt;add&gt;Ben&lt;/add&gt;
    &lt;/subst&gt;
  &lt;/speaker&gt;
  &lt;p&gt;Bill and Ben, Bill and Ben,&lt;/p&gt;
  &lt;p&gt;Bill and Ben, Bill and Ben, flowerpot men&lt;/p&gt;
&lt;/sp&gt;
</pre>
<p>But this means we have to make the editorial decision, for all outputs, that one of them (here &#8216;Ben&#8217;) is the speaker.  Another similar type of occurrence might be when Bill and Ben both say the paragraphs at the same time.  In this case, we just note both of them as speakers:</p>
<pre class="brush: xml;">
&lt;!-- bill and ben are both simultaneously speakers--&gt;
&lt;sp who=&quot;#bill #ben&quot;&gt;
  &lt;speaker&gt;Bill&lt;/speaker&gt;
  &lt;p&gt;Bill and Ben, Bill and Ben,&lt;/p&gt;
  &lt;p&gt;Bill and Ben, Bill and Ben, flowerpot men&lt;/p&gt;
&lt;/sp&gt;
</pre>
<p>Similar to this is the case where the entire speech is spoken by either Bill or Ben, but the text just says Bill. In this case one solution (of a number of them) is not to point to a &lt;person&gt; element but instead to point to a &lt;listPerson&gt; identified as &#8216;billOrBen&#8217;.  Then in processing we can choose to assign this to one or the other, even though the text still says &#8216;Bill&#8217;.  We&#8217;ve documented that we can only have one of them by using the @exclude attribute on each &lt;person&gt; element to point to the other.</p>
<pre class="brush: xml;">
&lt;!-- billOrBen listPerson is speaker, but contents are mutually exclusive, so sort out in processing --&gt;
&lt;sp who=&quot;#billOrBen&quot;&gt;
  &lt;speaker&gt;Bill&lt;/speaker&gt;
  &lt;p&gt;Bill and Ben, Bill and Ben,&lt;/p&gt;
  &lt;p&gt;Bill and Ben, Bill and Ben, flowerpot men&lt;/p&gt;
&lt;/sp&gt;

&lt;listPerson xml:id=&quot;billOrBen&quot;&gt;
  &lt;person xml:id=&quot;bill&quot; exclude=&quot;#ben&quot;&gt;&lt;persName&gt;Bill&lt;/persName&gt;&lt;/person&gt;
  &lt;person xml:id=&quot;ben&quot; exclude=&quot;#bill&quot;&gt;&lt;persName&gt;Ben&lt;/persName&gt;&lt;/person&gt;
&lt;/listPerson&gt;
</pre>
<p>But in the case that I was asked about, the speaker&#8217;s name is added partway through a speech. Now, one way to deal with this is just to say that &#8216;Bill&#8217; is the speaker, and the name &#8216;Ben&#8217; is just an addition in the text. There is nothing wrong with this: you&#8217;re just documenting the original printing and the addition of the new name, but not changing the structure of the text.</p>
<pre class="brush: xml;">
&lt;!-- bill is speaker but addition of  name partway through noted --&gt;
&lt;sp who=&quot;#bill&quot;&gt;
  &lt;speaker&gt;Bill&lt;/speaker&gt;
  &lt;p&gt;Bill and Ben, Bill and Ben,&lt;/p&gt;
  &lt;p&gt;&lt;add place=&quot;left&quot;&gt;&lt;name&gt;Ben&lt;/name&gt;&lt;/add&gt;Bill and Ben, Bill and Ben, flowerpot men&lt;/p&gt;
&lt;/sp&gt;
</pre>
<p>The other option, of course, would be to understand the intellectual content of the addition as splitting the two speeches, and encode not the original printed work, but the final version with the editorial addition provided by a later hand.  (So this would just be ).</p>
<pre class="brush: xml;">
&lt;!-- two speeches: bill speaks the first paragraph, ben (name added in the left margin) speaks the second --&gt;
&lt;sp who=&quot;#bill&quot;&gt;
  &lt;speaker&gt;Bill&lt;/speaker&gt;
  &lt;p&gt;Bill and Ben, Bill and Ben,&lt;/p&gt;
&lt;/sp&gt;

&lt;sp who=&quot;#ben&quot;&gt;
   &lt;speaker rend=&quot;left&quot;&gt;Ben&lt;/speaker&gt;
   &lt;p&gt;Bill and Ben, Bill and Ben, flowerpot men&lt;/p&gt;
&lt;/sp&gt;
</pre>
<p>But that isn&#8217;t really what was asked for&#8230; this says that there are two speeches, and while they want to have this as a possibility, they also want to record that it is possible that the &#8216;flowerpot men&#8217; paragraph was actually said by &#8216;Bill&#8217; and this &#8216;Ben&#8217; in the margin is just an addition. One way to do this is to use the @exclude attribute again and to do so at slightly different levels of granularity.</p>
<pre class="brush: xml;">
&lt;!-- bill speaks first bit, and possibly second bit, but possibly ben speaks second bit --&gt;
&lt;sp who=&quot;#bill&quot;&gt;
  &lt;speaker&gt;Bill&lt;/speaker&gt;
  &lt;p&gt;Bill and Ben, Bill and Ben,&lt;/p&gt;
  &lt;p exclude=&quot;#benPara&quot; xml:id=&quot;billPara&quot;&gt;
    &lt;add place=&quot;left&quot;&gt;&lt;name&gt;Ben&lt;/name&gt;&lt;/add&gt;
     Bill and Ben, Bill and Ben, flowerpot men&lt;/p&gt;
&lt;/sp&gt;

&lt;sp who=&quot;#ben&quot; exclude=&quot;#billPara&quot;&gt;
  &lt;speaker rend=&quot;left&quot;&gt;Ben&lt;/speaker&gt;
  &lt;p&gt;Bill and Ben, Bill and Ben, flowerpot men&lt;/p&gt;
&lt;/sp&gt;
</pre>
<p>In this case we&#8217;re saying that the second paragraph of Bill&#8217;s speech is mutually exclusive with the whole speech by Ben. In processing for any particular output we need to decide how to handle this: do we have the single speech by Bill (which has the addition of a name to the left of the second paragraph), or do we have a speech by Bill consisting of only the first paragraph, followed by a speech by Ben?</p>
<p>Another way to do this is to use the &lt;alt&gt; element to record this elsewhere. In this case you just need to make sure there are proper @xml:id attributes on all the elements you want to point to, so here &#8216;billPara2&#8217; is the second paragraph of Bill&#8217;s speech, and &#8216;benPara2&#8217; is the whole of Ben&#8217;s speech.  We then use the &lt;alt&gt; element to say that these two IDs are mutually exclusive, and specifically that we think it 70% likely that &#8216;billPara2&#8217; is the correct one to choose and only 30% likely that &#8216;benPara2&#8217; is the correct choice.</p>
<pre class="brush: xml;">
&lt;!-- bill speaks first bit, and possibly second bit, but (less) possibly ben speaks second bit  stand-off alternation--&gt;
&lt;sp who=&quot;#bill&quot;&gt;
  &lt;speaker&gt;Bill&lt;/speaker&gt;
  &lt;p&gt;Bill and Ben, Bill and Ben,&lt;/p&gt;
  &lt;p xml:id=&quot;billPara2&quot;&gt;&lt;add place=&quot;left&quot;&gt;Ben&lt;/add&gt;Bill and Ben, Bill and Ben, flowerpot men&lt;/p&gt;
&lt;/sp&gt;

&lt;sp who=&quot;#ben&quot; xml:id=&quot;benPara2&quot;&gt;
  &lt;speaker rend=&quot;left&quot;&gt;Ben&lt;/speaker&gt;
  &lt;p&gt;Bill and Ben, Bill and Ben, flowerpot men&lt;/p&gt;
&lt;/sp&gt;

&lt;alt mode=&quot;excl&quot; targets=&quot;#billPara2 #benPara2&quot; weights=&quot;0.7 0.3&quot;/&gt;
</pre>
<p>It is important to note that all of this is just a way to document whichever interpretation the encoder wishes to record.  I&#8217;m not aware of any off-the-shelf processing which will do anything with @exclude or &lt;alt&gt; elements. However, I can picture that doing this in XSLT would not necessarily be too onerous, depending on the circumstances in which it is used.</p>
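<p>As a rough indication of what such processing might look like, here is a minimal XSLT 2.0 sketch that resolves exclusive &lt;alt&gt; elements by keeping the most heavily weighted target and dropping the others. It is only a sketch under several assumptions: an XSLT 2.0 processor, documents in the TEI namespace, every &lt;alt mode=&quot;excl&quot;&gt; carrying a @weights value with one number per target, and all targets being local &#8216;#id&#8217; pointers. The variable name &#8216;rejected&#8217; is hypothetical.</p>
<pre class="brush: xml;">
&lt;xsl:stylesheet version=&quot;2.0&quot;
    xmlns:xsl=&quot;http://www.w3.org/1999/XSL/Transform&quot;
    xmlns:xs=&quot;http://www.w3.org/2001/XMLSchema&quot;
    xmlns:tei=&quot;http://www.tei-c.org/ns/1.0&quot;
    exclude-result-prefixes=&quot;xs tei&quot;&gt;

  &lt;!-- IDs of the targets that lose each exclusive alternation --&gt;
  &lt;xsl:variable name=&quot;rejected&quot; as=&quot;xs:string*&quot;&gt;
    &lt;xsl:for-each select=&quot;//tei:alt[@mode = 'excl']&quot;&gt;
      &lt;!-- strip the leading '#' from each pointer --&gt;
      &lt;xsl:variable name=&quot;ids&quot; select=&quot;for $t in tokenize(normalize-space(@targets), ' ')
                                       return substring-after($t, '#')&quot;/&gt;
      &lt;xsl:variable name=&quot;weights&quot; select=&quot;for $w in tokenize(normalize-space(@weights), ' ')
                                           return number($w)&quot;/&gt;
      &lt;!-- position of the (first) highest weight --&gt;
      &lt;xsl:variable name=&quot;best&quot; select=&quot;index-of($weights, max($weights))[1]&quot;/&gt;
      &lt;!-- everything except the best target is rejected --&gt;
      &lt;xsl:sequence select=&quot;remove($ids, $best)&quot;/&gt;
    &lt;/xsl:for-each&gt;
  &lt;/xsl:variable&gt;

  &lt;!-- Identity transform: copy everything by default --&gt;
  &lt;xsl:template match=&quot;@*|node()&quot;&gt;
    &lt;xsl:copy&gt;
      &lt;xsl:apply-templates select=&quot;@*|node()&quot;/&gt;
    &lt;/xsl:copy&gt;
  &lt;/xsl:template&gt;

  &lt;!-- Suppress any element whose @xml:id lost an alternation --&gt;
  &lt;xsl:template match=&quot;*[@xml:id = $rejected]&quot;/&gt;

&lt;/xsl:stylesheet&gt;
</pre>
<p>With the weights given above, a stylesheet along these lines would keep &#8216;billPara2&#8217; and drop the whole of Ben&#8217;s speech; swapping the weights would do the reverse. Handling @exclude would need a similar, but separately made, decision about which of the mutually exclusive elements to prefer.</p>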
<p>Oh, and obviously the original enquiry did not use a play based on the Bill and Ben theme song, but a much more famous Renaissance poet and playwright.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.oucs.ox.ac.uk/jamesc/2011/11/03/billorben/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>TEI P4 Support, Survey Results</title>
		<link>http://blogs.oucs.ox.ac.uk/jamesc/2011/10/27/tei_p4_support/</link>
		<comments>http://blogs.oucs.ox.ac.uk/jamesc/2011/10/27/tei_p4_support/#comments</comments>
		<pubDate>Thu, 27 Oct 2011 15:00:48 +0000</pubDate>
		<dc:creator>James Cummings</dc:creator>
				<category><![CDATA[TEI]]></category>

		<guid isPermaLink="false">http://blogs.oucs.ox.ac.uk/jamesc/?p=216</guid>
		<description><![CDATA[Introduction This post contains the results of a survey that  collected information which the TEI Technical Council will use to assess the need for ongoing support for the TEI P4 version of its Guidelines. These have largely been replaced by the TEI P5 &#8230; <a href="http://blogs.oucs.ox.ac.uk/jamesc/2011/10/27/tei_p4_support/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<h2>Introduction</h2>
<p>This post contains the results of a survey that  collected information which the <a href="http://www.tei-c.org/">TEI</a> Technical Council will use to assess the need for ongoing support for the <a href="http://www.tei-c.org/Guidelines/P4/">TEI P4</a> version of its Guidelines. These have largely been replaced by the <a href="http://www.tei-c.org/release/doc/tei-p5-doc/en/html/index-toc.html">TEI P5 Guidelines</a> since November 2007. At that point it was promised that support would continue for TEI P4 for 5 years, until November 2012. As that is just over a year away we are starting a slow process of phasing out support for the TEI P4 Guidelines. The TEI Technical Council is planning to de-emphasize the appearance of TEI P4 as an offering since support for it will be ending in November 2012. We will continue to support it over the next year but may take steps to stop it being indexed by search engines or make it less prominent on the website. These are the results of this survey, which I&#8217;ve also transformed to TEI P5 XML at <a href="http://users.ox.ac.uk/~jamesc/SurveySummary.tei.xml">http://users.ox.ac.uk/~jamesc/SurveySummary.tei.xml</a>.</p>
<h3>1. <span>Are you involved with projects that are still using TEI P4?</span></h3>
<div id="attachment_217" class="wp-caption aligncenter" style="width: 756px"><a rel="attachment wp-att-217" href="http://blogs.oucs.ox.ac.uk/jamesc/2011/10/27/tei_p4_support/survey1/"><img class="size-full wp-image-217" src="http://blogs.oucs.ox.ac.uk/jamesc/files/2011/10/survey1.png" alt="" width="746" height="814" /></a><p class="wp-caption-text">Answers for Question 1</p></div>
<p>My reading of these results is that many people are either not using TEI P4 or planning to migrate from it to TEI P5. I suspect, given the other answers, that those with TEI P4 projects probably do not rely on a lot of support from the TEI Consortium.</p>
<h3><abbr title="Question 2">2</abbr>. How important is ongoing TEI P4 support to you?</h3>
<div id="attachment_218" class="wp-caption aligncenter" style="width: 748px"><a rel="attachment wp-att-218" href="http://blogs.oucs.ox.ac.uk/jamesc/2011/10/27/tei_p4_support/survey2/"><img class="size-full wp-image-218" src="http://blogs.oucs.ox.ac.uk/jamesc/files/2011/10/survey2.png" alt="" width="738" height="494" /></a><p class="wp-caption-text">Answers to question 2</p></div>
<p style="text-align: center"><a rel="attachment wp-att-239" href="http://blogs.oucs.ox.ac.uk/jamesc/2011/10/27/tei_p4_support/chart2-2/"><img class="aligncenter size-full wp-image-239" src="http://blogs.oucs.ox.ac.uk/jamesc/files/2011/10/Chart21.png" alt="" width="800" height="600" /></a></p>
<p>This seems fairly clear: out of 54 respondents, 44 said it was not important, unnecessary or that we should get rid of it. But that it is important or very important for 18.5% of respondents is still significant and must be remembered when making decisions concerning ongoing support for TEI P4.</p>
<h3><abbr title="Question 3">3</abbr>. How much should the TEI Consortium begin to de-emphasize TEI P4 on its website before November 2012?</h3>
<div id="attachment_219" class="wp-caption aligncenter" style="width: 758px"><a rel="attachment wp-att-219" href="http://blogs.oucs.ox.ac.uk/jamesc/2011/10/27/tei_p4_support/survey3/"><img class="size-full wp-image-219" src="http://blogs.oucs.ox.ac.uk/jamesc/files/2011/10/survey3.png" alt="" width="748" height="608" /></a><p class="wp-caption-text">Answers to question 3</p></div>
<p>There seems to be a strong vote for making TEI P4 available only from the TEI Vault and making sure existing links redirect.</p>
<h3><abbr title="Question 4">4</abbr>. Should search engines be dissuaded from indexing TEI P4 materials?</h3>
<div id="attachment_220" class="wp-caption aligncenter" style="width: 766px"><a rel="attachment wp-att-220" href="http://blogs.oucs.ox.ac.uk/jamesc/2011/10/27/tei_p4_support/survey4/"><img class="size-full wp-image-220" src="http://blogs.oucs.ox.ac.uk/jamesc/files/2011/10/survey4.png" alt="" width="756" height="540" /></a><p class="wp-caption-text">Answers to question 4</p></div>
<p>This result is less clear cut with some people feeling it shouldn&#8217;t be indexed, and some people thinking it should be (with slightly more weight on it being indexed than not indexed).</p>
<h3><abbr title="Question 5">5</abbr>. Approximately how many TEI P4 projects have you been involved with?</h3>
<div id="attachment_221" class="wp-caption aligncenter" style="width: 756px"><a rel="attachment wp-att-221" href="http://blogs.oucs.ox.ac.uk/jamesc/2011/10/27/tei_p4_support/survey5/"><img class="size-full wp-image-221" src="http://blogs.oucs.ox.ac.uk/jamesc/files/2011/10/survey5.png" alt="" width="746" height="444" /></a><p class="wp-caption-text">Answers to question 5</p></div>
<p>This is simply a statistical question (and of course depends how the respondent interprets &#8216;projects&#8217;). It is interesting that the majority of people seem to be involved with more than one project, but that is hardly unexpected. More were involved with 6-15 projects than I thought.</p>
<h3><abbr title="Question 6">6</abbr>. Approximately how many TEI P5 projects have you been involved with?</h3>
<div id="attachment_222" class="wp-caption aligncenter" style="width: 756px"><a rel="attachment wp-att-222" href="http://blogs.oucs.ox.ac.uk/jamesc/2011/10/27/tei_p4_support/survey6/"><img class="size-full wp-image-222" src="http://blogs.oucs.ox.ac.uk/jamesc/files/2011/10/survey6.png" alt="" width="746" height="440" /></a><p class="wp-caption-text">Answers to question 6</p></div>
<p>It is interesting that the percentages are vaguely the same as with TEI P4 projects, though slightly higher overall.</p>
<h3><abbr title="Question 7">7</abbr>. What amount of TEI P4 data do your projects have? (In documents, number of files, how many megabytes, or whatever convenient measure makes sense for your project)</h3>
<p>This was a textual question, attempting to get a measure of how much TEI P4 material people have. It was deliberately left vague as to how it should be expressed, partly because I was interested to see how people would quantify their TEI P4 data, and partly because I recognise that it would be difficult for everyone to provide the same form of measurement.  I was interested to see that this ranged more widely than I had expected.</p>
<div>
<ul>
<li>0</li>
<li>none</li>
<li>zero</li>
<li>Several hundred files.</li>
<li>I have about 500 texts</li>
<li>3,200 files, 170Mb.</li>
<li>nil</li>
<li>Very roughly: 60,000 books = 5 million pages = 10 GB of marked-up text.</li>
<li>40 megabytes in the one P4 project I still manage; a bunch more in ones I&#8217;m no longer involved in.</li>
<li>This varies a lot, but projects range from 3-150 MB In practice, the TEI files are a small part of the overall operation, which includes authority information usually in non-TEI format, and various generated TEI XML files used for web publication only</li>
<li>50 files</li>
<li>Appx. 7000 files, 29 MB total data</li>
<li>Appr. 6500 documents (mostly letters)</li>
<li>0</li>
<li>less than 10%</li>
<li>0</li>
<li>about 3,000 XML files currently in P4.</li>
<li>in summa: about 4 Mb</li>
<li>All of the [Institution]&#8216;s projects are in migration from p4 to p5, so this is a snapshot of the migration process. The data is migrated, but the sites are not all rewritten yet. My hope is that by May of 2012, all of the current [Institution] sites will be serving out texts based on p5.</li>
<li>0</li>
<li>Help files used by about 1000 Modes users.</li>
<li>5 text-critical editions</li>
<li>7000+ [P4 Customization] encoded letters</li>
<li>Main current project: several dozen megabytes including a few large files but mostly 10-20 kb: roughly 3000 files.</li>
<li>Roughly twelve published electronic editions, with at least a dozen more in the pipeline, in process of being finished (though they now have to be migrated to be published).</li>
<li>I have no clue, but it&#8217;s a lot.</li>
<li>The [Institution] has 113MB bytes of P4 documents, of archival interest only.</li>
<li>None, since we upgraded.</li>
<li>I&#8217;m not sure. I think I might have one project that is in TEI P4, but it&#8217;s a legacy project and I&#8217;m actually not positive. I haven&#8217;t looked at it in a while.</li>
<li>2.5 million text pages</li>
<li>zero</li>
<li>None</li>
<li>Between 300 and 600 files.</li>
<li>ca. 70 files</li>
<li>dozens of documents.</li>
<li>Lots. Can&#8217;t access the figures quickly.</li>
<li>700MB</li>
</ul>
</div>
<p>This ranges from zero to multiple gigabytes of TEI text. What I should have asked was &#8220;And is all the TEI freely available for download?&#8221; as, of course, that is something I&#8217;d like to encourage.</p>
<h3><abbr title="Question 8">8</abbr>. Please list the URLs of any TEI P4 projects you want us to know about.</h3>
<p>I&#8217;ve decided not to provide these on this summary, if projects wish to provide samples they should add them to  <a href="http://wiki.tei-c.org/index.php/Samples">http://wiki.tei-c.org/index.php/Samples</a> and/or describe their projects on the wiki.</p>
<h3><abbr title="Question 9">9</abbr>. Please list the URLs of any TEI P5 projects you want us to know about.</h3>
<p>I&#8217;ve decided not to provide these on this summary, if projects wish to provide samples they should add them to  <a href="http://wiki.tei-c.org/index.php/Samples">http://wiki.tei-c.org/index.php/Samples</a> and/or describe their projects on the wiki.</p>
<h3><strong><abbr title="Question 10">10</abbr>. Have you submitted a Bug or Feature Request to the TEI Technical Council?</strong></h3>
<div id="attachment_223" class="wp-caption aligncenter" style="width: 774px"><a rel="attachment wp-att-223" href="http://blogs.oucs.ox.ac.uk/jamesc/2011/10/27/tei_p4_support/survey10/"><img class="size-full wp-image-223" src="http://blogs.oucs.ox.ac.uk/jamesc/files/2011/10/survey10.png" alt="" width="764" height="922" /></a><p class="wp-caption-text">Answers to question 10</p></div>
<p><a rel="attachment wp-att-240" href="http://blogs.oucs.ox.ac.uk/jamesc/2011/10/27/tei_p4_support/chart10/"><img class="aligncenter size-full wp-image-240" src="http://blogs.oucs.ox.ac.uk/jamesc/files/2011/10/Chart10.png" alt="" width="800" height="600" /></a></p>
<p>Lots of people have submitted bug or feature requests, but most people have either only contributed to discussion or not submitted any. We should, of course, strive to increase feedback from the TEI community. I&#8217;d be interested in any ideas on how to make it easier for the community to participate.</p>
<h3><abbr title="Question 11">11</abbr>. Where do you think the TEI Technical Council should expend its time and effort?</h3>
<div id="attachment_224" class="wp-caption aligncenter" style="width: 758px"><a rel="attachment wp-att-224" href="http://blogs.oucs.ox.ac.uk/jamesc/2011/10/27/tei_p4_support/survey11/"><img class="size-full wp-image-224" src="http://blogs.oucs.ox.ac.uk/jamesc/files/2011/10/survey11.png" alt="" width="748" height="910" /></a><p class="wp-caption-text">Answers to question 11</p></div>
<p><a rel="attachment wp-att-241" href="http://blogs.oucs.ox.ac.uk/jamesc/2011/10/27/tei_p4_support/chart11/"><img class="aligncenter size-full wp-image-241" src="http://blogs.oucs.ox.ac.uk/jamesc/files/2011/10/Chart11.png" alt="" width="800" height="600" /></a></p>
<p>This is also an interesting result.  Scoring highest on &#8216;top priority&#8217; is the idea that the TEI Technical Council should spend its time fixing bugs and implementing feature requests by the community. Analysing where the TEI Guidelines could be improved and undertaking these improvements was also ranked highly, along with developing the infrastructural basis for future versions of the TEI Guidelines. What scored lower was the idea of the TEI Technical Council setting up a repository of TEI texts, or developing software to make publication of TEI texts easier. I would suspect that this is because maintaining the Guidelines is the central mandate of the TEI Technical Council, and looking for how they can be improved is related to that, while the creation of repositories is already done better by people who already focus on those activities.  Although it is a community-based activity, only the TEI is really in charge of maintaining the Guidelines, whereas any third party can develop software or archives.  We should certainly encourage those activities and implement community suggestions which facilitate the greater development of community software.</p>
<h3><abbr title="Question 12">12</abbr>. Any other comments?</h3>
<div>Here are the comments that I received (lightly edited), with my personal responses:</div>
<blockquote>
<div>For people with large repositories of transcriptions (where the text content will never be updated), markup stability is essential. P4 to P5 is not essential but recommended, but it&#8217;s going to mean a huge effort. My worry is that there will be a far too rapid succession to P6, P7, P8, etc which adds bells and whistles but does not contribute anything meaningful to static repositories.</div>
</blockquote>
<div>There is not necessarily any reason to migrate if your systems are set up and working fine with P4. I would, personally, recommend using P5 in any new project.  And then you probably reach a state where it is easier to migrate the P4 to P5 than support multiple systems, but different people&#8217;s experiences will vary.  The Birnbaum Doctrine suggested that the TEI Council should only move to new major versions (P6 etc.) when a large external technological change meant that it would be beneficial (e.g. SGML to XML) or a large internal infrastructural change (e.g. development of the P5 class system) was deemed significantly beneficial. I personally do not believe that we are at a juncture which would necessitate development of P6, rather I&#8217;d prefer to see P5 2.5, P5 4.5, P5 35.2, etc. than have people feel they need to move major versions.  This has its own challenges, of course, and your project in its TEI ODD can point to the very specific version of TEI P5 that it uses.</div>
<blockquote>
<div>Yes &#8211; thanks for doing such a great service to the community!</div>
</blockquote>
<div>You&#8217;re welcome, it was my pleasure. Although I know filling in surveys can be annoying I think it is a quick and easy way to get at least a vague indication of the community&#8217;s feeling on certain issues.</div>
<blockquote>
<div>I think that lack of easy tools for presentation / publication of TEI documents is a serious drawback. Many of my younger colleagues would learn (or actually have learned) the TEI editing in Oxygen, but they are unable &#8212; and not willing! &#8212; to learn XSLT for the presentation of their texts (not to mention the publication &#8211; servers etc.). An average user who is not able to modify Sebastian&#8217;s stylesheets for his edition is left completely alone with his/her TEI document (only *exceptionally*, an XSL-expert is available for help in big institutions). As for now, the TEI is an ideal tool for only one part of the communication chain &#8212; but not for the whole &#8230;</div>
</blockquote>
<div>This is of course difficult, but so is the publication of research in print or other mediums. Usually these forms of publication involve the work of other people, for which researchers pay in one way or another. Perhaps it is because I happen to help manage a service, <a href="http://www.oucs.ox.ac.uk/infodev/">InfoDev</a>, which would be more than happy to undertake paid work in this area for you and other external institutions, but I don&#8217;t see this as much of a hurdle. If the research is worthwhile, then hopefully funding is available, and some of this could be budgeted for technical development. That said, researchers often spend years learning ancient languages or obscure discipline-based technicalities, and arguably they should be able to learn some basic XSLT and HTML with a very small dedication of their time. Whether they should and could do this is, of course, a personal decision, but these are just more tools in a toolbox that might also include knowledge of how to write complex statistical queries or how to collaborate using version control systems. But again, we&#8217;re happy to undertake work, especially TEI-related work, on any part of a research project, from digitization to publication, analysis and visualization.</div>
<blockquote>
<div>Perhaps, a marketing campaign would help.</div>
</blockquote>
<div>This would perhaps help get more people involved in the TEI. We would want, I suggest, that anyone doing a humanities text project applying for funding should feel (or get the advice that) they should be using the TEI (or at least justifying why they are using some other open standard instead). I feel this is probably more in the mandate of the TEI Board than the TEI Technical Council, but would encourage SIGs and indeed individuals to undertake whatever outreach activities are feasible.</div>
<blockquote>
<div>about question 11 : it would be interesting to relate software/tools development and training/workshop. offering training sessions dedicated to one tool or category of tools, and looking at how people use tools IRL during the training sessions to get a better idea of need specifications&#8230; ?</div>
</blockquote>
<div>This would be interesting, though those who have only just been trained in tools are likely to perceive different needs from those who use them on a daily basis. But I do wonder whether this should be a priority for the TEI Technical Council, which has its hands full maintaining, improving, and extending the Guidelines themselves. We should encourage tool development by third parties, and facilitate this development where it is in our power.</div>
<blockquote>
<div>Please, please, please don&#8217;t spend time and money on building a TEI-wide repository. Instead, convince Google to recognize the TEI format so that one can easily do a web search for TEI texts. Then, get people to put their texts on the web. I think the building of publishing tools and education are very important, but that they shouldn&#8217;t be Council functions per se. Similarly, I think the interchange question is very, very important, but Council&#8217;s role in it should be limited. This is the kind of thing a SIG (or SIGs) should tackle, and Council should be involved in blessing/criticizing their output.</div>
</blockquote>
<div>Personally, I agree with you about building repositories. I feel there are more than enough people with a lot more experience in undertaking this kind of activity. There has already been discussion and work with Google regarding exporting from Google Books in TEI P5 XML format, which is promising. I agree the community, potentially through SIGs, can handle a lot of these issues. I worry about the idea of Council &#8220;blessing/criticizing&#8221; the output of SIGs, rather than just being on hand to provide support and implement changes recommended by them.</div>
<blockquote>
<div>Creating and managing a content repository is vastly different from developing and maintaining markup guidelines, and would require a serious redirection of TEI-c&#8217;s resources. Let others who are already in the repo business (e.g., HathiTrust, OTA) take care of that.</div>
</blockquote>
<div>I would agree with this, and it is what I would recommend to the TEI.</div>
<blockquote>
<div>Thank you for undertaking this survey.</div>
</blockquote>
<p>You&#8217;re welcome, it was my pleasure. I&#8217;m always interested in getting a sense of where the TEI community agrees on certain issues.</p>
<h3><strong><abbr title="Question 13">13</abbr>. You may optionally include your email address so we can contact you if (and only if) we have any follow-up questions concerning your responses.</strong></h3>
<p>I&#8217;m certainly not going to provide these for spam-bots!</p>
<h2>Conclusion</h2>
<p>My recommendation to the TEI Council is going to be that we slowly start phasing out TEI P4 support. Closer to the end-of-support date (November 2012) we should move the TEI P4 materials to the TEI Vault and redirect links there. I think this survey bears out my belief that the TEI Technical Council should focus on the maintenance and improvement of the Guidelines, and on looking for ways to improve these in the future.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.oucs.ox.ac.uk/jamesc/2011/10/27/tei_p4_support/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>TEI Consortium and its Future</title>
		<link>http://blogs.oucs.ox.ac.uk/jamesc/2011/09/21/tei-consortium/</link>
		<comments>http://blogs.oucs.ox.ac.uk/jamesc/2011/09/21/tei-consortium/#comments</comments>
		<pubDate>Wed, 21 Sep 2011 14:12:03 +0000</pubDate>
		<dc:creator>James Cummings</dc:creator>
				<category><![CDATA[TEI]]></category>

		<guid isPermaLink="false">http://blogs.oucs.ox.ac.uk/jamesc/?p=208</guid>
		<description><![CDATA[John Unsworth, interim chair of the TEI Consortium (TEI-C) has asked those running for TEI Board or TEI Technical Council, and those who are remaining in place to answer some questions regarding the development of the TEI.  I&#8217;m already serving a &#8230; <a href="http://blogs.oucs.ox.ac.uk/jamesc/2011/09/21/tei-consortium/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>John Unsworth, interim chair of the TEI Consortium (TEI-C), has asked those running for TEI Board or TEI Technical Council, and those who are remaining in place, to answer some questions regarding the development of the TEI. I&#8217;m already serving a term through 2012 so am not up for potential re-election this year. I&#8217;ve chosen to write my answers up as a blog post because I found it difficult to adhere to John&#8217;s plea for brevity.</p>
<blockquote><p>1) Should the TEI cease to collect membership fees, and cease to pay for meetings, publications, services, etc.?</p></blockquote>
<p>I feel it would be difficult for the TEI Consortium to continue its work without collecting membership fees. However, I think the majority of this money should not be reserved for travel. The majority of it should be available for application in the same manner as we have done the SIG grants in the past. (However, this might be used for travel for a particular TEI Technical Council additional workgroup, or bursaries for the conference, or targeted tool development (&#8216;bounties&#8217;) for tools useful to the TEI-C&#8217;s mission, amongst many other things.) There should not necessarily be any limits on what could qualify for an application for funding. Not all revenues would need to be spent in a single year.</p>
<blockquote><p>2) Assuming paid membership continues, should institutional members have a choice between paying in cash and paying by supporting the travel of their employees to meetings, or committing time on salary to work on TEI problems?</p></blockquote>
<p>The cost of running meetings for the TEI Board or Technical Council should mostly be borne by the institution and agreed to at time of nomination (i.e. if your institution won&#8217;t commit to fund your attendance (travel and subsistence) at a couple of meetings a year, then you should not necessarily be accepted as a candidate). I realise this is unfair, but so is participation in most standards-creating bodies, and there is nothing stopping significant participation by any member of the community (i.e. they don&#8217;t need to be on Board/Council to effect change). It may be that public funds could be sought by the institution or individual to further supplement this. TEI-C money would be used for any overall expenses, such as the costs of room hire, or such things not covered by institutions. If an institutional member was in dire straits financially, but the participation of a person elected from that institution was deemed to be of great benefit to the TEI-C, they could apply for support from the TEI-C. However, this should not be the norm. All Partner-level institutions should offer services as part of their partnership agreement in addition to the top-level membership fee. These partnership agreements should be made public on the TEI-C website. &#8216;Membership&#8217; at a lower non-Partner rate might be replaced solely by services. There should be nothing stopping voluntary participation in TEI-C activities by motivated individuals who are not institutional members.</p>
<blockquote><p>3) Should the TEI have individual members (paying or not) who can vote to elect people to the board and/or council?</p></blockquote>
<p>All members at every single level, especially including individual subscribers, should have a single vote. Institutions become Partners to support the TEI Consortium and tend to view it as participation in a standardization body; I doubt many care strongly about their privileged position of having a vote at election time. One vote for one member (whether individual, Partner, or otherwise).</p>
<blockquote><p>4) Should the email discussions of the TEI Board be publicly accessible?</p></blockquote>
<p>Yes. The TEI Technical Council archives were made public partly at my suggestion. See <a href="http://lists.village.virginia.edu/pipermail/tei-council/2006/005757.html">http://lists.village.virginia.edu/pipermail/tei-council/2006/005757.html</a> &#8230; in this post I assumed that the TEI Board mailing list might contain details that would be detrimental if made public. Having had reports back from institutional representatives on the mailing list, I no longer believe that this is true for the majority of posts there. I would recommend that when something of an extremely confidential nature is discussed, this should happen off the TEI Board mailing list, <strong>but that an edited summary of this discussion be posted back on the list for all to see</strong>. However, such <em>in camera</em> discussions should be very unusual and justified before taking place.</p>
<blockquote><p>5) Should the Board and the Council be combined into a single body, with subsets of that group having the responsibilities now assigned to each separate group?</p></blockquote>
<p>I agree that the TEI Board and TEI Technical Council might seem a bit cumbersome. I&#8217;ve been on the TEI Technical Council since 2004 and have enjoyed that it is not in its remit to worry about the fiscal, marketing, and organizational aspects of the TEI-C. Although I think the TEI Board could do a better job in these areas, especially marketing, these are not my strengths.  If they were merged together I think it might distract from the technical work. If we then made sub-groups with responsibilities for Board-like activities and Technical Council-like activities, aren&#8217;t we just reinventing the Board and Technical Council?  If the activities and discussions of the TEI Board were conducted publicly (i.e. the mailing list archives were public), then I think that would be enough. The community could then lobby elected individuals if they wished to get their points of view heard.</p>
<blockquote><p>6) Assuming we continue to collect funds, we will still have limited resources.  Given that, in the next two years, which of the following should be the TEI&#8217;s highest priority? Pick only one:</p>
<p>a) providing services that make it easy for scholars to publish and use TEI texts online<br />
b) providing workshops, training, and other on-ramp services that help people understand why they might want to use TEI and how to begin to do so<br />
<span style="text-decoration: line-through"> d</span>c) encouraging the development of third-party tools for TEI users<br />
d) ensuring that large amounts of lightly but consistently encoded texts (e.g., TEI Tite) are generated and made publicly available, perhaps in a central repository or at least through some centrally coordinated portal<br />
e) developing a roadmap for P6 that positions the TEI in relation to other standards (HTML5, RDF, etc.)<br />
f) tackling hard problems not addressed in other encoding schemes, in order to maximize the expressive and interpretive power of TEI</p></blockquote>
<p>This is a difficult choice because so many of these are things that I feel strongly need to be encouraged.</p>
<p>a) is very vague and I feel it is not the role of the TEI-C to be providing lots of services, rather maintaining a standard.<br />
b) also sounds good, but we already have lots of people providing training (my own institution included) at cost-recovery basis. Some more basic guides might be beneficial.<br />
c) The TEI-C can encourage these through SIG grants and bounties where appropriate, but third-party tools should be developed by third parties.<br />
d) I&#8217;m highly resistant to the idea that any TEI users should even <strong>see</strong> TEI Tite documents at all! This schema is <strong>not TEI Conformant or Conformable</strong> by itself as it breaks the TEI Abstract Model in several ways. Tite is fine as a mass-digitization schema, but should be transformed instantly and internally to the project to a proper TEI file with a &lt;teiHeader&gt;. I have nothing against lots of sample TEI texts being made available, in TEI Lite or, better, a different slimmed-down, mostly structural encoding. However, I think that having these all in one place is unlikely, and distributed collections of archives (all linked to from <a href="http://wiki.tei-c.org/index.php/Samples">http://wiki.tei-c.org/index.php/Samples</a> or another location, or through some OAI-PMH or RDF aggregator) are probably an easier start. Again, this should be done by the community, not the TEI-C. There are no barriers to the community just doing this and I know the Oxford Text Archive has some plans in this area.<br />
f) Is a possibility, but the suggestions and developments for the TEI Guidelines should come from the community. However, the TEI Guidelines are not Guidelines of the Gaps, handling just those things not done by other standards. They play nicely with other standards where at all possible, and developments should continue to improve them in this area.<br />
e) Which I&#8217;ve cunningly left to last, is probably central to what the TEI-C or at least the TEI Technical Council should be doing. We already have a statement on the conditions for maintenance of P5 and development of such things as P6 <a href="http://www.tei-c.org/Activities/Council/Working/tcw09.xml">http://www.tei-c.org/Activities/Council/Working/tcw09.xml</a> and I do not believe we have reached such a major change in technology or infrastructure to warrant TEI P6, yet. However, I agree that there are things we can do with the TEI Guidelines to help those seeking transformations to HTML5, RDF, and other newer formats, and that there are recommendations to be made in this area. I disagree entirely that these somehow replace the need for TEI. A roadmap is a good idea, but a lot of the necessary changes can be done under the umbrella of TEI P5 and its intended deprecation mechanisms.</p>
<p>So, on balance, I plump for &#8216;e)&#8217;, however I think all the other ideas are beneficial things, with c) and f) being my second choices.</p>
<p>Overall, I do not think the TEI-C is horribly broken, and believe that the TEI has a good and useful role to play in the development of digital resources. The suggested revisions moving towards openness and transparency would be beneficial. I feel the problems people have had with the TEI Board stem from not knowing what is going on there (lack of transparency) and members of the Board acting as individuals rather than remembering that they are there as representatives of the community at large.</p>
<p>-James</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.oucs.ox.ac.uk/jamesc/2011/09/21/tei-consortium/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Digital Humanities 2011</title>
		<link>http://blogs.oucs.ox.ac.uk/jamesc/2011/07/01/digital-humanities-2011/</link>
		<comments>http://blogs.oucs.ox.ac.uk/jamesc/2011/07/01/digital-humanities-2011/#comments</comments>
		<pubDate>Fri, 01 Jul 2011 11:39:02 +0000</pubDate>
		<dc:creator>James Cummings</dc:creator>
				<category><![CDATA[Conference]]></category>

		<guid isPermaLink="false">http://blogs.oucs.ox.ac.uk/jamesc/?p=197</guid>
		<description><![CDATA[Digital Humanities 2011 My report from Digital Humanities 2011 is below. If anyone wants any more information about the various sessions I attended, I&#8217;m happy to try and dredge my memory for a recollection of my impressions. Otherwise the book of abstracts is available. Most &#8230; <a href="http://blogs.oucs.ox.ac.uk/jamesc/2011/07/01/digital-humanities-2011/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<h1><span style="font-weight: normal">Digital Humanities 2011</span></h1>
<p>My report from Digital Humanities 2011 is below. If anyone wants any more information about the various sessions I attended, I&#8217;m happy to try and dredge my memory for a recollection of my impressions. Otherwise <a href="https://dh2011.stanford.edu/wp-content/uploads/2011/05/DH2011_BookOfAbs.pdf">the book of abstracts is available</a>. Most of the interesting things were really in between sessions and in the evenings, in talking to people about possible future projects, advertising InfoDev services, etc.</p>
<h2><span style="font-weight: normal">Friday 17 June 2011</span></h2>
<p>Sebastian and I took an afternoon flight to SFO to attend the <a href="https://dh2011.stanford.edu/">Digital Humanities 2011 conference</a>. I was lucky enough to get a row to myself, but Sebastian kept to his assigned seat rather than join me and be tormented by my cackling at juvenile films. I watched four films, the only one of which I&#8217;d recommend is <a href="http://www.imdb.com/title/tt1440292/">Submarine</a>, whose screenplay and direction were by Richard Ayoade. Sebastian&#8217;s estimate of c.250 is a bit off: according to the organisers there were about 375 registered participants, with various other hangers-on.</p>
<h2><span style="font-weight: normal">Saturday 18 June 2011</span></h2>
<p>Sebastian and I woke early (thank you jetlag) to teach our Introductory TEI ODD workshop at 8:30am. Unfortunately, nothing on campus that serves anything which even vaguely resembles food opens until 8am on a Saturday. The course materials are at: <a href="http://tei.oucs.ox.ac.uk/Talks/2011-06-18-odd/">http://tei.oucs.ox.ac.uk/Talks/2011-06-18-odd/</a> and we had about 15 participants. We went perhaps a bit too fast, and talked too long, but most of them made it through the first exercise. Some had difficulty with the idea that we weren&#8217;t teaching the stated prerequisite of TEI and XML but the TEI&#8217;s customization language instead. It really would have been better to do it as a full day workshop.</p>
<p>Afterwards Craig Bellamy and I drove (in a mustang he had rented) down to Santa Cruz and ate a burrito on the beach. It was better than the ones I get here in Oxford and was not dissimilar to the real thing. We also went to look at UC Santa Cruz where Craig had spent some undergraduate time, a truly bizarre campus. Craig is responsible for setting up the Australasian Association for Digital Humanities, see <a href="http://www.craigbellamy.net/2011/05/31/australasian-association-for-digital-humanities-aadh/">http://www.craigbellamy.net/2011/05/31/australasian-association-for-digital-humanities-aadh/</a> and <a href="http://aa-dh.org/">http://aa-dh.org/</a>, which is seeking to join ADHO (Alliance of Digital Humanities Organizations) alongside ACH, ALLC, and SDH-SEMI. Much of our conversation related to this topic and the ADHO Steering Committee meeting the next day. (Boy, don&#8217;t we know how to spoil a beach!) We returned to Stanford and met up with various other DH conference goers for &#8216;food&#8217; and &#8216;drink&#8217; in the local students&#8217; union.</p>
<h2><span style="font-weight: normal">Sunday 19 June 2011</span></h2>
<p>I intended to go swimming this day, but the lane swimming wasn&#8217;t open until the afternoon, so instead I rented a bicycle. I purchased a variety of items to put in the huge fridge that was part of the full-sized kitchen (with stove, sink, dishwasher, microwave, etc.) that was in my room. Sadly the kitchen didn&#8217;t come with anything useful to, you know, cook or eat with. It didn&#8217;t come with anything at all. Since Sebastian also had a bicycle we cycled to the Stanford Shopping Centre, where we looked around at things we could possibly buy, had lunch, and eventually cycled back to the residences. The conference&#8217;s opening plenary was by David Rumsey <a href="http://www.davidrumsey.com/">http://www.davidrumsey.com/</a> talking about &#8220;Reading Historical Maps Digitally: How Spatial Technologies Can Enable Close, Distant and Dynamic Interpretations&#8221; but partly seemed to be demonstrating the proprietary Luna Browser (<a href="http://www.davidrumsey.com/view/luna">http://www.davidrumsey.com/view/luna</a>), a Java servlet-based image viewer, which I didn&#8217;t like at all. At the reception afterwards there was much pleasant conversation.</p>
<h2><span style="font-weight: normal">Monday 20 June 2011</span></h2>
<p>I attended a morning session consisting of the following papers:</p>
<ul>
<li>Maciej Eder &amp; Jan Rybicki &#8220;Do Birds of a Feather Really Flock Together, or How to Choose Test Samples for Authorship Attribution&#8221;</li>
<li>Jan Rybicki &#8220;Alma Cardell Curtin and Jeremiah Curtin: The Translator’s Wife’s Stylistic Fingerprint.&#8221;</li>
<li>David L. Hoover &#8220;The Tutor&#8217;s Story: A Case Study of Mixed Authorship&#8221;</li>
</ul>
<p>And then one with:</p>
<ul>
<li>Yves Marcoux, Michael Sperberg-McQueen, &amp; Claus Huitfeldt &#8220;Expressive power of markup languages and graph structures&#8221;</li>
<li>Gary F. Simons, Steven Bird, Christopher Hirt, Joshua Hou, &amp; Sven Pedersen &#8220;Mining language resources from institutional repositories&#8221;</li>
<li>Thomas Eckart, David Pansch, &amp; Marco Büchler &#8220;Integration of Distributed Text Resources by Using Schema Matching Techniques&#8221;</li>
</ul>
<p>Of these the one by Yves Marcoux on OO-TexMECS was the most interesting (though Eckart&#8217;s showed some promise). However, I fundamentally disagreed that breaking XML is necessary for recording the majority of the graph data-structures he was presenting. TEI-style basic fragmentation, or even basic stand-off linking seems to do the trick in 99% of cases. It is an interesting discussion for markup geeks interested in the theory behind markup languages, but solving a problem that I feel isn&#8217;t really a problem for the majority of work we do here.</p>
<p>After lunch I went to a bit of:</p>
<ul>
<li>Reinhild Barkey, Erhard Hinrichs, Christina Hoppermann, Thorsten Trippel, &amp; Claus Zinn &#8220;Trailblazing through Forests of Resources in Linguistics&#8221;</li>
<li>Michele Pasin &#8220;Browsing highly interconnected humanities databases through multi-result faceted browsers&#8221;</li>
<li>Alan Galey &#8220;Approaching the Coasts of Utopia: Visualization Strategies for Mapping Early Modern Paratexts&#8221;</li>
</ul>
<p>before nipping off to the location where the posters were to be displayed and putting up my Wandering Jew&#8217;s Chronicle poster as well as Sebastian&#8217;s Claros poster, both right in front of the doors where you walk in, ensuring maximum throughput of people to look at them. The poster session was quite busy. Shortly beforehand I took photos of all the posters; however, these are on the camera which later went missing. There was a reception that followed this, but I was so busy talking to people about the poster that I seemed to miss it. Luckily someone brought me a drink (and we arranged a tour of SLAC for the next day).</p>
<h2><span style="font-weight: normal">Tuesday 21 June 2011</span></h2>
<p>Sebastian woke up extra early to go on a punishing &#8216;fun run&#8217; up huge mountains, whereas I slept in. From 08:30 we interviewed a potential ePub and/or OpenData intern via Skype. Since we&#8217;d missed the beginning of the sessions (and from the abstracts of them I didn&#8217;t feel cheated), while Sebastian went off to catch the end of the sessions, I cycled to the nearby B. Gerald Cantor Rodin Sculpture Garden and looked at a bronze cast of Rodin&#8217;s &#8220;The Gates of Hell&#8221;, see <a href="http://museum.stanford.edu/view/rodin__1985_86.html">http://museum.stanford.edu/view/rodin__1985_86.html</a></p>
<p>Afterwards I caught one of the next sessions, specifically a panel discussing &#8220;The Interface of the Collection&#8221; consisting of: Geoffrey Rockwell, Stan Ruecker, Mihaela Ilovan, Daniel Sondheim, Milena Radzikowska, Peter Organisciak, &amp; Susan Brown.</p>
<p>Over lunch, instead of nattering away to people about visualization, I joined a visit Mike Toth had arranged to the <a href="http://slac.stanford.edu/">Stanford Linear Accelerator Center</a>, <a href="http://yfrog.com/ke3m7tmj">http://yfrog.com/ke3m7tmj</a> now &#8216;SSRL&#8217;. He had done work here in X-ray fluorescence to uncover the <a href="http://www.archimedespalimpsest.org">Archimedes Palimpsest</a> and they wrote up a glowing press article about our visit: <a href="https://news.slac.stanford.edu/features/digital-humanities-experts-learn-how-ssrl-can-shed-light-past">https://news.slac.stanford.edu/features/digital-humanities-experts-learn-how-ssrl-can-shed-light-past</a></p>
<p>We can, indeed, use real science tools to help digital humanities.</p>
<p>After this I ate some lunch in the back of the following session:</p>
<ul>
<li>David Beavan &#8220;ComPair: Compare and Visualise the Usage of Language&#8221;</li>
<li>Trevor Muñoz, Virgil Varvel, Allen Renear, Kevin Trainor, &amp; Molly Dolan &#8220;Tasks vs. Roles: A Center Perspective on Data Curation Needs in the Humanities&#8221;</li>
<li>Deborah Anderson &#8220;Handling Glyph Variants: Issues and Developments&#8221;</li>
<li>Scott Weingart &amp; Jeana Jorgensen &#8220;Computational Analysis of Gender and the Body in European Fairy Tales&#8221;</li>
<li>Hiroyuki Akama, Maki Miyake, &amp; Jaeyoung Jung &#8220;Automatic Extraction of Hidden Keywords by Producing “Homophily” within Semantic Networks&#8221;</li>
</ul>
<p>Later we went to the Zampolli Prize Lecture in the Dinkelspiel Auditorium and listened to the winner, Chad Gaffield, tell us about &#8220;Re-Imagining Scholarship in the Digital Age&#8221;. This was a very motivational session by the president of the SSHRC funding body. I wouldn&#8217;t have been surprised if he had got everyone up and singing praises, but the auditorium was far too hot for that kind of thing.</p>
<h2><span style="font-weight: normal">Wednesday 22 June 2011</span></h2>
<p>This morning I went to the panel on &#8220;Integrating Digital Papyrology&#8221; featuring Gabriel Bodard, Hugh Cayless, Ryan Baumann, Joshua Sosin, &amp; Raffaele Viglianti.</p>
<p>After a break I attended &#8220;The “#alt-ac” Track: Digital Humanists off the Straight and Narrow Path to Tenure&#8221; featuring Bethany Nowviskie, Julia Flanders, Tanya Clement, Doug Reside, Dot Porter, &amp; Eric Rochester. Partly I attended because I have an article (as the last word) in the open access book they were launching <a href="http://mediacommons.futureofthebook.org/alt-ac/">http://mediacommons.futureofthebook.org/alt-ac/</a>.</p>
<p>After lunch there was a panel on Funding Digital Humanities, with funders from the USA and Canada. There was not a UK, European, Australian, Japanese, or Mexican funder represented. Still, it was good to hear what they said.</p>
<p>After this there was the closing plenary by JB Michel &amp; Erez Lieberman-Aiden who had worked with Google to produce the Google ngram viewer. The long &#8216;s&#8217; problem in OCR&#8217;ed data is clearly visible by looking at &#8216;best,beft&#8217; from 1700 to the modern day in <a href="http://ngrams.googlelabs.com/">http://ngrams.googlelabs.com/</a>. (Something I tweeted about a couple of days after its launch but using presumption vs prefumption.) Unlike Chad, who seemed to be celebrating what Digital Humanities had done, these two seemed intent on telling us quite obvious things that DH as a community should be doing&#8230; most of which I&#8217;m pretty sure we already are doing or striving to do. Because it had been so hot during Chad&#8217;s talk, on the way there I stopped to get a mango smoothie, which made the talk more tolerable.</p>
<p>Following this there was a banquet at the Computer History Museum in Mountain View. The food and drink were so-so, the company was excellent, and the museum was fairly USA-centric in its outlook.</p>
<h2><span style="font-weight: normal">Thursday 23 June 2011</span></h2>
<p>While most people went on organised tours to Silicon Valley or the Sonoma Wine Country, Craig Bellamy (with his mustang), Peter Organisciak and I drove up Highway 1, stopping off for delicious Mexican food and beaches, and crossing the Golden Gate Bridge. In S.F. we walked around Fisherman&#8217;s Wharf and some other places, before returning to Stanford. There was simultaneously a meeting on the curation of digital humanities data which I followed via Twitter.</p>
<h2><span style="font-weight: normal">Friday 24 June 2011</span></h2>
<p>I was flying home in the evening, so accompanied by Raffaele Viglianti I went to S.F. on the train, where we met up with some other people, wandered up and down the hills of Chinatown, had some dim sum, and eventually I caught a shared van to SFO to catch my flight. This time I got a seat in the much smaller &#8216;upper deck&#8217; of the plane, but still didn&#8217;t capitalise on it and watched several more films. Arrived back Saturday midday horribly jetlagged.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.oucs.ox.ac.uk/jamesc/2011/07/01/digital-humanities-2011/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>grouping by group-adjacent=&#8221;boolean(self::lb)&#8221;</title>
		<link>http://blogs.oucs.ox.ac.uk/jamesc/2011/05/24/grouping_by_group-adjacentbooleanselflb/</link>
		<comments>http://blogs.oucs.ox.ac.uk/jamesc/2011/05/24/grouping_by_group-adjacentbooleanselflb/#comments</comments>
		<pubDate>Tue, 24 May 2011 10:14:29 +0000</pubDate>
		<dc:creator>James Cummings</dc:creator>
				<category><![CDATA[TEI]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[XML]]></category>
		<category><![CDATA[XSLT]]></category>

		<guid isPermaLink="false">http://blogs.oucs.ox.ac.uk/jamesc/?p=180</guid>
		<description><![CDATA[A project I was doing some work for had some input that looked like: &#60;?xml version=&#34;1.0&#34; encoding=&#34;UTF-8&#34;?&#62; &#60;TEI xmlns:tei=&#34;http://www.tei-c.org/ns/1.0&#34; xmlns=&#34;http://www.tei-c.org/ns/1.0&#34;&#62; &#60;teiHeader xmlns:xi=&#34;http://www.w3.org/2001/XInclude&#34; type=&#34;text&#34;&#62; &#60;fileDesc&#62; &#60;titleStmt&#62; &#60;title&#62;A sample file&#60;/title&#62; &#60;/titleStmt&#62; &#60;publicationStmt&#62; &#60;distributor&#62;InfoDev&#60;/distributor&#62; &#60;/publicationStmt&#62; &#60;sourceDesc&#62; &#60;p&#62;VSARPJ project&#60;/p&#62; &#60;/sourceDesc&#62; &#60;/fileDesc&#62; &#60;profileDesc&#62; &#60;creation&#62; &#60;date/&#62; &#8230; <a href="http://blogs.oucs.ox.ac.uk/jamesc/2011/05/24/grouping_by_group-adjacentbooleanselflb/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>A project I was doing some work for had some input that looked like: </p>
<pre class="brush: xml;">
&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot;?&gt;
&lt;TEI xmlns:tei=&quot;http://www.tei-c.org/ns/1.0&quot; xmlns=&quot;http://www.tei-c.org/ns/1.0&quot;&gt;
&lt;teiHeader xmlns:xi=&quot;http://www.w3.org/2001/XInclude&quot; type=&quot;text&quot;&gt;
&lt;fileDesc&gt;
   &lt;titleStmt&gt;
      &lt;title&gt;A sample file&lt;/title&gt;
   &lt;/titleStmt&gt;
   &lt;publicationStmt&gt;
      &lt;distributor&gt;InfoDev&lt;/distributor&gt;
   &lt;/publicationStmt&gt;
   &lt;sourceDesc&gt;
      &lt;p&gt;VSARPJ project&lt;/p&gt;
   &lt;/sourceDesc&gt;
&lt;/fileDesc&gt;
&lt;profileDesc&gt;
   &lt;creation&gt;
      &lt;date/&gt;
   &lt;/creation&gt;
   &lt;langUsage&gt;
      &lt;language ident=&quot;ojp&quot;&gt;Old Japanese&lt;/language&gt;
   &lt;/langUsage&gt;
   &lt;textClass&gt;
      &lt;catRef target=&quot;#bussoku&quot;/&gt;
   &lt;/textClass&gt;
&lt;/profileDesc&gt;
&lt;encodingDesc&gt;
   &lt;samplingDecl&gt;
      &lt;p&gt;This text was transcribed phonemically and edited to parallel the content from the
         corresponding item in the &lt;title&gt;Nihon koten bungaku taikei&lt;/title&gt; version of the
            &lt;title&gt;Man'yôshû&lt;/title&gt;, &lt;ref&gt;Man'yôshû I&lt;/ref&gt;. &lt;/p&gt;
   &lt;/samplingDecl&gt;
&lt;/encodingDesc&gt;
&lt;/teiHeader&gt;
&lt;text&gt;
&lt;body xml:id=&quot;BS.1&quot;&gt;
   &lt;div&gt;
      &lt;ab type=&quot;original&quot; xml:lang=&quot;ojp&quot;&gt; 美阿止都久留 &lt;lb xml:id=&quot;BS.1-orig_1&quot;
            corresp=&quot;#BS.1-trans_1&quot;/&gt; 伊志乃比鼻伎波 &lt;lb xml:id=&quot;BS.1-orig_2&quot; corresp=&quot;#BS.1-trans_2&quot;
         /&gt; 阿米爾伊多利 &lt;lb xml:id=&quot;BS.1-orig_3&quot; corresp=&quot;#BS.1-trans_3&quot;/&gt; 都知佐閇由須礼 &lt;lb
            xml:id=&quot;BS.1-orig_4&quot; corresp=&quot;#BS.1-trans_4&quot;/&gt; 知知波波賀多米爾 &lt;lb xml:id=&quot;BS.1-orig_5&quot;
            corresp=&quot;#BS.1-trans_5&quot;/&gt; 毛呂比止乃多米爾 &lt;/ab&gt;
      &lt;ab type=&quot;transliteration&quot; xml:lang=&quot;ojp-Latn&quot;&gt;
         &lt;s&gt;
            &lt;phr&gt;
               &lt;phr&gt;
                  &lt;cl&gt;
                     &lt;phr type=&quot;arg&quot;&gt;
                        &lt;w&gt;
                           &lt;m type=&quot;prefix&quot;&gt;
                              &lt;c type=&quot;phon&quot;&gt;mi&lt;/c&gt;
                           &lt;/m&gt;
                           &lt;w&gt;
                              &lt;c type=&quot;phon&quot;&gt;ato&lt;/c&gt;
                           &lt;/w&gt;
                        &lt;/w&gt;
                     &lt;/phr&gt;
                     &lt;w type=&quot;verb&quot; function=&quot;adnconc&quot; ana=&quot;#L031144&quot;&gt;
                        &lt;c type=&quot;phon&quot;&gt;tukuru&lt;/c&gt;
                     &lt;/w&gt;
                  &lt;/cl&gt;
                  &lt;w type=&quot;verb&quot; function=&quot;adnconc&quot; ana=&quot;#L031144&quot;&gt;
                     &lt;c type=&quot;phon&quot;&gt;tukuru&lt;/c&gt;
                  &lt;/w&gt;
                  &lt;lb xml:id=&quot;BS.1-trans_1&quot; corresp=&quot;#BS.1-orig_1&quot;/&gt;
                  &lt;w&gt;
                     &lt;c type=&quot;phon&quot;&gt;isi&lt;/c&gt;
                  &lt;/w&gt;
                  &lt;w type=&quot;particle&quot; subtype=&quot;case&quot; function=&quot;gen&quot; ana=&quot;#L000520&quot;&gt;
                     &lt;c type=&quot;phon&quot;&gt;no&lt;/c&gt;
                  &lt;/w&gt;
               &lt;/phr&gt;
               &lt;w&gt;
                  &lt;c type=&quot;phon&quot;&gt;pibiki&lt;/c&gt;
               &lt;/w&gt;
               &lt;w type=&quot;particle&quot; subtype=&quot;top&quot; ana=&quot;#L000522&quot;&gt;
                  &lt;c type=&quot;phon&quot;&gt;pa&lt;/c&gt;
               &lt;/w&gt;
            &lt;/phr&gt;
            &lt;lb xml:id=&quot;BS.1-trans_2&quot; corresp=&quot;#BS.1-orig_2&quot;/&gt;
            &lt;cl&gt;
               &lt;phr&gt;
                  &lt;w&gt;
                     &lt;c type=&quot;phon&quot;&gt;ame&lt;/c&gt;
                  &lt;/w&gt;
                  &lt;w type=&quot;particle&quot; subtype=&quot;case&quot; function=&quot;dat&quot; ana=&quot;#L000519&quot;&gt;
                     &lt;c type=&quot;phon&quot;&gt;ni&lt;/c&gt;
                  &lt;/w&gt;
               &lt;/phr&gt;
               &lt;w type=&quot;verb&quot; function=&quot;infinitive&quot; ana=&quot;#L030170&quot;&gt;
                  &lt;c type=&quot;phon&quot;&gt;itari&lt;/c&gt;
               &lt;/w&gt;
            &lt;/cl&gt;
            &lt;lb xml:id=&quot;BS.1-trans_3&quot; corresp=&quot;#BS.1-orig_3&quot;/&gt;
&lt;!-- etc --&gt;
         &lt;/s&gt;
      &lt;/ab&gt;
   &lt;/div&gt;
&lt;/body&gt;
&lt;/text&gt;
&lt;/TEI&gt;
</pre>
<p>What they wanted as output was a table-layout (icky) that aligned two nested tables of the original and the transliteration like:</p>
<pre class="brush: xml;">
&lt;table&gt;
   &lt;tr&gt;
      &lt;td&gt;
         &lt;table&gt;
            &lt;tr&gt;
               &lt;td&gt;&lt;span class=&quot;origLine&quot;&gt;美阿止都久留&lt;/span&gt;&lt;/td&gt;
            &lt;/tr&gt;
            &lt;tr&gt;
               &lt;td&gt;&lt;/td&gt;
            &lt;/tr&gt;
            &lt;tr&gt;
               &lt;td&gt;&lt;span class=&quot;origLine&quot;&gt;伊志乃比鼻伎波&lt;/span&gt;&lt;/td&gt;
            &lt;/tr&gt;
            &lt;tr&gt;
               &lt;td&gt;&lt;/td&gt;
            &lt;/tr&gt;
            &lt;tr&gt;
               &lt;td&gt;&lt;span class=&quot;origLine&quot;&gt;阿米爾伊多利&lt;/span&gt;&lt;/td&gt;
            &lt;/tr&gt;
            &lt;tr&gt;
               &lt;td&gt;&lt;/td&gt;
            &lt;/tr&gt;
            &lt;tr&gt;
               &lt;td&gt;&lt;span class=&quot;origLine&quot;&gt;都知佐閇由須礼&lt;/span&gt;&lt;/td&gt;
            &lt;/tr&gt;
            &lt;tr&gt;
               &lt;td&gt;&lt;/td&gt;
            &lt;/tr&gt;
            &lt;tr&gt;
               &lt;td&gt;&lt;span class=&quot;origLine&quot;&gt;知知波波賀多米爾&lt;/span&gt;&lt;/td&gt;
            &lt;/tr&gt;
            &lt;tr&gt;
               &lt;td&gt;&lt;/td&gt;
            &lt;/tr&gt;
            &lt;tr&gt;
               &lt;td&gt;&lt;span class=&quot;origLine&quot;&gt;毛呂比止乃多米爾&lt;/span&gt;&lt;/td&gt;
            &lt;/tr&gt;
         &lt;/table&gt;
      &lt;/td&gt;
      &lt;td&gt;
         &lt;table&gt;
            &lt;tr&gt;
               &lt;td&gt;&lt;span class=&quot;w&quot;&gt;miato&lt;/span&gt;
                  &lt;span class=&quot;w&quot;&gt;tukuru&lt;/span&gt;
                  &lt;span class=&quot;w&quot;&gt;tukuru&lt;/span&gt;
               &lt;/td&gt;
            &lt;/tr&gt;
            &lt;tr&gt;
               &lt;td&gt;&lt;/td&gt;
            &lt;/tr&gt;
            &lt;tr&gt;
               &lt;td&gt;&lt;span class=&quot;w&quot;&gt;isi&lt;/span&gt;
                  &lt;span class=&quot;w&quot;&gt;no&lt;/span&gt;
                  &lt;span class=&quot;w&quot;&gt;pibiki&lt;/span&gt;
                  &lt;span class=&quot;w&quot;&gt;pa&lt;/span&gt;
               &lt;/td&gt;
            &lt;/tr&gt;
            &lt;tr&gt;
               &lt;td&gt;&lt;/td&gt;
            &lt;/tr&gt;
            &lt;tr&gt;
               &lt;td&gt;&lt;span class=&quot;w&quot;&gt;ame&lt;/span&gt;
                  &lt;span class=&quot;w&quot;&gt;ni&lt;/span&gt;
                  &lt;span class=&quot;w&quot;&gt;itari&lt;/span&gt;
               &lt;/td&gt;
            &lt;/tr&gt;
            &lt;tr&gt;
               &lt;td&gt;&lt;/td&gt;
            &lt;/tr&gt;
            &lt;tr&gt;
               &lt;td&gt;&lt;span class=&quot;w&quot;&gt;tuti&lt;/span&gt;
                  &lt;span class=&quot;w&quot;&gt;sape&lt;/span&gt;
                  &lt;span class=&quot;w&quot;&gt;yusure&lt;/span&gt;
               &lt;/td&gt;
            &lt;/tr&gt;
            &lt;tr&gt;
               &lt;td&gt;&lt;/td&gt;
            &lt;/tr&gt;
            &lt;tr&gt;
               &lt;td&gt;&lt;span class=&quot;w&quot;&gt;titipapa&lt;/span&gt;
                  &lt;span class=&quot;w&quot;&gt;ga&lt;/span&gt;
                  &lt;span class=&quot;w&quot;&gt;tame&lt;/span&gt;
                  &lt;span class=&quot;w&quot;&gt;ni&lt;/span&gt;
               &lt;/td&gt;
            &lt;/tr&gt;
            &lt;tr&gt;
               &lt;td&gt;&lt;/td&gt;
            &lt;/tr&gt;
            &lt;tr&gt;
               &lt;td&gt;&lt;span class=&quot;w&quot;&gt;moropito&lt;/span&gt;
                  &lt;span class=&quot;w&quot;&gt;no&lt;/span&gt;
                  &lt;span class=&quot;w&quot;&gt;tame&lt;/span&gt;
                  &lt;span class=&quot;w&quot;&gt;ni&lt;/span&gt;
               &lt;/td&gt;
            &lt;/tr&gt;
         &lt;/table&gt;
      &lt;/td&gt;
      &lt;td&gt;BS.1&lt;/td&gt;
   &lt;/tr&gt;
&lt;/table&gt;
</pre>
<p>If we ignore the icky aspect of using tables for layout and alignment purposes, the solution still has something interesting to teach. This is, at heart, a grouping problem.  The solution I came up with was:</p>
<pre class="brush: xml;">
&lt;?xml version='1.0'?&gt;
&lt;xsl:stylesheet xmlns:xsl=&quot;http://www.w3.org/1999/XSL/Transform&quot; xpath-default-namespace=&quot;http://www.tei-c.org/ns/1.0&quot; version=&quot;2.0&quot;&gt;
    &lt;xsl:template match=&quot;TEI&quot;&gt;
        &lt;html&gt;
            &lt;head&gt;
                &lt;title&gt;test corpus&lt;/title&gt;
            &lt;/head&gt;
            &lt;body&gt;
                &lt;xsl:apply-templates/&gt;
            &lt;/body&gt;
        &lt;/html&gt;
    &lt;/xsl:template&gt;

    &lt;!-- You can put things you want to do nothing to all in one template --&gt;
    &lt;xsl:template match=&quot;teiHeader | note | entry | list&quot;/&gt;

    &lt;!-- Or similarly things you want to just have the tags vanish from.  w is here and elsewhere, hence priority. --&gt;
    &lt;xsl:template match=&quot; choice | m |w | s |phr|cl &quot; priority=&quot;-1&quot;&gt;&lt;xsl:apply-templates/&gt;&lt;/xsl:template&gt;

    &lt;!-- If you are using tables for layout purposes (icky) then you don't need to change lb's to BRs. --&gt;
    &lt;!--
    &lt;xsl:template match=&quot;lb&quot;&gt;
        &lt;br/&gt;
     &lt;/xsl:template&gt;--&gt;

    &lt;xsl:template match=&quot;body&quot;&gt;
        &lt;table&gt;
            &lt;tr&gt;
                &lt;td&gt;
                    &lt;!-- Nesting an icky table for each cell, so you can get one tr per line. --&gt;
                   &lt;table&gt;
                       &lt;xsl:apply-templates select=&quot;descendant::ab[@type='original']&quot;/&gt;
                   &lt;/table&gt; &lt;/td&gt;
                &lt;td&gt;
                    &lt;!-- Nesting an icky table for each cell, so you can get one tr per line. --&gt;
                    &lt;table&gt;&lt;xsl:apply-templates select=&quot;descendant::ab[@type='transliteration']&quot;/&gt;&lt;/table&gt;
                 &lt;/td&gt;
                &lt;td&gt;
                    &lt;xsl:value-of select=&quot;@xml:id&quot;/&gt;
                &lt;/td&gt;
            &lt;/tr&gt;
        &lt;/table&gt;
    &lt;/xsl:template&gt;

    &lt;!-- Not really necessary but in case you wanted to be able to do something with the original lines, wrap an element around them. --&gt;
    &lt;xsl:template match=&quot;ab[@type='original']//text()&quot;&gt;&lt;span class=&quot;origLine&quot;&gt;&lt;xsl:value-of select=&quot;normalize-space(.)&quot;/&gt;&lt;/span&gt;&lt;/xsl:template&gt;

    &lt;!-- For original things group by any child nodes or text, and create the groups adjacent to whether there is a linebreak or not. --&gt;
    &lt;xsl:template match=&quot;ab[@type='original']&quot;&gt;
        &lt;xsl:for-each-group select=&quot;child::node()| child::text()&quot;  group-adjacent=&quot;boolean(self::lb)&quot;&gt;
        &lt;tr&gt;&lt;td&gt;&lt;xsl:apply-templates select=&quot;current-group()&quot;/&gt;&lt;/td&gt;&lt;/tr&gt;
        &lt;/xsl:for-each-group&gt;
        &lt;/xsl:template&gt;

    &lt;!-- For transliterations first flatten hierarchy (you could do this a variety of ways), by copying just the top w elements and linebreaks, and for each of these group adjacent to the line breaks. --&gt;
    &lt;xsl:template match=&quot;ab[@type='transliteration']&quot;&gt;
        &lt;xsl:variable name=&quot;test&quot;&gt;&lt;xsl:copy-of select=&quot;.//w[not(ancestor::w)] | .//lb&quot;/&gt;&lt;/xsl:variable&gt;
        &lt;xsl:for-each-group select=&quot;$test/*&quot; group-adjacent=&quot;boolean(self::lb)&quot;&gt;
                    &lt;tr&gt;
                        &lt;td&gt;&lt;xsl:apply-templates select=&quot;current-group()&quot;/&gt;&lt;/td&gt;
                    &lt;/tr&gt;

        &lt;/xsl:for-each-group&gt;
    &lt;/xsl:template&gt;

    &lt;!-- Since we have w's nested inside w's, when we have one of the top ones wrap an element around it, and then take the value, stripping out any spaces (there are other ways to do this as well). --&gt;
    &lt;xsl:template match=&quot;w[not(ancestor::w)]&quot;&gt;&lt;span class=&quot;w&quot;&gt;&lt;xsl:value-of select=&quot;translate(normalize-space(.), ' ', '')&quot;/&gt;&lt;/span&gt;&lt;xsl:text&gt; &lt;/xsl:text&gt;&lt;/xsl:template&gt;
&lt;/xsl:stylesheet&gt;
</pre>
<p>Most of this is pretty straightforward, and I&#8217;ve included comments in the XSLT to help anyone wondering why I&#8217;m doing something. But if we look at just one bit of it: </p>
<pre class="brush: xml;">
  &lt;!-- For original things group by any child nodes or text, and create the groups adjacent to whether there is a linebreak or not. --&gt;
    &lt;xsl:template match=&quot;ab[@type='original']&quot;&gt;
        &lt;xsl:for-each-group select=&quot;child::node()| child::text()&quot;  group-adjacent=&quot;boolean(self::lb)&quot;&gt;
        &lt;tr&gt;&lt;td&gt;&lt;xsl:apply-templates select=&quot;current-group()&quot;/&gt;&lt;/td&gt;&lt;/tr&gt;
        &lt;/xsl:for-each-group&gt;
     &lt;/xsl:template&gt;
 </pre>
<p>The reason this is interesting is the use of @group-adjacent=&quot;boolean(self::lb)&quot;.  I&#8217;m using the truth or falseness of whether the current node is a line-break element as a test to group the adjacent nodes.  In XSLT2 there are basically two types of grouping conditions, patterns and expressions. @group-starting-with and @group-ending-with require their values to be a pattern, but @group-by and @group-adjacent accept any XPath expression. This means with those two you can have a bit more fun!  In these the condition is applied to each item in the population you are grouping in order to calculate grouping keys.  In those accepting patterns, the condition must match specific nodes in this population that will either lead or terminate a newly-created group. This is an important distinction to keep in mind, and it means that with group-adjacent you can use things that calculate the key to be matched rather than being that key.  So in this case we use boolean(self::lb) to test whether the current node is an &lt;lb/&gt; or not.  Each run of adjacent nodes with the same key (true for line-breaks, false for everything else) becomes one group, so the text of each line ends up in a group of its own, separated by groups containing only the line-break. </p>
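<p>To make the expression-versus-pattern distinction a bit more concrete, here is a minimal sketch of two variations. Neither is part of the stylesheet above, both assume the same xpath-default-namespace, and the mode name by-pattern is made up purely so the two templates could sit side by side. The first keeps the calculated boolean key but tests it with current-grouping-key() to skip the line-break groups; the second should give much the same rows using a pattern with @group-starting-with instead.</p>
<pre class="brush: xml;">
&lt;!-- Variation 1: same calculated key, but only output the groups whose key is false (i.e. not line-breaks) --&gt;
&lt;xsl:template match=&quot;ab[@type='original']&quot;&gt;
    &lt;xsl:for-each-group select=&quot;node()&quot; group-adjacent=&quot;boolean(self::lb)&quot;&gt;
        &lt;xsl:if test=&quot;not(current-grouping-key())&quot;&gt;
            &lt;tr&gt;&lt;td&gt;&lt;xsl:apply-templates select=&quot;current-group()&quot;/&gt;&lt;/td&gt;&lt;/tr&gt;
        &lt;/xsl:if&gt;
    &lt;/xsl:for-each-group&gt;
&lt;/xsl:template&gt;

&lt;!-- Variation 2 (hypothetical mode name): the pattern says which nodes start a new group, so each lb leads the line that follows it --&gt;
&lt;xsl:template match=&quot;ab[@type='original']&quot; mode=&quot;by-pattern&quot;&gt;
    &lt;xsl:for-each-group select=&quot;node()&quot; group-starting-with=&quot;lb&quot;&gt;
        &lt;tr&gt;&lt;td&gt;&lt;xsl:apply-templates select=&quot;current-group()[not(self::lb)]&quot;/&gt;&lt;/td&gt;&lt;/tr&gt;
    &lt;/xsl:for-each-group&gt;
&lt;/xsl:template&gt;
</pre>
<p>Neither variation produces the empty spacer rows between lines that the requested output has, which is one reason to keep the line-break groups as the original template does.</p>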
]]></content:encoded>
			<wfw:commentRss>http://blogs.oucs.ox.ac.uk/jamesc/2011/05/24/grouping_by_group-adjacentbooleanselflb/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Ubuntu Twinview Maximizing Windows problem</title>
		<link>http://blogs.oucs.ox.ac.uk/jamesc/2010/10/21/ubuntu-twinview-maximizing-windows-problem/</link>
		<comments>http://blogs.oucs.ox.ac.uk/jamesc/2010/10/21/ubuntu-twinview-maximizing-windows-problem/#comments</comments>
		<pubDate>Thu, 21 Oct 2010 10:10:29 +0000</pubDate>
		<dc:creator>James Cummings</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://blogs.oucs.ox.ac.uk/jamesc/?p=177</guid>
		<description><![CDATA[This is more of a note-to-self. I had a problem in my recent upgrade to the latest Ubuntu in that my two monitors, when set to &#8216;twinview&#8217; meant that the panels and task bars, and maximized windows spanned both monitors. &#8230; <a href="http://blogs.oucs.ox.ac.uk/jamesc/2010/10/21/ubuntu-twinview-maximizing-windows-problem/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>This is more of a note-to-self.  I had a problem in my recent upgrade to the latest Ubuntu in that my two monitors, when set to &#8216;twinview&#8217; meant that the panels and task bars, and maximized windows spanned both monitors.  What you really want is for these to be able to be moved from one monitor to the other, but when you maximize them they stay maximized in only one monitor.</p>
<p>The solution that I guessed might work, and it turned out did, was to comment out the &#8216;metamodes&#8217; option in the Screen section of my xorg.conf.  I.e.: </p>
<blockquote><p>
<code><br />
Section "Screen"<br />
    Identifier     "Screen0"<br />
    Device         "Device0"<br />
    Monitor        "Monitor0"<br />
    DefaultDepth    24<br />
    Option         "TwinView" "1"<br />
    Option         "TwinViewXineramaInfoOrder" "CRT-0"<br />
    #Option         "metamodes" "CRT-0: 1280x1024 +0+0, CRT-1: 1280x1024 +1280+0"<br />
    SubSection     "Display"<br />
        Depth       24<br />
    EndSubSection<br />
EndSection<br />
</code>
</p></blockquote>
<p>That sorted out the problem as soon as I logged back in again.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.oucs.ox.ac.uk/jamesc/2010/10/21/ubuntu-twinview-maximizing-windows-problem/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Thunderbird Calendar Automatic Export</title>
		<link>http://blogs.oucs.ox.ac.uk/jamesc/2010/10/21/thunderbird-calendar-automatic-export/</link>
		<comments>http://blogs.oucs.ox.ac.uk/jamesc/2010/10/21/thunderbird-calendar-automatic-export/#comments</comments>
		<pubDate>Thu, 21 Oct 2010 10:06:02 +0000</pubDate>
		<dc:creator>James Cummings</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://blogs.oucs.ox.ac.uk/jamesc/?p=174</guid>
		<description><![CDATA[Previously I wrote about thunderbird, davmail, exchange and exporting to google calendar and my system was setup and working fine. Then I upgraded (full-wipe and install) to the latest Ubuntu operating system and I had to set things up again. &#8230; <a href="http://blogs.oucs.ox.ac.uk/jamesc/2010/10/21/thunderbird-calendar-automatic-export/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p><a href="http://blogs.oucs.ox.ac.uk/jamesc/2009/12/19/tb-l-nexus-export-to-gcal/">Previously I wrote about thunderbird, davmail, exchange and exporting to google calendar</a> and my system was setup and working fine.  Then I upgraded (full-wipe and install) to the latest Ubuntu operating system and I had to set things up again. Part of the problem was that the thunderbird <a href="https://addons.mozilla.org/en-US/sunbird/addon/3740/">Automatic Export</a> add-on wouldn&#8217;t work with the new version of thunderbird.  While I know sometimes changes of software mean that the plugin will no longer function, I didn&#8217;t think this might be a problem with Automatic Export&#8230; I mean all it does is take the calendar you&#8217;ve set and export it which hopefully isn&#8217;t too reliant on the way the program itself works.  Hopefully.</p>
<p>It turned out that if I unzipped the thunderbird plugin package available from <a href="https://addons.mozilla.org/en-US/sunbird/addon/3740/">Automatic Export on Mozilla add-ons site</a> then I was able to edit the install.rdf file which tells thunderbird about the package.  When I did I found that it had a em:maxVersion attribute and all I did was change that to be far past the current version. (Note: there were two of these, I changed both since I wasn&#8217;t sure which applied to what.)  Zipping the file back up again and renaming to .xpi was all that was needed for a successful install.</p>
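<p>For anyone who hasn&#8217;t looked inside an install.rdf before, the relevant part looks roughly like the fragment below. This is an illustrative sketch only: the version numbers are made up rather than copied from the add-on, the property is shown as a child element (RDF also allows it to be written as an attribute), and only the Thunderbird application ID is shown. The two occurrences are presumably one per target application.</p>
<pre class="brush: xml;">
&lt;!-- Illustrative fragment: the version values here are invented for the example --&gt;
&lt;em:targetApplication&gt;
   &lt;Description&gt;
      &lt;!-- Thunderbird's application ID --&gt;
      &lt;em:id&gt;{3550f703-e582-4d05-9a08-453d09bdfdc6}&lt;/em:id&gt;
      &lt;em:minVersion&gt;2.0&lt;/em:minVersion&gt;
      &lt;!-- Raise this to something far past the installed version --&gt;
      &lt;em:maxVersion&gt;99.*&lt;/em:maxVersion&gt;
   &lt;/Description&gt;
&lt;/em:targetApplication&gt;
</pre>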
<p>Everything now working again perfectly.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.oucs.ox.ac.uk/jamesc/2010/10/21/thunderbird-calendar-automatic-export/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Teaching in Helsinki</title>
		<link>http://blogs.oucs.ox.ac.uk/jamesc/2010/10/15/teaching-in-helsinki/</link>
		<comments>http://blogs.oucs.ox.ac.uk/jamesc/2010/10/15/teaching-in-helsinki/#comments</comments>
		<pubDate>Fri, 15 Oct 2010 15:11:50 +0000</pubDate>
		<dc:creator>James Cummings</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://blogs.oucs.ox.ac.uk/jamesc/?p=171</guid>
		<description><![CDATA[I was recently invited to Helsinki by Varieng to teach a workshop on TEI XML, and specifically on TEI XML concentrating on transcription. The workshop slides and materials are at http://tei.oucs.ox.ac.uk/Oxford/2010-10-helsinki/. Though these were largely based on the TEI Summer &#8230; <a href="http://blogs.oucs.ox.ac.uk/jamesc/2010/10/15/teaching-in-helsinki/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>I was recently invited to Helsinki by <a href="http://www.helsinki.fi/varieng/">Varieng</a> to teach a workshop on TEI XML, and specifically on TEI XML concentrating on transcription.  The workshop slides and materials are at <a href="http://tei.oucs.ox.ac.uk/Oxford/2010-10-helsinki/">http://tei.oucs.ox.ac.uk/Oxford/2010-10-helsinki/</a>. Though these were largely based on the <a href="http://tei.oucs.ox.ac.uk/Oxford/2010-07-oxford/">TEI Summer School 2010</a> that we taught earlier in the year.  We may hopefully be partnering with Varieng to convert the <a href="http://ota.oucs.ox.ac.uk/headers/1477.xml">Helsinki Corpus</a> to <a href="http://www.tei-c.org/">TEI P5 XML</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.oucs.ox.ac.uk/jamesc/2010/10/15/teaching-in-helsinki/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>simple dynamic transformation of xml with htaccess, php, and xslt</title>
		<link>http://blogs.oucs.ox.ac.uk/jamesc/2010/08/19/simple-dynamic-transformation-of-xml-with-htaccess-php-and-xslt/</link>
		<comments>http://blogs.oucs.ox.ac.uk/jamesc/2010/08/19/simple-dynamic-transformation-of-xml-with-htaccess-php-and-xslt/#comments</comments>
		<pubDate>Thu, 19 Aug 2010 17:30:14 +0000</pubDate>
		<dc:creator>James Cummings</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://blogs.oucs.ox.ac.uk/jamesc/?p=154</guid>
		<description><![CDATA[I often transform from TEI XML to XHTML as part of projects, but in some instances it is more difficult to manage using things like the eXist XML Database or Apache Cocoon, or even AxKit. This is because the hosting &#8230; <a href="http://blogs.oucs.ox.ac.uk/jamesc/2010/08/19/simple-dynamic-transformation-of-xml-with-htaccess-php-and-xslt/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>I often transform from TEI XML to XHTML as part of projects, but in some instances it is more difficult to manage using things like the eXist XML Database or Apache Cocoon, or even AxKit.  This is because the hosting arrangement means that only a limited number of technologies are available.</p>
<p>In most cases these days a Linux-based server will have Apache&#8217;s HTTP server installed, and hopefully its rewrite module (mod_rewrite) as well.  In addition most hosting, even shared hosting, has PHP installed with the libxslt-based XSL extension for XSLT processing.  Sadly, this only copes with XSLT1 not XSLT2.</p>
<p>One way to make use of this is to have one&#8217;s .htaccess file rewrite incoming URLs so that they run an xml2html.php conversion.  </p>
<p>Basic preceding stuff:</p>
<pre class="brush: plain;">
#Turn on Rewriting
RewriteEngine On
RewriteBase /
# Redirect any svn requests
RewriteRule ^\.svn/(.*)$ http://subversion.tigris.org [R]
# utf-8 please
AddDefaultCharset UTF-8
# change directory index to index.xml as default
DirectoryIndex index.xml index.php index.html index.shtml
#ErrorDocuments
ErrorDocument 404 /unavailable.html
ErrorDocument 403 /forbidden.html
</pre>
<p>Here we start by turning the RewriteEngine on and setting the RewriteBase to the root of the domain. I&#8217;ve also got a RewriteRule that takes any requests for stuff in subversion directories and redirects them to the subversion site instead.  (Though actually I&#8217;m thinking of having that just 404 or 403 instead.) After that we set the default character set to UTF-8, change the default directory index file names, and specify some error documents for 404s and 403s. (These are of course actually unavailable.xml and forbidden.xml, and are transformed by the rule further down.)</p>
<p>After this comes the bit where requests for HTML files get rewritten into parameters on a PHP script:</p>
<pre class="brush: plain;">
# If I ask for .xhtml then give me xml2html
RewriteRule ^(.*)\.xhtml$ /scripts/xml2html.php?xml=../$1.xml&amp;xsl=site.xsl&amp;%{QUERY_STRING} [L]
# If I have asked for .html then if the .html file exists, then give it.
RewriteRule   ^(.*)\.html$              $1      [C,E=WasHTML:yes]
RewriteCond   %{REQUEST_FILENAME}.html -f
RewriteRule   ^(.*)$ $1.html [L]
# else provide XML dynamically with xml2html.php
RewriteCond   %{ENV:WasHTML}            ^yes$
RewriteCond   %{REQUEST_FILENAME}.xml -f
RewriteRule ^(.*)$ /scripts/xml2html.php?xml=../$1.xml&amp;xsl=site.xsl&amp;%{QUERY_STRING} [L]
</pre>
<p>The first of these says that when I ask for any URL on the site ending in .xhtml, it should take the XML file of the same name and transform it using the xml2html.php script and the site.xsl stylesheet, both in the /scripts directory. This is just for me, so that I can force it to run the transformation when both a foo.xml and a foo.html exist in the same directory.  </p>
<p>After this the next RewriteRule matches any request on the site that ends in .html and captures the first bit of it (the path and filename). At the same time it uses &#8216;C&#8217; to chain this with the next rule and &#8216;E&#8217; to set an environment variable &#8216;WasHTML&#8217; to &#8216;yes&#8217;. Then there is a RewriteCond testing whether a file of this name with a .html extension exists. If so, it rewrites the request to that filename.html and ends.  If not, it tests whether the environment variable WasHTML is set to yes (because remember we&#8217;ve taken off the extension), and whether a file of the name we&#8217;ve asked for but ending in .xml exists.  If so, then it runs the script, giving the filename with .xml as the xml parameter and in this case site.xsl (in the same scripts directory) as the xsl.</p>
<p>That .htaccess file as a whole looks like:</p>
<pre class="brush: plain;">
#Turn on Rewriting
RewriteEngine On
RewriteBase /
# Redirect any svn requests
RewriteRule ^\.svn/(.*)$ http://subversion.tigris.org [R]
# utf-8 please
AddDefaultCharset UTF-8
# change directory index to index.xml as default
DirectoryIndex index.xml index.php index.html index.shtml
#ErrorDocuments
ErrorDocument 404 /unavailable.html
ErrorDocument 403 /forbidden.html
# If I ask for .xhtml then give me xml2html
RewriteRule ^(.*)\.xhtml$ /scripts/xml2html.php?xml=../$1.xml&amp;xsl=site.xsl&amp;%{QUERY_STRING} [L]
# If I have asked for .html then if the .html file exists, then give it.
RewriteRule   ^(.*)\.html$              $1      [C,E=WasHTML:yes]
RewriteCond   %{REQUEST_FILENAME}.html -f
RewriteRule   ^(.*)$ $1.html [L]
# else provide XML dynamically with xml2html.php
RewriteCond   %{ENV:WasHTML}            ^yes$
RewriteCond   %{REQUEST_FILENAME}.xml -f
RewriteRule ^(.*)$ /scripts/xml2html.php?xml=../$1.xml&amp;xsl=site.xsl&amp;%{QUERY_STRING} [L]
</pre>
<p>The PHP script this is using (which I borrowed from a colleague) uses PHP&#8217;s libxslt-based <a href="http://www.php.net/manual/en/book.xsl.php">XSL extension</a> for the XSLT processing.  It is fairly short and consists of:</p>
<pre class="brush: plain;">
&lt;?php
#Basic check for directory/site traversal
if(preg_match('/\.\.\/\.\./',$_REQUEST['xml'])) { die(&quot;invalid input&quot;); }
if(preg_match('/http/',$_REQUEST['xml'])) { die(&quot;invalid input&quot;); }
if(preg_match('/http/',$_REQUEST['xsl'])) { die(&quot;invalid input&quot;); }
if(preg_match('/\.\.\//',$_REQUEST['xsl'])) { die(&quot;invalid input&quot;); }
#load xsl document into XSLTProcessor
  $xp = new XSLTProcessor();
  $xsl = new DOMDocument();
  $xsl-&gt;load($_REQUEST['xsl']);
  $xp-&gt;importStylesheet($xsl);
#pass the requested filename to the stylesheet as the 'xml' parameter (no namespace)
  $xp-&gt;setParameter('', 'xml', $_REQUEST['xml']);
#load xml document
  $xml_doc = new DOMDocument();
  $xml_doc-&gt;load($_REQUEST['xml']);
#Process any xincludes
  $xml_doc-&gt;xinclude();
#Transform the XML with the XSL or put out error
  if ($html = $xp-&gt;transformToXML($xml_doc)) {
      echo $html;
  } else {
      trigger_error('XSL transformation failed.', E_USER_ERROR);
  }
?&gt;
</pre>
<p>The first bit of this is just a security precaution against directory (or site) traversal, which rejects anything that has &#8216;../..&#8217; or &#8216;http&#8217; in the xml parameter, and anything that has &#8216;../&#8217; or &#8216;http&#8217; in the xsl parameter.  I&#8217;m sure there are a lot better ways to do this, but just checking the xml and xsl parameters seemed the easiest. I could have made a function and then passed each of them to it, or had the regex look for either of these two things, but I think it all works out the same and doesn&#8217;t seem to have much of a speed implication. Then we start a new XsltProcessor() and a new xsl DomDocument, load in the xsl file given in the xsl parameter, and also pass the processor the parameter &#8216;xml&#8217; so that we can use this in our XSLT if we want.  Then we start a new xml_doc DomDocument, load in the requested XML file, and do any XIncludes in that XML file.  We then transform the XML doc to HTML with transformToXML, or otherwise trigger an error and put that out.</p>
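<p>On the stylesheet side, picking up that &#8216;xml&#8217; parameter only needs a top-level xsl:param of the same name. The following is a minimal XSLT1 sketch of how a site.xsl might do it, not the stylesheet actually in use: the page title, the class name and the footer paragraph are all invented for the illustration.</p>
<pre class="brush: xml;">
&lt;?xml version='1.0'?&gt;
&lt;xsl:stylesheet xmlns:xsl=&quot;http://www.w3.org/1999/XSL/Transform&quot; version=&quot;1.0&quot;&gt;
    &lt;!-- Set from PHP with setParameter(); falls back to an empty string if nothing is passed --&gt;
    &lt;xsl:param name=&quot;xml&quot; select=&quot;''&quot;/&gt;

    &lt;xsl:template match=&quot;/&quot;&gt;
        &lt;html&gt;
            &lt;head&gt;
                &lt;title&gt;Example page&lt;/title&gt;
            &lt;/head&gt;
            &lt;body&gt;
                &lt;xsl:apply-templates/&gt;
                &lt;!-- Hypothetical use of the parameter: show which file was transformed --&gt;
                &lt;p class=&quot;source&quot;&gt;Generated from &lt;xsl:value-of select=&quot;$xml&quot;/&gt;&lt;/p&gt;
            &lt;/body&gt;
        &lt;/html&gt;
    &lt;/xsl:template&gt;
&lt;/xsl:stylesheet&gt;
</pre>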
<p>This is a fairly lightweight way to transform XML to HTML on the fly using the technologies (PHP and .htaccess) that most hosting solutions provide. I&#8217;m using something like this on one of my personal sites and it is in use in a slightly different form in a number of work sites.</p>
<p>Hope it is useful to someone.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.oucs.ox.ac.uk/jamesc/2010/08/19/simple-dynamic-transformation-of-xml-with-htaccess-php-and-xslt/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

