<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Data.Random &#187; blackh</title>
	<atom:link href="http://random.axman6.com/blog/?feed=rss2&#038;author=3" rel="self" type="application/rss+xml" />
	<link>http://random.axman6.com/blog</link>
	<description>Random mutterings about life and Haskell.</description>
	<lastBuildDate>Thu, 22 Jul 2010 11:44:18 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Co-routines in Haskell</title>
		<link>http://random.axman6.com/blog/?p=231</link>
		<comments>http://random.axman6.com/blog/?p=231#comments</comments>
		<pubDate>Wed, 21 Jul 2010 13:49:36 +0000</pubDate>
		<dc:creator>blackh</dc:creator>
				<category><![CDATA[Haskell]]></category>
		<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://random.axman6.com/blog/?p=231</guid>
		<description><![CDATA[It is easy to implement co-routines in Haskell&#8230; but only if you know how.  No fewer than three people asked me to blog about it, so here&#8217;s a quick guide to rolling your own co-routines.  To understand this blog, you will need to have a basic understanding of monad transformers.

There are co-routine packages [...]]]></description>
			<content:encoded><![CDATA[<p>It is easy to implement co-routines in Haskell&#8230; but only if you know how.  No fewer than three people asked me to blog about it, so here&#8217;s a quick guide to rolling your own co-routines.  To understand this blog, you will need to have a basic understanding of monad transformers.</p>

<p>There are co-routine packages on Hackage, but I have not had much luck with them. The point here, really, is to show you how it all works.</p>

<p><strong>What&#8217;s a co-routine?</strong></p>

<p>A co-routine (called a &#8216;generator&#8217; in Python) is where you create two interleaved flows of control on a single thread.  Unlike threads, co-routines switch co-operatively using a &#8216;yield&#8217; operation.  (This is quite a good trick in GUI programming for implementing complex workflow that spans multiple GUI events, since most GUI libraries require everything to be on one thread.)</p>

<p>The example I&#8217;m presenting here works in this way:  A caller executes a CoroutineT monad transformer, which adds the &#8216;yield&#8217; operation to the underlying monad (which can be anything).  From the caller&#8217;s point of view, the &#8216;yield&#8217; looks like the monad has returned, but with a continuation.  In the callee, &#8216;yield&#8217; appears to block until the caller executes the continuation.  In addition, we add the ability to pass a value in both directions.</p>

<p>So we&#8217;ve inverted the flow of control in the callee.  Continuation passing style (CPS) can also do this, but co-routines are better than CPS because 1. it&#8217;s a bit neater, and 2. it allows for recursion.</p>

<p>One application of co-routines is to separate I/O from logic.  By way of example I am going to implement an <em>expert system for identifying fruit</em>.  I try not to use contrived examples, and as you can see, this time I have completely failed.</p>

<p>In this example, the CoroutineT sits on top of the identity monad, so it&#8217;s pure, but the approach works just the same on top of IO or anything else.  This example is not deeply nested, but this approach happily supports any level of recursion or nesting.</p>

<p>So here&#8217;s our expert system logic. We&#8217;ll define <b>CoroutineT</b> shortly.  You can read &#8216;yield&#8217; as &#8216;askUser&#8217;:</p>

<p><pre><font color="Green"><u>type</u></font> Question <font color="Red">=</font> String
<font color="Green"><u>data</u></font> Answer <font color="Red">=</font> Y <font color="Red">|</font> N <font color="Green"><u>deriving</u></font> Eq
</pre>
<pre><font color="Green"><u>type</u></font> Expert a <font color="Red">=</font> CoroutineT Answer Question Identity a
</pre>
<pre><font color="Green"><u>data</u></font> Fruit
    <font color="Red">=</font> Apple
    <font color="Red">|</font> Kiwifruit
    <font color="Red">|</font> Banana
    <font color="Red">|</font> Orange
    <font color="Red">|</font> Lemon
    <font color="Green"><u>deriving</u></font> Show
</pre>
<pre><font color="Blue">identifyFruit</font> <font color="Red">::</font> Expert Fruit
<font color="Blue">identifyFruit</font> <font color="Red">=</font> <font color="Green"><u>do</u></font>
    yellow <font color="Red">&lt;-</font> yield <font color="Magenta">"Is it yellow?"</font>
    <font color="Green"><u>if</u></font> yellow <font color="Cyan">==</font> Y <font color="Green"><u>then</u></font> <font color="Green"><u>do</u></font>
        long <font color="Red">&lt;-</font> yield <font color="Magenta">"Is it long?"</font>
        <font color="Green"><u>if</u></font> long <font color="Cyan">==</font> Y <font color="Green"><u>then</u></font>
            return Banana
          <font color="Green"><u>else</u></font>
            return Lemon
      <font color="Green"><u>else</u></font> <font color="Green"><u>do</u></font>
        orange <font color="Red">&lt;-</font> yield <font color="Magenta">"Is it orange?"</font>
        <font color="Green"><u>if</u></font> orange <font color="Cyan">==</font> Y <font color="Green"><u>then</u></font>
           return Orange
         <font color="Green"><u>else</u></font> <font color="Green"><u>do</u></font>
           fuzzy <font color="Red">&lt;-</font> yield <font color="Magenta">"Is it fuzzy?"</font>
           <font color="Green"><u>if</u></font> fuzzy <font color="Cyan">==</font> Y <font color="Green"><u>then</u></font>
               return Kiwifruit
             <font color="Green"><u>else</u></font>
               return Apple
</pre></p>

<p>Our &#8216;Expert&#8217; type above&#8230;</p>

<p><pre><font color="Green"><u>type</u></font> Expert a <font color="Red">=</font> CoroutineT Answer Question Identity a
</pre></p>

<p>&#8230;specifies the type we are sending into our co-routine (Answer) and the type we are getting out of it (Question) as viewed from the caller.</p>

<p>Now we just need a main program to drive it.  Because the I/O is separated out, we can later replace this with a nice touch-screen GUI for the seriously fruit-impaired.</p>

<p><pre><font color="Blue">main</font> <font color="Red">::</font> IO ()
<font color="Blue">main</font> <font color="Red">=</font> <font color="Green"><u>do</u></font>
    putStrLn <font color="Cyan">$</font> <font color="Magenta">"Expert system for identifying fruit"</font>
    run identifyFruit
  <font color="Green"><u>where</u></font>
    run <font color="Red">::</font> Expert Fruit <font color="Red">-&gt;</font> IO ()
    run exp <font color="Red">=</font> handle <font color="Cyan">$</font> runIdentity <font color="Cyan">$</font> runCoroutineT exp
</pre>
<pre>    handle <font color="Cyan">(</font>Yield q cont<font color="Cyan">)</font> <font color="Red">=</font> <font color="Green"><u>do</u></font>
        putStrLn q
        l <font color="Red">&lt;-</font> getLine
        <font color="Green"><u>case</u></font> map toLower l <font color="Green"><u>of</u></font>
            <font color="Magenta">"y"</font>   <font color="Red">-&gt;</font> run <font color="Cyan">$</font> cont Y
            <font color="Magenta">"yes"</font> <font color="Red">-&gt;</font> run <font color="Cyan">$</font> cont Y
            <font color="Magenta">"n"</font>   <font color="Red">-&gt;</font> run <font color="Cyan">$</font> cont N
            <font color="Magenta">"no"</font>  <font color="Red">-&gt;</font> run <font color="Cyan">$</font> cont N
            <font color="Green"><u>_</u></font>   <font color="Red">-&gt;</font> putStrLn <font color="Magenta">"Please answer 'yes' or 'no'"</font> <font color="Cyan">&gt;&gt;</font> handle <font color="Cyan">(</font>Yield q cont<font color="Cyan">)</font>
</pre>
<pre>    handle <font color="Cyan">(</font>Result fruit<font color="Cyan">)</font> <font color="Red">=</font> <font color="Green"><u>do</u></font>
        putStrLn <font color="Cyan">$</font> <font color="Magenta">"The fruit you have is: "</font><font color="Cyan">++</font>show fruit
</pre></p>

<p>When we run our co-routine, it returns with one of these two events:</p>

<ul>
    <li><b>Yield</b>, which happens when yield is executed.  It gives us the output value (Question) and the continuation, which when passed the input value (Answer) gives us the same &#8216;Expert&#8217; type we started with.</li>
        <li><b>Result</b>, which happens when the co-routine has finished executing.
</ul>

<p><b>So how does CoroutineT work?</b></p>

<p>We&#8217;ll start with the types:</p>

<p><pre><font color="Green"><u>data</u></font> Result i o m a <font color="Red">=</font> Yield o <font color="Cyan">(</font>i <font color="Red">-&gt;</font> CoroutineT i o m a<font color="Cyan">)</font> <font color="Red">|</font> Result a
</pre>
<pre><font color="Blue"><i>-- | Co-routine monad transformer</i></font>
<font color="Blue"><i>--</i></font>
<font color="Blue"><i>--   * i = input value returned by yield</i></font>
<font color="Blue"><i>--</i></font>
<font color="Blue"><i>--   * o = output value, passed to yield</i></font>
<font color="Blue"><i>--</i></font>
<font color="Blue"><i>--   * m = next monad in stack</i></font>
<font color="Blue"><i>--</i></font>
<font color="Blue"><i>--   * a = monad return value</i></font>
<font color="Green"><u>data</u></font> CoroutineT i o m a <font color="Red">=</font> CoroutineT <font color="Cyan">{</font>
        runCoroutineT <font color="Red">::</font> m <font color="Cyan">(</font>Result i o m a<font color="Cyan">)</font>
    <font color="Cyan">}</font>
</pre></p>

<p>Hopefully that&#8217;s pretty straightforward.  &#8216;yield&#8217; is defined like this:</p>

<p><pre><font color="Blue"><i>-- | Suspend processing, returning a @o@ value and a continuation to the caller</i></font>
<font color="Blue">yield</font> <font color="Red">::</font> Monad m <font color="Red">=&gt;</font> o <font color="Red">-&gt;</font> CoroutineT i o m i
<font color="Blue">yield</font> o <font color="Red">=</font> CoroutineT <font color="Cyan">$</font> return <font color="Cyan">$</font> Yield o <font color="Cyan">(</font><font color="Red">\</font>i <font color="Red">-&gt;</font> CoroutineT <font color="Cyan">$</font> return <font color="Cyan">$</font> Result i<font color="Cyan">)</font>
</pre></p>

<p>The key point here is that the continuation does nothing except return the value, which is what we want it to do when we run a monad that contains only a yield.</p>

<p>Most of the magic is in the definition of &gt;&gt;=, thus:</p>

<p><pre><font color="Green"><u>instance</u></font> Monad m <font color="Red">=&gt;</font> Monad <font color="Cyan">(</font>CoroutineT i o m<font color="Cyan">)</font> <font color="Green"><u>where</u></font>
    return a <font color="Red">=</font> CoroutineT <font color="Cyan">$</font> return <font color="Cyan">$</font> Result a
    f <font color="Cyan">&gt;&gt;=</font> g <font color="Red">=</font> CoroutineT <font color="Cyan">$</font> <font color="Green"><u>do</u></font>
        res1 <font color="Red">&lt;-</font> runCoroutineT f
        <font color="Green"><u>case</u></font> res1 <font color="Green"><u>of</u></font>
            Yield o c <font color="Red">-&gt;</font> return <font color="Cyan">$</font> Yield o <font color="Cyan">(</font><font color="Red">\</font>i <font color="Red">-&gt;</font> c i <font color="Cyan">&gt;&gt;=</font> g<font color="Cyan">)</font>
            Result a  <font color="Red">-&gt;</font> runCoroutineT <font color="Cyan">(</font>g a<font color="Cyan">)</font>
    <font color="Blue"><i>-- Pass fail to next monad in the stack</i></font>
    fail err <font color="Red">=</font> CoroutineT <font color="Cyan">$</font> fail err
</pre></p>

<p>A typical monad would normally execute <b>f</b> then pass its result to <b>g</b> and execute that, and this is in fact exactly what we do in the <b>Result</b> case. Ho hum.</p>

<p>But there&#8217;s no law that says you have to execute <b>g</b>.  This is Haskell so we can do whatever we like.  <b>g</b> is just a plain old closure representing the continuation.</p>

<p>So what we do in the <b>Yield</b> case is take the continuation that executing <b>f</b> gave us, and bind that to the continuation <b>g</b>, then bail out of the monad (in the same way ErrorT does when it gets an error), handing our constructed continuation to the caller.  So we end up with a closure that represents the entire execution state of the monad, and it doesn&#8217;t matter how deeply nested we are.  It just puts the continuation together in the right way as we unravel everything on our way back to the caller.</p>

<p>Here&#8217;s the code in downloadable form:</p>

<ul>
    <li><a href="/blackh/expert.hs">expert.hs</a></li>
        <li><a href="/blackh/Coroutine.hs">Coroutine.hs</a></li>
</ul>

<p>This code is released in the public domain.</p>

<p><i>Stephen Blackheath, Manawatu, New Zealand</i></p>
]]></content:encoded>
			<wfw:commentRss>http://random.axman6.com/blog/?feed=rss2&amp;p=231</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>Chunked XML parsing is the latest thing, you know</title>
		<link>http://random.axman6.com/blog/?p=157</link>
		<comments>http://random.axman6.com/blog/?p=157#comments</comments>
		<pubDate>Fri, 14 May 2010 14:15:43 +0000</pubDate>
		<dc:creator>blackh</dc:creator>
				<category><![CDATA[Haskell]]></category>
		<category><![CDATA[GHC]]></category>
		<category><![CDATA[hackage]]></category>
		<category><![CDATA[parsing]]></category>
		<category><![CDATA[xml]]></category>

		<guid isPermaLink="false">http://random.axman6.com/blog/?p=157</guid>
		<description><![CDATA[Uhh, hello.  Welcome to my first blog post ever &#8211; and thanks Axman6 for letting me be a &#8220;guest blogger&#8221;.

It&#8217;s rather unfashionable on #haskell, but I like XML.  So, 18 months ago, I took over the hexpat package from Evan Martin.  It was going to be a small project &#8211; a simple XML parser binding [...]]]></description>
			<content:encoded><![CDATA[<p>Uhh, hello.  Welcome to my first blog post ever &#8211; and thanks Axman6 for letting me be a &#8220;guest blogger&#8221;.</p>

<p>It&#8217;s rather unfashionable on <strong>#haskell</strong>, but I like XML.  So, 18 months ago, I took over the <a href="http://hackage.haskell.org/package/hexpat"><em>hexpat</em></a> package from Evan Martin.  It was going to be a small project &#8211; a simple XML parser binding to <a href="http://expat.sourceforge.net/">Expat</a>.  The fastest Haskell XML parser alive.  Or so I thought.</p>

<p>It&#8217;s become a passion, a way of life.  It&#8217;s XML parsing in Haskell the way <em>I</em> think it should be done.  The best as well as the fastest.  (I like to think big.)</p>

<p>I&#8217;ve finally finished adding all the features that I and a number of <a href="http://hackage.haskell.org/package/hexpat">contributors</a> wanted, and I would now like to announce that <strong>hexpat is going beta</strong>.  I want to make this package really, really good, so please help by testing and critiquing.  I want to stabilize <em>hexpat</em>, but <em>hexpat-iteratee</em> will be unstable for a while yet.</p>

<p><strong>The future is chunky</strong></p>

<p>The cherry on top of the <em>hexpat</em> galaxy is the still experimental <a href="http://hackage.haskell.org/package/hexpat-iteratee"><em>hexpat-iteratee</em></a> based on Oleg Kiselyov&#8217;s <em>iteratee</em>, which is a bit of a hot ticket these days.  It provides lazy XML parsing without the practical issues and philosophical dodginess inherent in Haskell&#8217;s lazy I/O through functions like <em>hGetContents</em>.</p>

<p><em>hexpat-iteratee</em> allows for effectful XML processing done in a functional way, and the magic behind this is Yair Chuchem&#8217;s humbly named <em>List</em> package.  It is &#8220;merely&#8221; a generalization of lists, and I think it deserves to be a common piece of infrastructure.</p>

<p>The example project is a <strong>chunked XML-over-TCP movie database lookup server</strong>.  Every home should have one.  So, let&#8217;s start like all good blogs do, with imports:
<pre><span class="hs-comment">{-# LANGUAGE OverloadedStrings #-}</span>
<span class="hs-keyword">import</span> <span class="hs-conid">Control</span><span class="hs-varop">.</span><span class="hs-conid">Concurrent</span>
<span class="hs-keyword">import</span> <span class="hs-conid">Control</span><span class="hs-varop">.</span><span class="hs-conid">Exception</span>
<span class="hs-keyword">import</span> <span class="hs-conid">Control</span><span class="hs-varop">.</span><span class="hs-conid">Monad</span>
<span class="hs-keyword">import</span> <span class="hs-conid">Control</span><span class="hs-varop">.</span><span class="hs-conid">Monad</span><span class="hs-varop">.</span><span class="hs-conid">IO</span><span class="hs-varop">.</span><span class="hs-conid">Class</span>
<span class="hs-keyword">import</span> <span class="hs-conid">Control</span><span class="hs-varop">.</span><span class="hs-conid">Monad</span><span class="hs-varop">.</span><span class="hs-conid">ListT</span>
<span class="hs-keyword">import</span> <span class="hs-keyword">qualified</span> <span class="hs-conid">Data</span><span class="hs-varop">.</span><span class="hs-conid">ByteString</span> <span class="hs-keyword">as</span> <span class="hs-conid">B</span>
<span class="hs-keyword">import</span> <span class="hs-keyword">qualified</span> <span class="hs-conid">Data</span><span class="hs-varop">.</span><span class="hs-conid">ByteString</span><span class="hs-varop">.</span><span class="hs-conid">Unsafe</span> <span class="hs-keyword">as</span> <span class="hs-conid">B</span> <span class="hs-layout">(</span><span class="hs-varid">unsafeUseAsCStringLen</span><span class="hs-layout">)</span>
<span class="hs-keyword">import</span> <span class="hs-conid">Data</span><span class="hs-varop">.</span><span class="hs-conid">Iteratee</span>
<span class="hs-keyword">import</span> <span class="hs-conid">Data</span><span class="hs-varop">.</span><span class="hs-conid">Iteratee</span><span class="hs-varop">.</span><span class="hs-conid">IO</span><span class="hs-varop">.</span><span class="hs-conid">Fd</span>
<span class="hs-keyword">import</span> <span class="hs-conid">Data</span><span class="hs-varop">.</span><span class="hs-conid">Iteratee</span><span class="hs-varop">.</span><span class="hs-conid">WrappedByteString</span>
<span class="hs-keyword">import</span> <span class="hs-conid">Data</span><span class="hs-varop">.</span><span class="hs-conid">List</span><span class="hs-varop">.</span><span class="hs-conid">Class</span> <span class="hs-keyword">as</span> <span class="hs-conid">List</span>
<span class="hs-keyword">import</span> <span class="hs-conid">Data</span><span class="hs-varop">.</span><span class="hs-conid">Maybe</span>
<span class="hs-keyword">import</span> <span class="hs-conid">Data</span><span class="hs-varop">.</span><span class="hs-conid">Text</span> <span class="hs-layout">(</span><span class="hs-conid">Text</span><span class="hs-layout">)</span>
<span class="hs-keyword">import</span> <span class="hs-keyword">qualified</span> <span class="hs-conid">Data</span><span class="hs-varop">.</span><span class="hs-conid">Text</span> <span class="hs-keyword">as</span> <span class="hs-conid">T</span>
<span class="hs-keyword">import</span> <span class="hs-conid">Network</span>
<span class="hs-keyword">import</span> <span class="hs-conid">System</span><span class="hs-varop">.</span><span class="hs-conid">IO</span>
<span class="hs-keyword">import</span> <span class="hs-conid">System</span><span class="hs-varop">.</span><span class="hs-conid">Posix</span><span class="hs-varop">.</span><span class="hs-conid">IO</span> <span class="hs-layout">(</span><span class="hs-varid">handleToFd</span><span class="hs-layout">,</span> <span class="hs-varid">fdWriteBuf</span><span class="hs-layout">,</span> <span class="hs-varid">closeFd</span><span class="hs-layout">)</span>
<span class="hs-keyword">import</span> <span class="hs-conid">System</span><span class="hs-varop">.</span><span class="hs-conid">Posix</span><span class="hs-varop">.</span><span class="hs-conid">Types</span> <span class="hs-layout">(</span><span class="hs-conid">Fd</span><span class="hs-layout">)</span>
<span class="hs-keyword">import</span> <span class="hs-conid">Text</span><span class="hs-varop">.</span><span class="hs-conid">XML</span><span class="hs-varop">.</span><span class="hs-conid">Expat</span><span class="hs-varop">.</span><span class="hs-conid">Chunked</span>
<span class="hs-keyword">import</span> <span class="hs-keyword">qualified</span> <span class="hs-conid">Text</span><span class="hs-varop">.</span><span class="hs-conid">XML</span><span class="hs-varop">.</span><span class="hs-conid">Expat</span><span class="hs-varop">.</span><span class="hs-conid">Chunked</span> <span class="hs-keyword">as</span> <span class="hs-conid">Tree</span>
<span class="hs-keyword">import</span> <span class="hs-conid">Text</span><span class="hs-varop">.</span><span class="hs-conid">XML</span><span class="hs-varop">.</span><span class="hs-conid">Expat</span><span class="hs-varop">.</span><span class="hs-conid">Format</span>
<span class="hs-keyword">import</span> <span class="hs-conid">Foreign</span><span class="hs-varop">.</span><span class="hs-conid">Ptr</span></pre>
The first thing we want to do is listen on a socket.  I could use handles, sockets, or file descriptors.  With handles, this  code does not work interactively.  Disabling the buffering does not seem  to work at all in GHC 6.10 or 6.12.  Sockets would be ideal, but to  save me writing an <em>iteratee</em> driver, I&#8217;m left with file  descriptors which unfortunately means this code only works on GHC 6.12  on a POSIX system.  <strong>fdPutStrBS</strong> is the only glue I need then &#8211; it  writes a ByteString to a Fd.   Here&#8217;s the code:
<pre><span class="hs-definition">main</span> <span class="hs-keyglyph">::</span> <span class="hs-conid">IO</span> <span class="hs-conid">()</span>
<span class="hs-definition">main</span> <span class="hs-keyglyph">=</span> <span class="hs-keyword">do</span>
    <span class="hs-keyword">let</span> <span class="hs-varid">port</span> <span class="hs-keyglyph">=</span> <span class="hs-num">6333</span>
    <span class="hs-varid">putStrLn</span> <span class="hs-varop">$</span> <span class="hs-str">"listening on port "</span><span class="hs-varop">++</span><span class="hs-varid">show</span> <span class="hs-varid">port</span>
    <span class="hs-varid">ls</span> <span class="hs-keyglyph">&lt;-</span> <span class="hs-varid">listenOn</span> <span class="hs-varop">$</span> <span class="hs-conid">PortNumber</span> <span class="hs-varid">port</span>
    <span class="hs-varid">forever</span> <span class="hs-varop">$</span> <span class="hs-keyword">do</span>
        <span class="hs-layout">(</span><span class="hs-varid">h</span><span class="hs-layout">,</span> <span class="hs-keyword">_</span><span class="hs-layout">,</span> <span class="hs-keyword">_</span><span class="hs-layout">)</span> <span class="hs-keyglyph">&lt;-</span> <span class="hs-varid">accept</span> <span class="hs-varid">ls</span>
        <span class="hs-varid">forkIO</span> <span class="hs-varop">$</span> <span class="hs-varid">handleToFd</span> <span class="hs-varid">h</span> <span class="hs-varop">&gt;&gt;=</span> <span class="hs-keyglyph">\</span><span class="hs-varid">fd</span> <span class="hs-keyglyph">-&gt;</span> <span class="hs-keyword">do</span>
            <span class="hs-varid">iter</span> <span class="hs-keyglyph">&lt;-</span> <span class="hs-varid">parse</span> <span class="hs-varid">defaultParserOptions</span> <span class="hs-layout">(</span><span class="hs-varid">session</span> <span class="hs-layout">(</span><span class="hs-varid">fdPutStrBS</span> <span class="hs-varid">fd</span><span class="hs-layout">)</span><span class="hs-layout">)</span>
            <span class="hs-varid">result</span> <span class="hs-keyglyph">&lt;-</span> <span class="hs-varid">enumFd</span> <span class="hs-varid">fd</span> <span class="hs-varid">iter</span> <span class="hs-varop">&gt;&gt;=</span> <span class="hs-varid">run</span>
            <span class="hs-varid">print</span> <span class="hs-varid">result</span>
          <span class="hs-varop">&#96;finally&#96;</span>
            <span class="hs-varid">closeFd</span> <span class="hs-varid">fd</span></p>

<p><span class="hs-definition">fdPutStrBS</span> <span class="hs-keyglyph">::</span> <span class="hs-conid">Fd</span> <span class="hs-keyglyph">-&gt;</span> <span class="hs-conid">B</span><span class="hs-varop">.</span><span class="hs-conid">ByteString</span> <span class="hs-keyglyph">-&gt;</span> <span class="hs-conid">IO</span> <span class="hs-conid">()</span>
<span class="hs-definition">fdPutStrBS</span> <span class="hs-varid">fd</span> <span class="hs-varid">bs</span> <span class="hs-keyglyph">=</span> <span class="hs-conid">B</span><span class="hs-varop">.</span><span class="hs-varid">unsafeUseAsCStringLen</span> <span class="hs-varid">bs</span> <span class="hs-varop">$</span> <span class="hs-keyglyph">\</span><span class="hs-layout">(</span><span class="hs-varid">buf</span><span class="hs-layout">,</span> <span class="hs-varid">len</span><span class="hs-layout">)</span> <span class="hs-keyglyph">-&gt;</span>
        <span class="hs-varid">writeFully</span> <span class="hs-layout">(</span><span class="hs-varid">castPtr</span> <span class="hs-varid">buf</span><span class="hs-layout">)</span> <span class="hs-layout">(</span><span class="hs-varid">fromIntegral</span> <span class="hs-varid">len</span><span class="hs-layout">)</span>
  <span class="hs-keyword">where</span>
    <span class="hs-varid">writeFully</span> <span class="hs-keyword">_</span> <span class="hs-varid">len</span> <span class="hs-keyglyph">|</span> <span class="hs-varid">len</span> <span class="hs-varop">==</span> <span class="hs-num">0</span> <span class="hs-keyglyph">=</span> <span class="hs-varid">return</span> <span class="hs-conid">()</span>
    <span class="hs-varid">writeFully</span> <span class="hs-varid">buf</span> <span class="hs-varid">len</span> <span class="hs-keyglyph">=</span> <span class="hs-keyword">do</span>
        <span class="hs-varid">written</span> <span class="hs-keyglyph">&lt;-</span> <span class="hs-varid">fdWriteBuf</span> <span class="hs-varid">fd</span> <span class="hs-varid">buf</span> <span class="hs-varid">len</span>
        <span class="hs-keyword">if</span> <span class="hs-varid">written</span> <span class="hs-varop">&lt;</span> <span class="hs-num">0</span>
            <span class="hs-keyword">then</span> <span class="hs-varid">fail</span> <span class="hs-str">"write failed"</span>
            <span class="hs-keyword">else</span> <span class="hs-varid">writeFully</span> <span class="hs-layout">(</span><span class="hs-varid">buf</span> <span class="hs-varop">&#96;plusPtr&#96;</span> <span class="hs-varid">fromIntegral</span> <span class="hs-varid">written</span><span class="hs-layout">)</span> <span class="hs-layout">(</span><span class="hs-varid">len</span> <span class="hs-comment">-</span> <span class="hs-varid">written</span><span class="hs-layout">)</span></pre>
Once we&#8217;ve accepted the connection, we get <strong>parse</strong> (from <em>hexpat-iteratee</em>) to make us an iteratee.  The second argument, &#8220;session (fdPutStrBS fd)&#8221; is the handler for processing the document.  We then pass this iteratee to <em>iteratee</em>&#8217;s <strong>enumFd</strong>, whose job it is to pull the input data out of the Fd and feed it into the parser. <strong>parse</strong> is monadic in order that it can start the handler before it receives the first data block through the iteratee. This is necessary in case the handler wants to generate output before it gets any input, which we want to do here.</p>

<p>The handler is a co-routine.  When it runs out of input data, it gets suspended, and control returns to <strong>enumFd</strong>.
<pre><span class="hs-definition">session</span> <span class="hs-keyglyph">::</span> <span class="hs-layout">(</span><span class="hs-conid">B</span><span class="hs-varop">.</span><span class="hs-conid">ByteString</span> <span class="hs-keyglyph">-&gt;</span> <span class="hs-conid">IO</span> <span class="hs-conid">()</span><span class="hs-layout">)</span>  <span class="hs-comment">-- ^ Write output data to socket</span>
        <span class="hs-keyglyph">-&gt;</span> <span class="hs-conid">ListOf</span> <span class="hs-layout">(</span><span class="hs-conid">UNode</span> <span class="hs-conid">IO</span> <span class="hs-conid">Text</span><span class="hs-layout">)</span>   <span class="hs-comment">-- ^ Input XML document</span>
        <span class="hs-keyglyph">-&gt;</span> <span class="hs-conid">XMLT</span> <span class="hs-conid">IO</span> <span class="hs-conid">()</span>
<span class="hs-definition">session</span> <span class="hs-varid">writeOut</span> <span class="hs-varid">inputXML</span> <span class="hs-keyglyph">=</span> <span class="hs-keyword">do</span>
    <span class="hs-keyword">let</span> <span class="hs-varid">outputXML</span> <span class="hs-keyglyph">=</span> <span class="hs-varid">formatG</span> <span class="hs-varop">$</span> <span class="hs-varid">indent</span> <span class="hs-num">2</span> <span class="hs-varop">$</span> <span class="hs-conid">Element</span> <span class="hs-str">"server"</span> <span class="hs-conid">[]</span> <span class="hs-layout">(</span><span class="hs-varid">processRoot</span> <span class="hs-varid">inputXML</span><span class="hs-layout">)</span>
    <span class="hs-varid">execute</span> <span class="hs-varop">$</span> <span class="hs-varid">liftIO</span> <span class="hs-varop">.</span> <span class="hs-varid">writeOut</span> <span class="hs-varop">=&lt;&lt;</span> <span class="hs-varid">outputXML</span>
    <span class="hs-varid">return</span> <span class="hs-conid">()</span></pre>
<strong>formatG</strong> is a <em>hexpat</em> function to take a tree node and format it as XML, returning one of Yair&#8217;s Lists of ByteStrings.  <strong>indent</strong> is a filter that adds pretty indenting.  The Element is the top level tag of our output XML tree, and its third argument &#8220;processRoot inputXML&#8221; evaluates the child nodes of the output document.  The entire processing of the document is in a functional style.</p>

<p><strong>execute</strong> here makes all the IO actually happen.  It iterates over a List of monadic actions and sequences them.  This translates into a sequence of writes of data blocks to the socket.  The elements in the list are monadic, so <strong>execute</strong> also must execute those in order to extract each output <strong>ByteString</strong>.</p>

<p>In this way, even though <strong>processRoot</strong> is pure at the top level, it can contain effectful computations.
<pre><span class="hs-definition">processRoot</span> <span class="hs-keyglyph">::</span> <span class="hs-conid">ListOf</span> <span class="hs-layout">(</span><span class="hs-conid">UNode</span> <span class="hs-conid">IO</span> <span class="hs-conid">Text</span><span class="hs-layout">)</span> <span class="hs-keyglyph">-&gt;</span> <span class="hs-conid">ListOf</span> <span class="hs-layout">(</span><span class="hs-conid">UNode</span> <span class="hs-conid">IO</span> <span class="hs-conid">Text</span><span class="hs-layout">)</span>
<span class="hs-definition">processRoot</span> <span class="hs-varid">root</span> <span class="hs-keyglyph">=</span> <span class="hs-keyword">do</span>
    <span class="hs-conid">Element</span> <span class="hs-keyword">_</span> <span class="hs-keyword">_</span> <span class="hs-varid">children</span> <span class="hs-keyglyph">&lt;-</span> <span class="hs-varid">genericTake</span> <span class="hs-num">1</span> <span class="hs-varid">root</span>
    <span class="hs-varid">child</span> <span class="hs-keyglyph">&lt;-</span> <span class="hs-varid">children</span>
    <span class="hs-varid">extractElements</span> <span class="hs-varid">child</span>
  <span class="hs-keyword">where</span>
    <span class="hs-varid">extractElements</span> <span class="hs-keyglyph">::</span> <span class="hs-conid">UNode</span> <span class="hs-conid">IO</span> <span class="hs-conid">Text</span> <span class="hs-keyglyph">-&gt;</span> <span class="hs-conid">ListOf</span> <span class="hs-layout">(</span><span class="hs-conid">UNode</span> <span class="hs-conid">IO</span> <span class="hs-conid">Text</span><span class="hs-layout">)</span>
    <span class="hs-varid">extractElements</span> <span class="hs-varid">elt</span> <span class="hs-keyglyph">|</span> <span class="hs-varid">isElement</span> <span class="hs-varid">elt</span> <span class="hs-keyglyph">=</span> <span class="hs-varid">processCommand</span> <span class="hs-varid">elt</span> <span class="hs-varop">&#96;cons&#96;</span> <span class="hs-varid">mzero</span>
    <span class="hs-varid">extractElements</span> <span class="hs-keyword">_</span>                   <span class="hs-keyglyph">=</span> <span class="hs-varid">mzero</span></pre>
<strong>ListOf</strong> is a type function that conceals a long-winded type  name.  This function maps the input document to a list of output nodes.</p>

<p>The root of the input document is actually given as a <em>List</em> containing one item &#8211; the top-level XML tag.  The reason why we do this is so that we have to ask for it to be pulled.  If it were just passed as a <strong>UNode IO Text</strong> type, we would have to calculate it before the handler was called, and the handler wouldn&#8217;t get a chance to do output before it requests input.</p>

<p>The function is implemented using List&#8217;s Monad instance, which behaves exactly like a list monad.  The reason for <strong>genericTake 1 root</strong> is so we stop processing after the root node and don&#8217;t wait for a node that will never come.  I need to fix this in <em>hexpat-iteratee</em>.</p>

<p><strong>&#96;cons&#96;</strong> is the generalized list cons operator like <strong>:</strong> and  <strong>&#96;mzero&#96;</strong> corresponds to [].
<pre><span class="hs-definition">processCommand</span> <span class="hs-keyglyph">::</span> <span class="hs-conid">UNode</span> <span class="hs-conid">IO</span> <span class="hs-conid">Text</span> <span class="hs-keyglyph">-&gt;</span> <span class="hs-conid">UNode</span> <span class="hs-conid">IO</span> <span class="hs-conid">Text</span>
<span class="hs-definition">processCommand</span> <span class="hs-varid">elt</span><span class="hs-keyglyph">@</span><span class="hs-layout">(</span><span class="hs-conid">Element</span> <span class="hs-str">"title"</span> <span class="hs-keyword">_</span> <span class="hs-keyword">_</span><span class="hs-layout">)</span> <span class="hs-keyglyph">=</span> <span class="hs-conid">Element</span> <span class="hs-str">"title"</span> <span class="hs-conid">[]</span> <span class="hs-varop">$</span> <span class="hs-varid">joinL</span> <span class="hs-varop">$</span> <span class="hs-keyword">do</span>
    <span class="hs-varid">txt</span> <span class="hs-keyglyph">&lt;-</span> <span class="hs-varid">textContentM</span> <span class="hs-varid">elt</span>
    <span class="hs-varid">return</span> <span class="hs-varop">$</span> <span class="hs-varid">search</span> <span class="hs-varid">txt</span>
<span class="hs-definition">processCommand</span> <span class="hs-layout">(</span><span class="hs-conid">Element</span> <span class="hs-varid">cmd</span> <span class="hs-keyword">_</span> <span class="hs-keyword">_</span><span class="hs-layout">)</span> <span class="hs-keyglyph">=</span> <span class="hs-conid">Element</span> <span class="hs-str">"unknown"</span> <span class="hs-keyglyph">[</span><span class="hs-layout">(</span><span class="hs-str">"command"</span><span class="hs-layout">,</span> <span class="hs-varid">cmd</span><span class="hs-layout">)</span><span class="hs-keyglyph">]</span> <span class="hs-varid">mzero</span></pre>
Here is our command processor.  We have one command <strong>&lt;title&gt;foo&lt;/title&gt;</strong> that finds all movies whose titles contain <strong>foo</strong>.</p>

<p><strong>joinL</strong> is a bit of List magic that lets you drop down into the underlying monad, which in this case is <strong>XMLT IO a</strong>.  joinL&#8217;s type is <strong>::  ItemM l (l a) -&gt; l a </strong>where <strong>ItemM l</strong> is a type function giving the list&#8217;s monad.  So, the stuff after joinL resolves to a type of <strong>:: XMLT IO (ListOf (UNode IO Text))</strong>.
<pre><span class="hs-definition">search</span> <span class="hs-keyglyph">::</span> <span class="hs-conid">Text</span> <span class="hs-keyglyph">-&gt;</span> <span class="hs-conid">ListOf</span> <span class="hs-layout">(</span><span class="hs-conid">UNode</span> <span class="hs-conid">IO</span> <span class="hs-conid">Text</span><span class="hs-layout">)</span>
<span class="hs-definition">search</span> <span class="hs-varid">key</span> <span class="hs-keyglyph">=</span> <span class="hs-varid">joinL</span> <span class="hs-varop">$</span> <span class="hs-keyword">do</span>
    <span class="hs-varid">iter</span> <span class="hs-keyglyph">&lt;-</span> <span class="hs-varid">liftIO</span> <span class="hs-varop">$</span> <span class="hs-varid">parse</span> <span class="hs-varid">defaultParserOptions</span> <span class="hs-varop">$</span> <span class="hs-keyglyph">\</span><span class="hs-varid">root</span> <span class="hs-keyglyph">-&gt;</span> <span class="hs-keyword">do</span>
        <span class="hs-keyword">let</span> <span class="hs-varid">l</span> <span class="hs-keyglyph">=</span> <span class="hs-keyword">do</span>
                <span class="hs-varid">elt</span><span class="hs-keyglyph">@</span><span class="hs-layout">(</span><span class="hs-conid">Element</span> <span class="hs-keyword">_</span> <span class="hs-keyword">_</span> <span class="hs-varid">children</span><span class="hs-layout">)</span> <span class="hs-keyglyph">&lt;-</span> <span class="hs-varid">genericTake</span> <span class="hs-num">1</span> <span class="hs-varid">root</span>
                <span class="hs-varid">movie</span> <span class="hs-keyglyph">&lt;-</span> <span class="hs-conid">List</span><span class="hs-varop">.</span><span class="hs-varid">filter</span> <span class="hs-varid">isElement</span> <span class="hs-varid">children</span>
                <span class="hs-varid">return</span> <span class="hs-varid">movie</span>
        <span class="hs-varid">execute</span> <span class="hs-varid">l</span>
        <span class="hs-varid">return</span> <span class="hs-varid">l</span>
    <span class="hs-varid">eMovies</span> <span class="hs-keyglyph">&lt;-</span> <span class="hs-varid">liftIO</span> <span class="hs-varop">$</span> <span class="hs-varid">fileDriver</span> <span class="hs-varid">iter</span> <span class="hs-str">"movies.xml"</span>
    <span class="hs-keyword">case</span> <span class="hs-varid">eMovies</span> <span class="hs-keyword">of</span>
        <span class="hs-conid">Left</span> <span class="hs-varid">err</span> <span class="hs-keyglyph">-&gt;</span> <span class="hs-varid">fail</span> <span class="hs-varop">$</span> <span class="hs-str">"failed to read 'movies.xml': "</span><span class="hs-varop">++</span><span class="hs-varid">show</span> <span class="hs-varid">err</span>
        <span class="hs-conid">Right</span> <span class="hs-varid">movies</span> <span class="hs-keyglyph">-&gt;</span> <span class="hs-varid">return</span> <span class="hs-varop">$</span> <span class="hs-conid">List</span><span class="hs-varop">.</span><span class="hs-varid">filter</span> <span class="hs-varid">matches</span> <span class="hs-varid">movies</span>
  <span class="hs-keyword">where</span>
    <span class="hs-varid">matches</span> <span class="hs-varid">elt</span> <span class="hs-keyglyph">=</span> <span class="hs-varid">key</span> <span class="hs-varop">&#96;</span><span class="hs-conid">T</span><span class="hs-varop">.</span><span class="hs-varid">isInfixOf</span><span class="hs-varop">&#96;</span> <span class="hs-varid">fromMaybe</span> <span class="hs-str">""</span> <span class="hs-layout">(</span><span class="hs-varid">getAttribute</span> <span class="hs-varid">elt</span> <span class="hs-str">"title"</span><span class="hs-layout">)</span></pre>
Here&#8217;s where our handler does some real I/O.  We read our database from a flat file using the same method of parsing.  Passing possibly unexecuted nodes outside the XMLT monad is a bit wrong, and needs to be addressed in the design, but here it works as long as I <strong>execute</strong> them.  Alternatively a pure XML parse would work.  <em>hexpat</em> has  functions to convert between pure and monadic node types.</p>

<p>So, I build and run the server, and here is the result, using Unix&#8217;s <strong>nc</strong> command as my client.  I typed this:
<pre>&lt;a&gt;
&lt;title&gt;of the&lt;/title&gt;</pre>
The output is:
<pre>&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;server&gt;
 &lt;title&gt;
   &lt;movie id="dvzrwfvryd" disc="41" title="War of the Worlds (2005)"
        director="Steven Spielberg" genre="Sci Fi Thriller" rating="6"
        description="Tom Cruise alert" imdbID="tt0407304"/&gt;
   &lt;movie id="xxvjgxpokp" disc="44" title="Shaun of the Dead"
        director="Edgar Wright" genre="Comedy Horror" rating="8"
        description="British send-up zombie movie" imdbID="tt0365748"/&gt;
   &lt;movie id="duvcjsygqi" disc="104" title="March of the Penguins (La Marche de l&amp;apos;empereur)"
        director="Luc Jacquet" genre="Documentary" description="" imdbID="tt0428803"/&gt;
   &lt;movie id="dawcezoiro" disc="109" title="Pirates of the Caribbean: Dead Man&amp;apos;s Chest"
        director="Gore Verbinski" genre="Action/Comedy" rating="7" description="" imdbID="tt0383574"/&gt;
 &lt;/title&gt;</pre>
(New lines added for readability)</p>

<p>And the session can process more commands interactively.</p>

<p><strong>And pickled</strong></p>

<p>I should also mention my related <a href="http://hackage.haskell.org/package/hexpat-pickle">hexpat-pickle</a> package which is a shameless rip-off of the picklers from Uwe Schmidt&#8217;s excellent <a href="http://hackage.haskell.org/package/hxt">hxt</a> package.  I find it a very practical and quick way to bang out XML picklers.  (It doesn&#8217;t work with <em>hexpat-iteratee</em> yet.)</p>

<p><strong>Bye bye</strong></p>

<p>Here&#8217;s <a href="http://hip-to-be-square.com/~blackh/hexpat-iteratee-blog/">the code</a> in downloadable form.  Make sure you use the <strong>monads-fd</strong> and <strong>transformers</strong> packages instead of <strong>mtl</strong>.  Also <strong>hexpat-iteratee</strong> and <strong>text</strong>.</p>

<p>I hope you found this interesting.  I hope the XML haters of <strong>#haskell</strong> will be miraculously transformed into XML tolerators, and I hope you&#8217;ll help me improve <em>hexpat</em>. <em>&#8211; Stephen Blackheath, Manawatu, New Zealand</em></p>
]]></content:encoded>
			<wfw:commentRss>http://random.axman6.com/blog/?feed=rss2&amp;p=157</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>
