<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">

  <title><![CDATA[Christopher Maier's Technical Blog]]></title>
  <link href="http://christophermaier.name/atom.xml" rel="self"/>
  <link href="http://christophermaier.name/"/>
  <updated>2012-01-21T13:16:18-05:00</updated>
  <id>http://christophermaier.name/</id>
  <author>
    <name><![CDATA[Christopher Maier]]></name>
    
  </author>
  <generator uri="http://octopress.org/">Octopress</generator>

  
  <entry>
    <title type="html"><![CDATA[Creating a Query DSL using Clojure and MongoDB]]></title>
    <link href="http://christophermaier.name/blog/2011/07/17/creating-a-query-dsl-using-clojure-and-mongodb/"/>
    <updated>2011-07-17T00:00:00-04:00</updated>
    <id>http://christophermaier.name/blog/2011/07/17/creating-a-query-dsl-using-clojure-and-mongodb</id>
    <content type="html"><![CDATA[<p>One of the nice things about <a href="http://www.mongodb.org">MongoDB</a> (particularly when using it in Clojure via the <a href="https://github.com/aboekhoff/congomongo">Congomongo</a> library) is that its map-based query language is so amenable to the creation of a domain-specific language, or <a href="http://en.wikipedia.org/wiki/Domain-specific_language">DSL</a>.   Creating and manipulating maps is like breathing in Clojure, so it is trivial to decompose the different query requirements of your application into a small collection of simple functions that can be used to create a rather fluent domain-specific language.  The data-structure-based query language of MongoDB makes this possible (or at least easier; it would be much more difficult to do in a string-based language like SQL).</p>

<p>Not only does creating a DSL make querying easy (particularly with complex conditions), but it also insulates your application from change in a few important ways.  Especially in the initial, exploratory stages of a project, it is common to change and evolve a data schema, and NoSQL environments make this very simple.  Using a DSL will shield your code from these changes; you only need to change the DSL &#8220;atoms&#8221; that the schema change affects.</p>

<p>Another benefit is that you can more easily change out your underlying database when and if the need arises.  With SQL databases, this is not as big of a problem.  SQL is a standard, and we have things like JDBC to provide (more or less) equivalent interaction with SQL databases (yes, reality is more complicated, but we&#8217;re comparing to swapping out one NoSQL database for another).  There is no corresponding &#8220;NoSQL standard&#8221;, but even if there were, there are so many different kinds of NoSQL databases (document, graph, key-value, column store, etc.) that there probably <em>can&#8217;t</em> be any sort of meaningful general abstraction like JDBC that covers them all.  However, when you create a query DSL, you don&#8217;t need to create a completely general abstraction over your underlying database; you just need one that works for your project.</p>

<p>I recently implemented a simple DSL for a project at work that we use for querying complex proteomics and genomics data.  I&#8217;ll illustrate a small bit of the DSL here to describe the general approach and show some of the benefits.</p>

<h2>Background</h2>

<p>In a nutshell, we&#8217;re querying to find certain features within the human genome.  The raw data are called &#8220;peptide / spectrum matches&#8221;, or &#8220;PSMs&#8221;.  They have sequences, scores, and genomic coordinates, among other things, and we query to find PSMs based on various combinations of these criteria.  We store the data in MongoDB, with one document per PSM, and query using Congomongo.</p>

<p>If you want to find all PSMs that have a particular peptide sequence, you&#8217;d have a query map like this:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
</pre></td><td class='code'><pre><code class='clojure'><span class='line'><span class="p">{</span><span class="nv">:peptide</span><span class="o">.</span><span class="nv">sequence</span> <span class="s">&quot;GLYQRPHDSTRFK&quot;</span><span class="p">}</span>
</span></code></pre></td></tr></table></div></figure>


<p>If you want to further restrict that to only results that have an expectation value of no greater than 0.01, you&#8217;d use this:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
</pre></td><td class='code'><pre><code class='clojure'><span class='line'><span class="p">{</span><span class="nv">:peptide</span><span class="o">.</span><span class="nv">sequence</span> <span class="s">&quot;GLYQRPHDSTRFK&quot;</span>
</span><span class='line'> <span class="nv">:scores</span><span class="o">.</span><span class="nv">e-value</span> <span class="p">{</span><span class="nv">:$lte</span> <span class="mf">0.01</span><span class="p">}}</span>
</span></code></pre></td></tr></table></div></figure>


<p>Further restricting results to lying within a region of a chromosome would be done like this:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
</pre></td><td class='code'><pre><code class='clojure'><span class='line'><span class="p">{</span><span class="nv">:peptide</span><span class="o">.</span><span class="nv">sequence</span> <span class="s">&quot;GLYQRPHDSTRFK&quot;</span>
</span><span class='line'> <span class="nv">:scores</span><span class="o">.</span><span class="nv">e-value</span> <span class="p">{</span><span class="nv">:$lte</span> <span class="mf">0.01</span><span class="p">}</span>
</span><span class='line'> <span class="nv">:location</span><span class="o">.</span><span class="nv">chromosome</span> <span class="s">&quot;X&quot;</span>
</span><span class='line'> <span class="nv">:location</span><span class="o">.</span><span class="nv">strand</span> <span class="s">&quot;+&quot;</span>
</span><span class='line'> <span class="nv">:location</span><span class="o">.</span><span class="nv">start</span> <span class="p">{</span><span class="nv">:$gte</span> <span class="mi">12345</span><span class="p">}</span>
</span><span class='line'> <span class="nv">:location</span><span class="o">.</span><span class="nv">stop</span> <span class="p">{</span><span class="nv">:$lte</span> <span class="mi">34567</span><span class="p">}}</span>
</span></code></pre></td></tr></table></div></figure>


<h2>Creating the DSL</h2>

<p>In reality, there are many more criteria, but by this point a pattern suggests itself.  Each individual criterion will be a map, while each query will be a simple merging of these maps.  Let&#8217;s start with the <code>query</code> function first, which we&#8217;ll use to generate the final query map (not actually perform the query).</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
</pre></td><td class='code'><pre><code class='clojure'><span class='line'><span class="p">(</span><span class="k">defn </span><span class="nv">query</span> <span class="p">[</span><span class="nv">&amp;</span> <span class="nv">criteria</span><span class="p">]</span>
</span><span class='line'>  <span class="p">(</span><span class="nb">apply </span><span class="nv">merge</span> <span class="nv">criteria</span><span class="p">))</span>
</span></code></pre></td></tr></table></div></figure>


<p>That&#8217;s it.  Now for the rest of the criteria:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
</pre></td><td class='code'><pre><code class='clojure'><span class='line'><span class="p">(</span><span class="k">defn </span><span class="nv">matching-peptide</span> <span class="p">[</span><span class="nv">peptide</span><span class="p">]</span>
</span><span class='line'>  <span class="p">{</span><span class="nv">:peptide</span><span class="o">.</span><span class="nv">sequence</span> <span class="nv">peptide</span><span class="p">})</span>
</span><span class='line'>
</span><span class='line'><span class="p">(</span><span class="k">defn </span><span class="nv">with-e-value-cutoff</span> <span class="p">[</span><span class="nv">cutoff</span><span class="p">]</span>
</span><span class='line'>  <span class="p">{</span><span class="nv">:scores</span><span class="o">.</span><span class="nv">e-value</span> <span class="p">{</span><span class="nv">:$lte</span> <span class="nv">cutoff</span><span class="p">}})</span>
</span><span class='line'>
</span><span class='line'><span class="p">(</span><span class="k">defn </span><span class="nv">in-region</span> <span class="p">[{</span><span class="nv">:keys</span> <span class="p">[</span><span class="nv">chromosome</span> <span class="nv">strand</span> <span class="nv">start</span> <span class="nv">stop</span><span class="p">]}]</span>
</span><span class='line'>  <span class="p">{</span><span class="nv">:location</span><span class="o">.</span><span class="nv">chromosome</span> <span class="nv">chromosome</span>
</span><span class='line'>   <span class="nv">:location</span><span class="o">.</span><span class="nv">strand</span> <span class="nv">strand</span>
</span><span class='line'>   <span class="nv">:location</span><span class="o">.</span><span class="nv">start</span> <span class="p">{</span><span class="nv">:$gte</span> <span class="nv">start</span><span class="p">}</span>
</span><span class='line'>   <span class="nv">:location</span><span class="o">.</span><span class="nv">stop</span> <span class="p">{</span><span class="nv">:$lte</span> <span class="nv">stop</span><span class="p">}})</span>
</span></code></pre></td></tr></table></div></figure>
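<p>Because each criterion is just a function returning a small map, it is easy to check at the REPL.  The following is a self-contained sketch that repeats the definitions from this post (using the name <code>matching-peptide</code>, as in the query examples) and compares the merged result against the hand-written map from the Background section; the name <code>by-hand</code> is introduced here purely for illustration.</p>

<pre><code>(defn query [&amp; criteria]
  (apply merge criteria))

(defn matching-peptide [peptide]
  {:peptide.sequence peptide})

(defn with-e-value-cutoff [cutoff]
  {:scores.e-value {:$lte cutoff}})

(defn in-region [{:keys [chromosome strand start stop]}]
  {:location.chromosome chromosome
   :location.strand strand
   :location.start {:$gte start}
   :location.stop {:$lte stop}})

;; A single criterion is just a map.
(with-e-value-cutoff 0.01)
;; =&gt; {:scores.e-value {:$lte 0.01}}

;; The hand-written query map from the Background section.
(def by-hand
  {:peptide.sequence "GLYQRPHDSTRFK"
   :scores.e-value {:$lte 0.01}
   :location.chromosome "X"
   :location.strand "+"
   :location.start {:$gte 12345}
   :location.stop {:$lte 34567}})

;; Merging the criteria reproduces it.
(= by-hand
   (query (matching-peptide "GLYQRPHDSTRFK")
          (with-e-value-cutoff 0.01)
          (in-region {:chromosome "X" :strand "+" :start 12345 :stop 34567})))
;; =&gt; true
</code></pre>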


<p>All very straightforward.  Now, when we want to create a final query, we write something like this:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
</pre></td><td class='code'><pre><code class='clojure'><span class='line'><span class="p">(</span><span class="nf">query</span> <span class="p">(</span><span class="nf">matching-peptide</span> <span class="s">&quot;GLYQRPHDSTRFK&quot;</span><span class="p">)</span>
</span><span class='line'>       <span class="p">(</span><span class="nf">with-e-value-cutoff</span> <span class="mf">0.001</span><span class="p">)</span>
</span><span class='line'>       <span class="p">(</span><span class="nf">in-region</span> <span class="p">{</span><span class="nv">:chromosome</span> <span class="s">&quot;X&quot;</span>
</span><span class='line'>                   <span class="nv">:strand</span> <span class="s">&quot;+&quot;</span>
</span><span class='line'>                   <span class="nv">:start</span> <span class="mi">12345</span>
</span><span class='line'>                   <span class="nv">:stop</span> <span class="mi">34567</span><span class="p">}))</span>
</span></code></pre></td></tr></table></div></figure>


<p>That&#8217;s pretty readable.  We&#8217;ve gained a lot of flexibility, too, since we&#8217;ve decoupled the <em>semantic</em> meaning of a query from the underlying <em>syntactic</em> realities of my data schema and database.  We&#8217;re free to change how we structure the underlying data (something we&#8217;ve already done several times in the course of this project!).  For instance, maybe we&#8217;ll want to represent a peptide as a plain String instead of a complex object like we have here.  We only need to change one line of code for the queries to keep working.</p>
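<p>To make that one-line change concrete, here is a minimal sketch.  It assumes a hypothetical revised schema in which the peptide is stored as a plain string under a top-level <code>:peptide</code> key; that key name is an assumption for illustration, not something from our actual schema.</p>

<pre><code>;; Before: the sequence lives inside a nested peptide document.
(defn matching-peptide [peptide]
  {:peptide.sequence peptide})

;; After (hypothetical schema): the peptide is a plain string stored
;; under :peptide.  Only the key in this one function changes; every
;; caller of matching-peptide stays exactly as it was.
(defn matching-peptide [peptide]
  {:peptide peptide})

(matching-peptide "GLYQRPHDSTRFK")
;; =&gt; {:peptide "GLYQRPHDSTRFK"}
</code></pre>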

<p>We can go further, extending our DSL to actually retrieve the results.</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
</pre></td><td class='code'><pre><code class='clojure'><span class='line'><span class="p">(</span><span class="k">defn </span><span class="nv">find-psms</span> <span class="p">[</span><span class="nv">&amp;</span> <span class="nv">criteria</span><span class="p">]</span>
</span><span class='line'>  <span class="p">(</span><span class="nf">fetch</span> <span class="nv">PSM-COLLECTION</span> <span class="nv">:where</span> <span class="p">(</span><span class="nb">apply </span><span class="nv">query</span> <span class="nv">criteria</span><span class="p">)))</span>
</span></code></pre></td></tr></table></div></figure>


<p>Here, our application no longer even needs to be aware of which collection we&#8217;re searching.  The code to retrieve our results is now:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
</pre></td><td class='code'><pre><code class='clojure'><span class='line'><span class="p">(</span><span class="nf">find-psms</span> <span class="p">(</span><span class="nf">matching-peptide</span> <span class="s">&quot;GLYQRPHDSTRFK&quot;</span><span class="p">)</span>
</span><span class='line'>           <span class="p">(</span><span class="nf">with-e-value-cutoff</span> <span class="mf">0.001</span><span class="p">)</span>
</span><span class='line'>           <span class="p">(</span><span class="nf">in-region</span> <span class="p">{</span><span class="nv">:chromosome</span> <span class="s">&quot;X&quot;</span>
</span><span class='line'>                       <span class="nv">:strand</span> <span class="s">&quot;+&quot;</span>
</span><span class='line'>                       <span class="nv">:start</span> <span class="mi">12345</span>
</span><span class='line'>                       <span class="nv">:stop</span> <span class="mi">34567</span><span class="p">}))</span>
</span></code></pre></td></tr></table></div></figure>


<p>That&#8217;s almost exactly what the equivalent request would be in plain English.  You don&#8217;t get much simpler.</p>

<h2>Conclusion</h2>

<p>Obviously what I have shown here is pretty basic stuff, and not at all difficult to implement.  There&#8217;s a lot more that the application will have to do, including paging, limiting, sorting, as well as more complicated queries.  However, there&#8217;s not much more functionality that needs to be added that is significantly different from what&#8217;s been shown.  And look what has been gained: an almost-English query language that insulates our application not only from the specific modeling choices we&#8217;ve made, but also from the specific database system we are using.  This last point is particularly nice in my case, as I plan to migrate from MongoDB to a Neo4j graph database in the near future.  Using this DSL internally is going to make that task significantly more straightforward.</p>
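<p>One wrinkle worth keeping in mind for those more complicated queries: a plain <code>merge</code> lets a later criterion silently replace an earlier one that targets the same key.  The sketch below illustrates this with two hypothetical criteria (<code>starts-after</code> and <code>starts-before</code>, invented for this example) and one possible variant, <code>query-combining</code>, that merges the nested operator maps instead.  It only works when the colliding values are themselves maps, so treat it as a starting point rather than a finished design.</p>

<pre><code>(defn query [&amp; criteria]
  (apply merge criteria))

;; Hypothetical criteria that both constrain the same key.
(defn starts-after [n] {:location.start {:$gte n}})
(defn starts-before [n] {:location.start {:$lte n}})

;; With plain merge, the last criterion wins and the :$gte bound is lost.
(query (starts-after 100) (starts-before 200))
;; =&gt; {:location.start {:$lte 200}}

;; Possible variant: merge the operator maps when keys collide.
(defn query-combining [&amp; criteria]
  (apply merge-with merge criteria))

(query-combining (starts-after 100) (starts-before 200))
;; =&gt; {:location.start {:$gte 100, :$lte 200}}
</code></pre>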

<p><strong>Update</strong>: Aaron Crow mentioned this post in his presentation <a href="https://github.com/dirtyvagabond/mongola">Clojure on Mongo: Fun and Easy with CongoMongo</a>, presented at Mongo LA on 19 January 2012.  Thanks, Aaron!</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Writing Elegant Clojure Code Using Higher-Order Functions]]></title>
    <link href="http://christophermaier.name/blog/2011/07/07/writing-elegant-clojure-code-using-higher-order-functions/"/>
    <updated>2011-07-07T00:00:00-04:00</updated>
    <id>http://christophermaier.name/blog/2011/07/07/writing-elegant-clojure-code-using-higher-order-functions</id>
    <content type="html"><![CDATA[<p>Back when I first started writing <a href="http://www.clojure.org">Clojure</a> code, I heard lots about the use of <a href="http://en.wikipedia.org/wiki/Higher-order_function"><em>higher-order functions</em></a> (also known as <em>HOFs</em>).  Since functions are first-class language members in Clojure, you can do things like pass them as arguments or return them from function calls.  Any function that accepts or produces another function in this way is a higher-order function.  This allows you to write some very powerful and concise code, because you can capture the general <em>form</em> of a computation, while allowing its specific <em>behavior</em> to be determined at runtime by the user.  You basically say to the function caller &#8220;I&#8217;m going to give you <em>X</em>, but I&#8217;ll make it however you tell me to&#8221;.</p>

<p>So I knew about all this, and could see how powerful a technique it could be, but I didn&#8217;t fully <em>grok</em> the whole concept yet.  Coming from a mainly Java background at the time, I hadn&#8217;t had any experience with first-class functions, and still approached everything from a procedural and object-oriented background.  Using HOFs was a bit of an alien concept.</p>

<p>Clojure is littered with HOFs; if you&#8217;re new to the language, you&#8217;ve already used them, perhaps without realizing it.  The <a href="http://clojure.github.com/clojure/clojure.core-api.html#clojure.core/map"><code>map</code></a> function is probably the archetype of a HOF.  It iterates through a sequence of items, applying a function to each one in turn, and returns a sequence of the results.  It says &#8220;I&#8217;m going to transform each element of this sequence, but I&#8217;ll do it however you tell me to&#8221;.  So, you can pass in the <code>inc</code> function to <code>map</code> to increment each number in a list, like so:</p>

<pre><code>user&gt; (map inc [1 2 3 4 5])
(2 3 4 5 6)
</code></pre>

<p>You can also use an anonymous function (because hey, it&#8217;s still a function), allowing you to, say, multiply each number in a list by five:</p>

<pre><code>user&gt; (map #(* 5 %) [1 2 3 4 5])
(5 10 15 20 25)
</code></pre>

<p>All well and good, but this is all still pretty basic.  As I said, these kinds of functions are all over Clojure, and you quickly figure out how to use them, out of necessity if nothing else; it&#8217;s difficult to do <em>anything</em> in Clojure without them!  Soon I realized that it wasn&#8217;t the function-<em>accepting</em> HOFs that I hadn&#8217;t quite gotten; it was the function-<em>generating</em> HOFs that I didn&#8217;t fully appreciate.  Clojure has several of these functions, too, and mastering them really allows you to create some elegant constructs.  I&#8217;ll mainly talk about <a href="http://clojure.github.com/clojure/clojure.core-api.html#clojure.core/partial"><code>partial</code></a> and <a href="http://clojure.github.com/clojure/clojure.core-api.html#clojure.core/comp"><code>comp</code></a>, but there are also <a href="http://clojure.github.com/clojure/clojure.core-api.html#clojure.core/juxt"><code>juxt</code></a> and <a href="http://clojure.github.com/clojure/clojure.core-api.html#clojure.core/complement"><code>complement</code></a>, and I may have overlooked others.  And of course, you can always make your own.</p>
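<p>As a small illustration of &#8220;making your own&#8221;, here is a sketch of a hand-rolled function-generating HOF.  The name <code>multiplier</code> is invented for this example; it simply returns a new function that closes over <code>n</code>.</p>

<pre><code>;; multiplier is a function that builds and returns another function.
(defn multiplier [n]
  (fn [x] (* n x)))

(map (multiplier 5) [1 2 3 4 5])
;; =&gt; (5 10 15 20 25)
</code></pre>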

<p>The simplest of Clojure&#8217;s built-in function generators is probably <code>partial</code>, which lets you &#8220;prime&#8221; an existing function with some number of arguments.  For example, you could make a &#8220;quintupler&#8221; function like we used above, but using <code>partial</code> like this:</p>

<pre><code>user&gt; (def quintuple (partial * 5))
#'user/quintuple
user&gt; (map quintuple [1 2 3 4 5])
(5 10 15 20 25)
</code></pre>

<p>This <code>quintuple</code> function is just the standard multiplication function (<code>*</code>), already primed with a first argument of <code>5</code>.  When called with a single argument, it behaves just like <code>(fn [x] (* 5 x))</code>; any additional arguments passed into <code>quintuple</code> will also be multiplied together, and then multiplied by 5.  (Though I&#8217;ve broken <code>quintuple</code> out as a separate function here, it is more idiomatic to use the partial application directly, like <code>(map (partial * 5) [1 2 3 4 5])</code>.)</p>
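<p>A quick sketch of that variadic behavior, which is where <code>partial</code> differs from a one-argument anonymous function:</p>

<pre><code>(def quintuple (partial * 5))

(quintuple 3)     ;; =&gt; 15, same as (* 5 3)
(quintuple 2 3)   ;; =&gt; 30, same as (* 5 2 3)
(quintuple)       ;; =&gt; 5,  same as (* 5)
</code></pre>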

<p>The <code>comp</code> function is a little trickier, but not by much.  Short for &#8220;compose&#8221;, it carries out the functional composition you learned about in high school algebra class (you remember <em>f(g(h(x)))</em>, right?).  So basically, <code>(comp f g h)</code> creates a function that will apply the function <code>h</code> to its arguments, then apply <code>g</code> to the result, then apply <code>f</code> to the result of that.  Of course, you can supply as many functions as you like.  In this way, it&#8217;s similar to (but <em>not</em> the same as!) Clojure&#8217;s threading macros (<code>-&gt;</code> and <code>-&gt;&gt;</code>).</p>
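<p>Here is a short sketch of the right-to-left ordering, next to the equivalent threading form.  The name <code>double-then-inc</code> is made up for this example.  Note that <code>comp</code> gives you a function value you can pass around, whereas <code>-&gt;</code> is a macro that rewrites the code in place and reads left to right.</p>

<pre><code>;; The rightmost function runs first: double, then increment.
(def double-then-inc (comp inc (partial * 2)))

(double-then-inc 5)            ;; =&gt; 11, i.e. (inc (* 2 5))

;; The same computation with the threading macro, written left to right.
(-&gt; 5 (* 2) inc)               ;; =&gt; 11

;; Swapping the order of the composed functions changes the result.
((comp (partial * 2) inc) 5)   ;; =&gt; 12, i.e. (* 2 (inc 5))
</code></pre>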

<p>Actually, this whole post is basically an excuse to share the fun trick I recently discovered.  Say you need to condense a sequence of pairs into a map.  No problem, right?  We&#8217;ll just use <code>into</code>.</p>

<pre><code>user&gt; (def pairs [[:one 1] [:two 2] [:three 3]])
#'user/pairs
user&gt; (into {} pairs)
{:one 1, :two 2, :three 3}
</code></pre>

<p>Now for a wrinkle: what if a key repeats?</p>

<pre><code>user&gt; (def pairs [[:one 1] [:two 2] [:three 3] [:rest 4] [:rest 5] [:rest 6]])
#'user/pairs
user&gt; (into {} pairs)
{:one 1, :two 2, :three 3, :rest 6}
</code></pre>

<p>That&#8217;s no good; each successive pair with a duplicated key will overwrite the previous values.  What you really want is to create a <em>sequence</em> if there are multiple values, but not if there&#8217;s only one.  HOFs to the rescue!</p>

<pre><code>user&gt; (apply merge-with
             (comp vec flatten vector)
             (map (partial apply hash-map)
                  pairs))
{:rest [4 5 6], :three 3, :two 2, :one 1}
</code></pre>

<p>That did it!  The magic happens with <code>(comp vec flatten vector)</code>.  This generates  the function that <code>merge-with</code> will use to combine the pairs together (once we turn them into maps, that is).  If a key is already present, the function gets called with both the existing value and the value to be added.  This can be a bit tricky to grasp at first, so I&#8217;ll walk through what&#8217;s happening step by step.</p>

<p>Keep in mind that our merge function, <code>(comp vec flatten vector)</code>, is only called when there is already a value for a given key.  The first time it is called for a particular key, the existing value will be a single value, but on all subsequent calls, the existing value will be a vector of values.  We thus have two cases to examine.  Since the <code>comp</code> is just composing <code>vec</code>, <code>flatten</code>, and <code>vector</code>, I&#8217;ll split out each operation to show what happens.</p>

<p>First, we&#8217;ll look at what happens the first time we merge a value.  We&#8217;ll call the pre-existing value <code>:A</code> and the incoming (to-be-merged) value <code>:B</code>; in the end, we&#8217;ll expect to see <code>[:A :B]</code>.</p>

<pre><code>user&gt; (vector :A :B)
[:A :B]
user&gt; (flatten [:A :B])
(:A :B)
user&gt; (vec '(:A :B))
[:A :B]
user&gt;
</code></pre>

<p>Remember, <code>comp</code> passes the result of one computation as the input to the next in the chain.  In this particular scenario, the calls to <code>flatten</code> and <code>vec</code> seem unnecessary; after all, we could have stopped after <code>vector</code> and been done with it.  If you&#8217;re only ever going to merge two values, then yes, you could have just used <code>vector</code>&#8230;  but that&#8217;s not very interesting, is it?  Let&#8217;s continue on with our example and merge in an additional value, <code>:C</code>.  This time we&#8217;re starting with the vector <code>[:A :B]</code>, which will illustrate the second case of behavior.</p>

<pre><code>user&gt; (vector [:A :B] :C)
[[:A :B] :C]
user&gt; (flatten [[:A :B] :C])
(:A :B :C)
user&gt; (vec '(:A :B :C))
[:A :B :C]
user&gt;
</code></pre>

<p>Now the need for <code>flatten</code> is apparent; if we didn&#8217;t use it, we&#8217;d end up with an increasingly nested set of vectors within vectors within vectors.  By flattening, we eliminate the nesting before it has a chance to start.  But <code>flatten</code> gives us a sequence, and we wanted to get a vector back.  No problem; <code>vec</code> to the rescue!  (Strictly speaking, everything could still work fine without <code>vec</code>, so long as you don&#8217;t mind a mixture of vectors and sequences as values in your data structure).</p>

<p>Now we can see that in both cases, we end up with a vector of all the values for a given key being plugged into our growing map.  Of course, the astute reader will recognize the (potential) bug lurking here: what if one of your values is <em>already a vector?</em>  If that&#8217;s the case, this particular implementation will not be very kind to you, since it mercilessly flattens everything in sight.  You can get around this, though (and I leave that as an exercise for the reader); in this article I&#8217;m focusing on the uses of higher-order functions&#8230; that, and the software I wrote this function for never has to deal with vector values, so there :P</p>
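<p>To see the pitfall concretely (without spoiling the exercise), here is a small sketch.  The name <code>combine</code> is introduced here only to give the composed function a handle.</p>

<pre><code>(def combine (comp vec flatten vector))

;; Scalar values behave as intended.
(combine :A :B)
;; =&gt; [:A :B]

;; Values that are themselves vectors get flattened together.
(combine [1 2] [3 4])
;; =&gt; [1 2 3 4]   (you probably wanted [[1 2] [3 4]])

(apply merge-with combine [{:k [1 2]} {:k [3 4]}])
;; =&gt; {:k [1 2 3 4]}
</code></pre>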

<p>So that covers the <code>comp</code>-generated HOF, but there&#8217;s another HOF lurking in there, too: <code>(partial apply hash-map)</code>.  All that does is convert the vector pairs into maps for feeding into <code>merge-with</code> (we have to use <code>apply</code>, because <code>hash-map</code> is not expecting a sequence as input).  I told you: HOFs are <em>everywhere</em> in Clojure.</p>
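<p>A quick sketch of what that second HOF does on its own.  The name <code>pair-&gt;map</code> is invented here for readability.</p>

<pre><code>;; hash-map wants alternating keys and values as separate arguments,
;; so apply is used to spread each pair out.
(def pair-&gt;map (partial apply hash-map))

(pair-&gt;map [:one 1])
;; =&gt; {:one 1}

(map pair-&gt;map [[:one 1] [:two 2]])
;; =&gt; ({:one 1} {:two 2})

;; By contrast, (hash-map [:one 1]) throws an exception, because the
;; whole pair is treated as a single key with no value.
</code></pre>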

<p>Now, contrast this to how I would have written this function when I was young and foolish, pre-HOF:</p>

<pre><code>(apply merge-with
       (fn [vals v]
         (if (vector? vals)
           (conj vals v)
           [vals v]))
       (for [[k v] pairs]
         {k v}))
</code></pre>

<p>Quite a bit more verbose and just <em>uglier</em>.  Also note that the <code>(comp vec flatten vector)</code> and <code>(partial apply hash-map)</code> functions are more general and re-usable than their wordier counterparts.</p>

<p>This just shows that you can get the job done in Clojure in any number of ways, but to get really succinct and elegant code, it pays to get familiar with Clojure&#8217;s function-generating functions.</p>

<p>Exploring <code>juxt</code> (quite handy for destructuring <code>let</code> bindings) and <code>complement</code> (great for use with <code>filter</code>, <code>remove</code>, and other predicate-consuming functions) is left as an exercise for the reader.</p>
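<p>As a starting point for that exploration, here is a minimal sketch of both functions in action.  The sample data and the local names <code>lo</code> and <code>hi</code> are made up for illustration.</p>

<pre><code>;; juxt builds a function that calls each given function on the same
;; arguments and returns a vector of the results.
((juxt first last) [1 2 3])
;; =&gt; [1 3]

;; That vector destructures nicely in a let binding.
(let [[lo hi] ((juxt #(apply min %) #(apply max %)) [3 1 4 1 5])]
  (- hi lo))
;; =&gt; 4

;; complement builds the logical opposite of a predicate.
(filter (complement even?) [1 2 3 4 5])
;; =&gt; (1 3 5)
</code></pre>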
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Github's new "Fork and Edit" Feature is Awesome]]></title>
    <link href="http://christophermaier.name/blog/2011/05/22/fork-and-edit-is-awesome/"/>
    <updated>2011-05-22T00:00:00-04:00</updated>
    <id>http://christophermaier.name/blog/2011/05/22/fork-and-edit-is-awesome</id>
    <content type="html"><![CDATA[<p>Github recently rolled out a new <a href="https://github.com/blog/844-forking-with-the-edit-button">Fork and Edit</a> feature that is pretty awesome.  It basically allows you to create your own fork of any Github project, edit files in a new branch of that fork, and create a pull request back to the original project, all from your web browser, and in about as much time as it&#8217;s taken me to write this sentence.  This really lowers the barrier for contributing code to other projects.</p>

<p>For instance, I recently was working on a Clojure web application at work and generating some HTML with James Reeves&#8217;s excellent <a href="http://www.github.com/weavejester/hiccup">Hiccup</a> library.  I was having some problems with a <code>&lt;canvas&gt;</code> tag being rendered improperly.  Hiccup was rendering it like this:</p>

<pre><code>&lt;canvas id='spectrum' /&gt;
</code></pre>

<p>Apparently, canvases are &#8220;container tags&#8221; which need to be explicitly closed with the appropriate tag, like this:</p>

<pre><code>&lt;canvas id='spectrum'&gt;&lt;/canvas&gt;
</code></pre>

<p>The fix in Hiccup is simple enough: add the string &#8220;canvas&#8221; to a private set of container tags.  It is literally adding one string to a data structure.  I made a fix in my own Clojure project so I could keep working (thank you, <a href="http://clojure.github.com/clojure/clojure.core-api.html#clojure.core/in-ns">clojure.core/in-ns</a>!) and made a note to go back and formally submit the change to Hiccup later.</p>

<p>But then I noticed the &#8220;Fork and edit this file&#8221; button at the top of the page.  &#8220;What the hell?  Let&#8217;s give it a shot&#8221;, I thought as I pressed it.  Instant fork and branch creation, and I&#8217;m in an editable view of the code in question.  I scroll down, add the magical &#8220;canvas&#8221; string at the right place, hit commit, and it&#8217;s all done.  The <a href="https://github.com/weavejester/hiccup/commit/eb49d0ef63d529060863f17d9852f6bcc6f92009">patch</a> was submitted as a pull request and it all took about 2 minutes from start to finish, a large portion of that time taken up by me thinking &#8220;Damn, this is slick!&#8221;.</p>

<p>Now, I wouldn&#8217;t dream of doing this for any kind of involved coding (Emacs is a far better code editor than an HTML text field, thank you very much), but for minor fixes like this, it&#8217;s golden.  Think how great this would be for documentation fixes!</p>

<p>And honestly, &#8220;Fork and Edit&#8221; is really what&#8217;s needed for these kinds of fixes.  If I&#8217;m reading through some code on Github (which I actually kind of prefer, truth be told) and I see some confusing or unclear documentation or some other minor problem, I&#8217;m going to have to be <em>really</em> motivated to create a fork, download that code to my computer, fire up Emacs, make the change, push the code back to my fork, <em>and</em> issue a pull request.  However, if I can make the change right there while I&#8217;m thinking about it, without having to do anything else, then I&#8217;m probably going to do it.</p>

<p>Fork and Edit makes contributing to open source code about as easy as editing Wikipedia, and that&#8217;s a very good thing.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[The Importance of MongoDB Key Names]]></title>
    <link href="http://christophermaier.name/blog/2011/05/22/MongoDB-key-names/"/>
    <updated>2011-05-22T00:00:00-04:00</updated>
    <id>http://christophermaier.name/blog/2011/05/22/MongoDB-key-names</id>
    <content type="html"><![CDATA[<p>Coming from a relational database background, you might not devote a lot of thought to the names of your columns.  That is to say, while you make an effort to come up with a sensible and descriptive naming scheme for your columns, you probably don&#8217;t think about the amount of <em>space</em> those names take up.  And why should you?  In a relational database, the column names are stored once, probably in a central metadata table.  Whether you call a column <code>sequence</code> or <code>seq</code> or <code>s</code>, it isn&#8217;t going to make any practical difference in how much space your database uses.</p>

<p>In a schema-free database system like <a href="http://mongodb.org">MongoDB</a>, however, the key names you choose can have a sizable impact on the final size of a collection, depending on how large the keys are relative to the whole document, as well as on the number of documents in the collection.  This is because the key names are stored in every document, a side effect of being schema-free.</p>

<p>Recently I was experimenting with a database at work with 1.6 billion documents (yes, that&#8217;s with a <em>B</em>).  Each document looks like this:</p>

<div class="highlight"><pre><code class="javascript"><span class="p">{</span><span class="s2">&quot;sequence&quot;</span><span class="o">:</span><span class="s2">&quot;AHAHSPGPGSAVKLPAPHSVGKSALR&quot;</span><span class="p">,</span>
 <span class="s2">&quot;location&quot;</span><span class="o">:</span><span class="p">{</span>
     <span class="s2">&quot;chromosome&quot;</span><span class="o">:</span><span class="s2">&quot;19&quot;</span><span class="p">,</span>
     <span class="s2">&quot;strand&quot;</span><span class="o">:</span><span class="s2">&quot;-&quot;</span><span class="p">,</span>
     <span class="s2">&quot;begin&quot;</span><span class="o">:</span><span class="s2">&quot;51067007&quot;</span><span class="p">,</span>
     <span class="s2">&quot;end&quot;</span><span class="o">:</span><span class="s2">&quot;51067085&quot;</span>
 <span class="p">}}</span>
</code></pre>
</div>


<p>Once all the data were loaded up, here&#8217;s what the collection stats looked like:</p>

<pre><code>&gt; db.peptides.stats()
{
    "ns" : "haystack.peptides",
    "count" : 1602177119,
    "size" : 216603098452,
    "avgObjSize" : 135.1929795297495,
    "storageSize" : 243349563712,
    "numExtents" : 144,
    "nindexes" : 1,
    "lastExtentSize" : 2146426864,
    "paddingFactor" : 1,
    "flags" : 1,
    "totalIndexSize" : 66625226448,
    "indexSizes" : {
            "_id_" : 66625226448
    },
    "ok" : 1
}
</code></pre>

<p>Total size (the <code>storageSize</code> value) is around 243 GB.</p>

<p>Then I did a little experiment and used the smallest keys possible; all my keys are now one letter long.  Now my documents look like this:</p>

<div class="highlight"><pre><code class="javascript"><span class="p">{</span><span class="s2">&quot;s&quot;</span><span class="o">:</span><span class="s2">&quot;AHAHSPGPGSAVKLPAPHSVGKSALR&quot;</span><span class="p">,</span>
 <span class="s2">&quot;l&quot;</span><span class="o">:</span><span class="p">{</span>
     <span class="s2">&quot;c&quot;</span><span class="o">:</span><span class="s2">&quot;19&quot;</span><span class="p">,</span>
     <span class="s2">&quot;s&quot;</span><span class="o">:</span><span class="s2">&quot;-&quot;</span><span class="p">,</span>
     <span class="s2">&quot;b&quot;</span><span class="o">:</span><span class="s2">&quot;51067007&quot;</span><span class="p">,</span>
     <span class="s2">&quot;e&quot;</span><span class="o">:</span><span class="s2">&quot;51067085&quot;</span>
 <span class="p">}}</span>
</code></pre>
</div>


<p>Now, when I reload all the data with this schema, I get these collection stats:</p>

<pre><code>&gt; db.tinypeptides.stats()
{
    "ns" : "haystack.tinypeptides",
    "count" : 1602177119,
    "size" : 155730331732,
    "avgObjSize" : 97.19919844392685,
    "storageSize" : 182977326080,
    "numExtents" : 118,
    "nindexes" : 1,
    "lastExtentSize" : 2146426864,
    "paddingFactor" : 1,
    "flags" : 1,
    "totalIndexSize" : 66625734352,
    "indexSizes" : {
            "_id_" : 66625734352
    },
    "ok" : 1
}
</code></pre>

<p>Final size now is just shy of 183 GB, for a savings of about 60 GB.  Not too bad!</p>
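
<p>As a sanity check on that number, the arithmetic can be redone from the two <code>stats()</code> outputs above.  This is only a sketch in plain JavaScript (it runs in Node without any MongoDB connection); the <code>savings</code> helper is a made-up name, and the object literals copy just the three fields it needs.</p>

```javascript
// Values copied from the two db.<collection>.stats() outputs above.
const longKeys  = { count: 1602177119, size: 216603098452, storageSize: 243349563712 };
const shortKeys = { count: 1602177119, size: 155730331732, storageSize: 182977326080 };

// Hypothetical helper: differences between two stats documents.
function savings(before, after) {
  const sizeBytes = before.size - after.size;
  return {
    sizeBytes: sizeBytes,
    storageBytes: before.storageSize - after.storageSize,
    bytesPerDocument: sizeBytes / before.count
  };
}

const saved = savings(longKeys, shortKeys);
console.log(saved.storageBytes);                 // 60372237632, i.e. about 60 GB
console.log(saved.bytesPerDocument.toFixed(2));  // "37.99"
```

<p>So each document shrank by about 38 bytes on average, and multiplying that by 1.6 billion documents is where the 60 GB comes from.</p>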

<p>Of course, you shouldn&#8217;t just go changing all your keys for the hell of it.  You&#8217;ll need to consider how you use your data in applications; who knows, maybe constantly translating between space-saving short keys and human-readable long keys will be too expensive or unwieldy for your particular project.  But if it makes sense, using shorter keys can potentially save a lot of space and even increase performance (after all, more of a smaller database can fit into memory).</p>
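
<p>For what it&#8217;s worth, the translation layer can be fairly small.  Here is a rough sketch in plain JavaScript, assuming the peptide documents shown earlier; <code>KEY_MAP</code>, <code>rename</code>, <code>invert</code>, <code>shorten</code>, and <code>expand</code> are all names invented for this illustration, not part of any MongoDB driver.  Note that the mapping has to be nested, because <code>s</code> means <code>sequence</code> at the top level but <code>strand</code> inside the location sub-document.</p>

```javascript
// Hypothetical mapping from readable names to one-letter keys.
// It is nested because "s" means "sequence" at the top level
// but "strand" inside the location sub-document.
const KEY_MAP = {
  sequence: { to: "s" },
  location: {
    to: "l",
    fields: {
      chromosome: { to: "c" },
      strand: { to: "s" },
      begin: { to: "b" },
      end: { to: "e" }
    }
  }
};

// Rename the keys of doc according to map; unknown keys pass through.
function rename(doc, map) {
  const out = {};
  for (const key of Object.keys(doc)) {
    const rule = map[key];
    if (!rule) {
      out[key] = doc[key];
    } else {
      out[rule.to] = rule.fields ? rename(doc[key], rule.fields) : doc[key];
    }
  }
  return out;
}

// Build the reverse mapping (short name back to long name).
function invert(map) {
  const reversed = {};
  for (const longName of Object.keys(map)) {
    const rule = map[longName];
    reversed[rule.to] = { to: longName };
    if (rule.fields) {
      reversed[rule.to].fields = invert(rule.fields);
    }
  }
  return reversed;
}

const shorten = (doc) => rename(doc, KEY_MAP);
const expand = (doc) => rename(doc, invert(KEY_MAP));

const readable = {
  sequence: "AHAHSPGPGSAVKLPAPHSVGKSALR",
  location: { chromosome: "19", strand: "-", begin: "51067007", end: "51067085" }
};
const stored = shorten(readable);   // what would go into MongoDB
const roundTrip = expand(stored);   // what the application sees again
```

<p>Whether a layer like this is cheap enough depends on how often documents cross that boundary, which is exactly the trade-off to weigh.</p>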

<p>Granted, my documents are very small to begin with, so the amount of space devoted to the keys is relatively large.  The proportional space savings for collections with larger documents probably will not be as great; it all depends on your data.  In any event, it&#8217;s good to be aware of this aspect of schema-free databases.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Not-So-Private Clojure Functions]]></title>
    <link href="http://christophermaier.name/blog/2011/04/30/not-so-private-clojure-functions/"/>
    <updated>2011-04-30T00:00:00-04:00</updated>
    <id>http://christophermaier.name/blog/2011/04/30/not-so-private-clojure-functions</id>
    <content type="html"><![CDATA[<p>If you&#8217;ve been programming in Clojure for longer than, oh, about 5 minutes, you probably already know how <code>defn</code> creates a publicly accessible function in a namespace, while <code>defn-</code> creates a private one.  If you&#8217;re outside the original namespace and you try to call a private function, you will get the smackdown.</p>

<p>Here&#8217;s a simple demonstration.  We&#8217;ll create two functions, one public and one private, in the <code>user</code> namespace:</p>

<div class="highlight"><pre><code class="clojure"><span class="nv">user&gt;</span> <span class="p">(</span><span class="k">defn </span><span class="nv">hello</span> <span class="p">[]</span>
        <span class="s">&quot;Hello World&quot;</span><span class="p">)</span>
<span class="o">#</span><span class="ss">&#39;user/hello</span>
<span class="nv">user&gt;</span> <span class="p">(</span><span class="nf">hello</span><span class="p">)</span>
<span class="s">&quot;Hello World&quot;</span>
<span class="nv">user&gt;</span> <span class="p">(</span><span class="k">defn- </span><span class="nv">secret</span> <span class="p">[]</span>
        <span class="s">&quot;TOP SECRET&quot;</span><span class="p">)</span>
<span class="o">#</span><span class="ss">&#39;user/secret</span>
<span class="nv">user&gt;</span> <span class="p">(</span><span class="nf">secret</span><span class="p">)</span>
<span class="s">&quot;TOP SECRET&quot;</span>
</code></pre>
</div>


<p>If we switch to the <code>other</code> namespace, though, we can only use the public one:</p>

<div class="highlight"><pre><code class="clojure"><span class="nv">user&gt;</span> <span class="p">(</span><span class="nf">ns</span> <span class="nv">other</span><span class="p">)</span>
<span class="nv">nil</span>
<span class="nv">other&gt;</span> <span class="p">(</span><span class="nf">user/hello</span><span class="p">)</span>
<span class="s">&quot;Hello World&quot;</span>
<span class="nv">other&gt;</span> <span class="p">(</span><span class="nf">user/secret</span><span class="p">)</span>
</code></pre>
</div>


<p>Oops!</p>

<pre><code>var: #'user/secret is not public
  [Thrown class java.lang.IllegalStateException]
</code></pre>

<p>However, you <em>can</em> get around the private flag; all you need to do is refer directly to the function&#8217;s var:</p>

<div class="highlight"><pre><code class="clojure"><span class="nv">other&gt;</span> <span class="p">(</span><span class="o">#</span><span class="ss">&#39;user/secret</span><span class="p">)</span>
<span class="s">&quot;TOP SECRET&quot;</span>
</code></pre>
</div>


<p>You can make it a bit easier by creating a var in your new namespace that points to the private one:</p>

<div class="highlight"><pre><code class="clojure"><span class="nv">other&gt;</span> <span class="p">(</span><span class="k">def </span><span class="nv">secret</span> <span class="o">#</span><span class="ss">&#39;user/secret</span><span class="p">)</span>
<span class="o">#</span><span class="ss">&#39;other/secret</span>
<span class="nv">other&gt;</span> <span class="p">(</span><span class="nf">secret</span><span class="p">)</span>
<span class="s">&quot;TOP SECRET&quot;</span>
</code></pre>
</div>


<p>Now why the hell would you ever want to do this?  In general, you probably shouldn&#8217;t, at least with other people&#8217;s code.  Private functions are private for a reason; they&#8217;re not part of any public API, so they could disappear or change at a moment&#8217;s notice.  However, it can come in handy when you&#8217;re testing your own code.  Often, I&#8217;ll have a few private functions that do something useful within a namespace, but really have no business being used anywhere else.  Sometimes when I&#8217;m testing my public functions, though, I&#8217;ll find myself needing these private functions to either set things up, create test data, or otherwise verify that things turned out alright.</p>

<p>You could also create a separate namespace for all your private helper functions (making them public this time), and then only ever pull that namespace into your main and test namespaces (Fogus and Chouser describe this approach in Section 9.1.2 of <a href="http://joyofclojure.com/">The Joy of Clojure</a>; conveniently this chapter is also available as a <a href="http://www.manning.com/fogus/Sample-Ch9.pdf">free download</a>).  If you&#8217;ve only got a handful of these functions, though, this var shadowing trick is pretty straightforward.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Using `YUI_config` to Set Up Custom YUI Modules]]></title>
    <link href="http://christophermaier.name/blog/2011/03/24/using-yui-config-to-set-up-custom-modules/"/>
    <updated>2011-03-24T00:00:00-04:00</updated>
    <id>http://christophermaier.name/blog/2011/03/24/using-yui-config-to-set-up-custom-modules</id>
    <content type="html"><![CDATA[<p>Once you have been using the <a href="http://developer.yahoo.com/yui/3">YUI</a> JavaScript framework for a little while, you&#8217;ll inevitably need to write your own custom modules.  For me, I started creating lots and lots of <a href="http://developer.yahoo.com/yui/3/widget/">widgets</a> and needed to figure out how to properly modularize everything.  The documentation for creating modules using <a href="http://developer.yahoo.com/yui/3/yui/#yuiadd">YUI.add</a> is good, and it got me up and running quickly.  So far, so good.</p>

<p>Armed with a handful of widgets in proper YUI modules, I needed a way to actually register them with my YUI instances in order to use them.  Again, the docs for the <a href="http://developer.yahoo.com/yui/3/yui/#loader">YUI Loader</a> ably pointed the way, with one tiny exception.</p>

<p>The examples shown all look like this:</p>

<div class="highlight"><pre><code class="javascript"><span class="nx">YUI</span><span class="p">({</span>
   <span class="nx">modules</span><span class="o">:</span>  <span class="p">{</span>
       <span class="s2">&quot;foo_widget&quot;</span><span class="o">:</span> <span class="p">{</span>
           <span class="nx">fullpath</span><span class="o">:</span> <span class="s2">&quot;/js/foo_widget.js&quot;</span><span class="p">,</span>
           <span class="nx">requires</span><span class="o">:</span> <span class="p">[</span><span class="s2">&quot;widget&quot;</span><span class="p">]</span>
           <span class="c1">// ... other configuration ...</span>
       <span class="p">},</span>
       <span class="c1">// ... other modules ...</span>
   <span class="p">}</span>
<span class="p">}).</span><span class="nx">use</span><span class="p">(</span><span class="s2">&quot;foo_widget&quot;</span><span class="p">,</span> <span class="kd">function</span><span class="p">(</span><span class="nx">Y</span><span class="p">)</span> <span class="p">{</span>
    <span class="c1">// Do stuff...</span>
<span class="p">});</span>
</code></pre>
</div>


<p>Of course, this works, but it becomes a pain to repeat all my custom module information on each page where they&#8217;re used (for each YUI instance, really).  The documentation doesn&#8217;t readily show (at least, I wasn&#8217;t able to find it) how you can set all this information once for your entire application.</p>

<p>Digging around on the forums, however, turned up the <code>YUI_config</code> object.  Basically, this is the same map that you would otherwise pass to the <code>YUI</code> object when you want to create a new instance.  Sticking it in <code>YUI_config</code> makes it automatically available whenever you create a new instance.  What I&#8217;ve done is create a file that defines a <code>YUI_config</code> object for my application (loading it up with all my module definition information), and I include that in my pages.  It&#8217;s dead simple:</p>

<div class="highlight"><pre><code class="javascript"><span class="nx">YUI_config</span> <span class="o">=</span> <span class="p">{</span>
    <span class="nx">modules</span><span class="o">:</span> <span class="p">{</span>
        <span class="s2">&quot;foo_widget&quot;</span><span class="o">:</span> <span class="p">{</span>
            <span class="nx">fullpath</span><span class="o">:</span> <span class="s2">&quot;/js/foo_widget.js&quot;</span><span class="p">,</span>
            <span class="nx">requires</span><span class="o">:</span> <span class="p">[</span><span class="s2">&quot;widget&quot;</span><span class="p">]</span>
        <span class="p">},</span>
        <span class="c1">// ... etc.</span>
    <span class="p">},</span>
<span class="p">};</span>
</code></pre>
</div>


<p>Now, I can reference <code>foo_widget</code> just like I would any other YUI module:</p>

<div class="highlight"><pre><code class="javascript"><span class="nx">YUI</span><span class="p">().</span><span class="nx">use</span><span class="p">(</span><span class="s2">&quot;foo_widget&quot;</span><span class="p">,</span>
    <span class="kd">function</span><span class="p">(</span><span class="nx">Y</span><span class="p">){</span>
        <span class="c1">// do interesting things with foo_widget here...</span>
    <span class="p">});</span>
</code></pre>
</div>


<p>Problem solved.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[MongoDB Query Tricks, or "Why Don't MongoDB Doesn't Not have $and?"]]></title>
    <link href="http://christophermaier.name/blog/2010/12/21/mongodb-query-tricks/"/>
    <updated>2010-12-21T00:00:00-05:00</updated>
    <id>http://christophermaier.name/blog/2010/12/21/mongodb-query-tricks</id>
    <content type="html"><![CDATA[<p>Pardon the atrocious grammar; there is a point!</p>

<p>I recently had a tricky time formulating a particular query in MongoDB.  As you probably know, MongoDB has a number of <a href="http://www.mongodb.org/display/DOCS/Advanced+Queries">query operators</a> to use.  It&#8217;s got stuff like <code>$in</code>, <code>$nin</code>, <code>$or</code>, and others, but no <code>$and</code>.  Normally, you don&#8217;t need something like <code>$and</code>, since the capability is there implicitly; you just list off all your conditions on your different document fields, and MongoDB finds all the documents that satisfy them all.  However, there <em>is</em> a situation that I&#8217;ve come across lately where something like <code>$and</code> would be very helpful: declaring multiple conditions <em>on a single field</em>.</p>

<p>Let&#8217;s set up some test data to illustrate a scenario:</p>

<pre><code>db.people.save({name: "Xavier", friends: ["Bob","Fred","Sam"]});
db.people.save({name: "Yorick", friends: ["Elmer","Alice"]});
db.people.save({name: "Zelda", friends: ["David","Erica","Walt"]});
</code></pre>

<p>Say I have one group of interesting people:</p>

<pre><code>["Alice","Bob","Charlie"]
</code></pre>

<p>And then I have <em>another</em> group of interesting people:</p>

<pre><code>["David","Erica","Fred"]
</code></pre>

<p>Now I want to find everybody in my database that is friends with at least one person from <em>each</em> of these groups.  That is to say, I would want to find Xavier, but not Yorick or Zelda.  I want to specify two conditions on my &#8220;friends&#8221; field.  (This isn&#8217;t the real situation I was dealing with, but I&#8217;ll spare you the scientific background.)</p>

<p>You might think, &#8220;That&#8217;s easy, you just use the <code>$in</code> query operator!&#8221;.  Well, that&#8217;s what I thought.  There&#8217;s a problem with that, though.  Conceptually, you want a query like this, to take advantage of MongoDB&#8217;s implicit <code>$and</code> for conditions:</p>

<pre><code>db.people.find({
    friends: {$in: ["Alice","Bob","Charlie"]},
    friends: {$in: ["David","Erica","Fred"]}
});
</code></pre>

<p>However, this is going to find Xavier and Zelda!</p>

<pre><code>{ "_id" : ObjectId("4d11deb1a95769443d8dd7c4"),
  "name" : "Xavier",
  "friends" : [ "Bob", "Fred", "Sam" ] }
{ "_id" : ObjectId("4d11deb2a95769443d8dd7c6"),
  "name" : "Zelda",
  "friends" : [ "David", "Erica", "Walt" ] }
</code></pre>

<p>You&#8217;ve duplicated your keys!  The implicit <code>$and</code> only really works on different fields; you can&#8217;t combine conditions on the same field this way.  Here you&#8217;re actually only looking for friends of David, Erica, or Fred instead of finding people that could bridge these two cliques.  It seems that the last condition declared &#8220;wins&#8221;.  Note that we didn&#8217;t find Yorick in that last query; he&#8217;s not friends with David, Erica, or Fred.</p>
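
<p>Most likely this isn&#8217;t MongoDB&#8217;s doing at all.  The query document is an ordinary JavaScript object literal, and a repeated key in a literal simply overwrites the earlier value, so only one condition is left by the time the query is built.  A quick sketch in plain JavaScript (run in Node, no database involved) shows the effect:</p>

```javascript
// The same literal as in the query above, built outside of MongoDB.
const query = {
  friends: { $in: ["Alice", "Bob", "Charlie"] },
  friends: { $in: ["David", "Erica", "Fred"] }
};

console.log(Object.keys(query));   // [ 'friends' ]
console.log(query.friends.$in);    // [ 'David', 'Erica', 'Fred' ]
```

<p>The first condition is gone before anything could be sent to the server.</p>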

<p>My next thought was to try <code>$all</code>:</p>

<pre><code>db.people.find({
    friends: {
        $all: [
            {$in: ["Alice","Bob","Charlie"]},
            {$in: ["David","Erica","Fred"]}
        ]
    }
});
</code></pre>

<p>That doesn&#8217;t work either; it looks like <code>$all</code> only accepts a list of values, not additional conditions.</p>

<p>So how do you ask this query?  You need to be a little more tricky in your formulation.  After a quick fling with a truth table, here&#8217;s my solution:</p>

<pre><code>db.people.find({
    $nor: [
        {friends: {$not: {$in: ["Alice","Bob","Charlie"]}}},
        {friends: {$not: {$in: ["David","Erica","Fred"]}}}
    ]
});
</code></pre>

<p>That gives you what you want:</p>

<pre><code>{ "_id" : ObjectId("4d11deb1a95769443d8dd7c4"),
  "name" : "Xavier",
  "friends" : [ "Bob", "Fred", "Sam" ] }
</code></pre>

<p>It looks confusing at first, but just step through it.  You are saying, &#8220;find everybody that <strong>isn&#8217;t not</strong> friends with Alice, Bob, or Charlie, <em>and</em> <strong>isn&#8217;t not</strong> friends with David, Erica, or Fred&#8221;.  The <code>$nor</code> gives you the <code>$and</code> capabilities (albeit negated), and the <code>$not</code>s reverse the meanings of your tests to be compatible with <code>$nor</code>.  It would be easier if MongoDB actually had an <code>$and</code> operator, but this will do in a pinch.</p>
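
<p>If the double negative still feels slippery, the logic can be checked outside the database.  The following is only a simulation in plain JavaScript of what the operators mean for this test data, not how MongoDB evaluates queries; <code>inGroup</code> and <code>nor</code> are stand-in helpers invented for the sketch.</p>

```javascript
const people = [
  { name: "Xavier", friends: ["Bob", "Fred", "Sam"] },
  { name: "Yorick", friends: ["Elmer", "Alice"] },
  { name: "Zelda",  friends: ["David", "Erica", "Walt"] }
];
const groupA = ["Alice", "Bob", "Charlie"];
const groupB = ["David", "Erica", "Fred"];

// Stand-in for {friends: {$in: group}} on an array field:
// true when at least one friend is in the group.
const inGroup = (person, group) =>
  person.friends.some((friend) => group.includes(friend));

// Stand-in for $nor: true when none of the conditions hold.
const nor = (...conditions) => !conditions.some(Boolean);

// $nor of two $not conditions, as in the query above.
const matches = people
  .filter((p) => nor(!inGroup(p, groupA), !inGroup(p, groupB)))
  .map((p) => p.name);

// The same thing written as a plain "and", for comparison.
const expected = people
  .filter((p) => [groupA, groupB].every((g) => inGroup(p, g)))
  .map((p) => p.name);

console.log(matches);   // [ 'Xavier' ]
console.log(expected);  // [ 'Xavier' ]
```

<p>Both formulations pick out the same people, which is just De Morgan&#8217;s law at work: &#8220;neither not-A nor not-B&#8221; is the same as &#8220;A and B&#8221;.</p>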

<p>Since discovering this trick, I&#8217;ve actually had to use it a number of times, particularly when dealing with array fields that contain objects instead of plain values.</p>

<p>Double negatives can be handy, no matter what your middle school English teacher says.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Using MapReduce in Congomongo Now]]></title>
    <link href="http://christophermaier.name/blog/2010/10/17/using-mapreduce-in-congomongo-now/"/>
    <updated>2010-10-17T00:00:00-04:00</updated>
    <id>http://christophermaier.name/blog/2010/10/17/using-mapreduce-in-congomongo-now</id>
    <content type="html"><![CDATA[<p>My patch to Congomongo that adds support for running MapReduce jobs in MongoDB <a href="http://christophermaier.name/2010/08/23/patches-accepted.html">was accepted</a>, but that code hasn&#8217;t yet been pushed out to the official Congomongo SNAPSHOT jar in Clojars.</p>

<p>Until that happens, if you&#8217;d like to use the new MapReduce code, you can use <a href="http://clojars.org/org.clojars.christophermaier/congomongo">my version</a>.</p>

<p>If you&#8217;re using Leiningen, add this to your <code>project.clj</code>:</p>

<pre><code>[org.clojars.christophermaier/congomongo "0.1.3-SNAPSHOT"]
</code></pre>

<p>And here&#8217;s the dependency information for Maven:</p>

<pre><code>&lt;dependency&gt;
  &lt;groupId&gt;org.clojars.christophermaier&lt;/groupId&gt;
  &lt;artifactId&gt;congomongo&lt;/artifactId&gt;
  &lt;version&gt;0.1.3-SNAPSHOT&lt;/version&gt;
&lt;/dependency&gt;
</code></pre>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Using Idiomatic Clojure, Part 1 - comp]]></title>
    <link href="http://christophermaier.name/blog/2010/10/12/using-idiomatic-clojure-part-1/"/>
    <updated>2010-10-12T00:00:00-04:00</updated>
    <id>http://christophermaier.name/blog/2010/10/12/using-idiomatic-clojure-part-1</id>
    <content type="html"><![CDATA[<p>As I read the Clojure code of others, I come across better ways to write my own code.  Today&#8217;s example comes from
<a href="http://manning.com/fogus/">The Joy of Clojure</a> by Michael Fogus and Chris Houser.</p>

<p>I often find myself writing anonymous functions along the lines of</p>

<div class="highlight"><pre><code class="clojure"><span class="o">#</span><span class="p">(</span><span class="nv">not</span> <span class="p">(</span><span class="nb">vector? </span><span class="nv">%</span><span class="p">))</span>
</code></pre>
</div>


<p>to act as filters in various places (<code>filter</code>, <code>for</code>, <code>take-while</code>, etc.).  I always thought it looked a bit gnarly like that.  Fortunately, there is a better way, using the <code>comp</code> function.</p>

<p>According to the <a href="http://clojure.github.com/clojure/clojure.core-api.html#clojure.core/comp">documentation string</a>, <code>comp</code> takes a number of functions and returns a new function that is the composition of all of them.  I&#8217;ve used <code>comp</code> a few other places before, but for some reason, it didn&#8217;t &#8220;click&#8221; that I could use it in this situation, too.  With it, the above code transforms into the much cleaner-looking</p>

<div class="highlight"><pre><code class="clojure"><span class="p">(</span><span class="nb">comp </span><span class="nv">not</span> <span class="nv">vector?</span><span class="p">)</span>
</code></pre>
</div>


<p>Looks much better without the anonymous function trappings, yes?</p>

<p><em>Update</em>: I just came back to this post after a long time&#8230; Now I probably wouldn&#8217;t even use <code>comp</code> here, opting instead for</p>

<div class="highlight"><pre><code class="clojure"><span class="p">(</span><span class="nb">complement </span><span class="nv">vector?</span><span class="p">)</span>
</code></pre>
</div>


<p>It keeps getting shorter!</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[My iPhone Plays MP3s Again]]></title>
    <link href="http://christophermaier.name/blog/2010/09/24/iPhone-plays-MP3s-again/"/>
    <updated>2010-09-24T00:00:00-04:00</updated>
    <id>http://christophermaier.name/blog/2010/09/24/iPhone-plays-MP3s-again</id>
    <content type="html"><![CDATA[<p>I recently upgraded my old iPhone 3G to iOS 4.1; I&#8217;d heard the horror stories about the abysmal performance that iOS 4.0 had with the older phone, so I waited until Apple had the fix.  Everything went well with the upgrade; I think my phone might actually be a little <em>faster</em>.  But then I tried playing some music.</p>

<p>A large portion of my library just wouldn&#8217;t play.  I&#8217;d select a song and the iPod app would try to play it for a second or two, and then it would move on to the next song.  Sometimes it would skip through a whole string of songs like this before actually playing one.  It was maddening.</p>

<p>After a bit of testing, it appeared that only my MP3 files were not playing.  Strangely enough, my MP3 podcasts would play fine; only the music files were affected.  Songs encoded with AAC played fine.  The weird thing is that only the songs that are MP3 <em>on my computer</em> would not play, since on my iPhone they are all AACs (I have everything converted to 128kbps AAC as it loads onto the phone to squeeze every last byte out of the hard drive).  I know, none of this really makes any sense.</p>

<p>Anyway, the fix appears to be removing all the music from the iPhone and then re-loading it.  It takes a while, but I can finally listen to my music again.  If you&#8217;re having the same problem, give this a try.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Host-Only Networking with VirtualBox]]></title>
    <link href="http://christophermaier.name/blog/2010/09/01/host-only-networking-with-virtualbox/"/>
    <updated>2010-09-01T00:00:00-04:00</updated>
    <id>http://christophermaier.name/blog/2010/09/01/host-only-networking-with-virtualbox</id>
    <content type="html"><![CDATA[<p>I&#8217;ve been playing around with <a href="http://www.virtualbox.org">VirtualBox</a> lately, and it&#8217;s pretty nifty.  Initially I just used it to set up an Ubuntu machine so I could easily run <a href="http://www.gnucash.org">GnuCash</a> on my MacBook, but lately I&#8217;ve decided to start using it at work, too.  I&#8217;ve got lots of plans: figure out how to use <a href="http://wiki.opscode.com/display/chef/Home">Chef</a>, set up my own <a href="http://hudson-ci.org/">Hudson</a> server, get a <a href="http://www.mongodb.org">MongoDB</a> cluster running, etc.  I&#8217;d like all of these machines to be able to access the internet for downloading packages and the like, and I also want to be able to access them easily from my host machine.  I <em>don&#8217;t</em> want these servers to be accessible from anywhere else on the network, however; these are just for me to experiment with for the time being.  After banging my head on this for a little while, I think I&#8217;ve finally hit upon a (the?) solution.</p>

<p>Just for completeness&#8217; sake, my host is Mac OS X and my guests are all Ubuntu boxes, so any system-specific instructions here are going to have a UNIX orientation.  You have been warned.</p>

<h2>Step 1: Access the Internet from the Guest</h2>

<p>Set up one of your guest machine&#8217;s networking adapters to NAT.  This is really easy, since it&#8217;s the default.</p>

<p><img src="http://christophermaier.name/images/posts/2010-08-31-networking-with-virtualbox/nat-adapter.png" title="[Adding a NAT Adapter]" ></p>

<p>This will allow the guest system to access the broader internet through your host&#8217;s connection.  You&#8217;ll be able to download packages, check email; whatever.  Nobody outside sees anything of your guest system; as far as they&#8217;re concerned, it doesn&#8217;t exist.  However, you cannot access any guest resources from your host machine, nor can any guest machines access each other.  Yet.</p>

<h2>Step 2: Access the Guest from the Host (and Other Guests)</h2>

<p>We&#8217;ll need to add another network adapter to your guest machine, but this time, it&#8217;ll be a <em>Host-Only Adapter</em>.  By using this type of adapter, you&#8217;ll be able to access a private, virtual network consisting solely of your host and any guests.  Any of the member machines can access each other, but nothing outside of this self-contained &#8220;network in a box&#8221; can get in.</p>

<p><img src="http://christophermaier.name/images/posts/2010-08-31-networking-with-virtualbox/host-only-adapter.png" title="[Adding a Host-Only Adapter]" ></p>

<p>VirtualBox can create several of these virtual host-only networks (it&#8217;s what the &#8220;Name&#8221; field refers to in the Network Adapter Setup screen above).  You can configure these in the VirtualBox Preferences; there should be one already created for you called <em>vboxnet0 network</em>.</p>

<p><img src="http://christophermaier.name/images/posts/2010-08-31-networking-with-virtualbox/network-preferences.png" title="[VirtualBox Network Preferences]" ></p>

<p>Click on the &#8220;Edit&#8221; button for your Host-only network; you&#8217;ll see a dialog like this:</p>

<p><img src="http://christophermaier.name/images/posts/2010-08-31-networking-with-virtualbox/vboxnet0-adapter.png" title="[vboxnet0 Adapter]" ></p>

<p>Take note of the adapter&#8217;s IP address (<code>192.168.56.1</code> here); that&#8217;s the address at which your guests can access the host.</p>

<p><img src="http://christophermaier.name/images/posts/2010-08-31-networking-with-virtualbox/vboxnet0-dhcp.png" title="[vboxnet0 DHCP]" ></p>

<p>By default, there&#8217;s a DHCP server set up on the network.  Since we&#8217;re going to be assigning static IP addresses, we don&#8217;t really need this, so you can uncheck the &#8220;Enable Server&#8221; box on the DHCP configuration panel.</p>

<h2>Step 3: Configure Guests</h2>

<p>We need each of the guests to have a static IP address on the host-only network.  Log in to your Ubuntu guest and issue the following command:</p>

<pre><code>ifconfig eth1 192.168.56.101 netmask 255.255.255.0 up
</code></pre>

<p>(Use whatever IP on your host-only network you like, of course.)  This binds a network interface to one of the IP addresses on your host-only network (eth0 was bound to the NAT adapter).  Now you should be able to SSH into your guest from your host (for example) using this IP address.</p>

<p>This is just temporary, however; once you reboot, this configuration will disappear.  To make it permanent, add this to your <code>/etc/network/interfaces</code> file (as <code>root</code>):</p>

<pre><code># The host-only network interface
auto eth1
iface eth1 inet static
address 192.168.56.101
netmask 255.255.255.0
network 192.168.56.0
broadcast 192.168.56.255
</code></pre>
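
<p>If you pick a different address or netmask, the <code>network</code> and <code>broadcast</code> lines need to change to match.  They follow directly from the other two values; here is a small sketch in plain JavaScript (run in Node) of that calculation.  The function names are made up for the example, and it assumes well-formed dotted-quad IPv4 strings.</p>

```javascript
// Split a dotted-quad string into four numeric octets.
const octets = (dotted) => dotted.split(".").map(Number);

// Network address: keep only the bits covered by the netmask.
function networkAddress(address, netmask) {
  const mask = octets(netmask);
  return octets(address).map((octet, i) => octet & mask[i]).join(".");
}

// Broadcast address: set every bit NOT covered by the netmask.
function broadcastAddress(address, netmask) {
  const mask = octets(netmask);
  return octets(address).map((octet, i) => octet | (255 ^ mask[i])).join(".");
}

console.log(networkAddress("192.168.56.101", "255.255.255.0"));    // 192.168.56.0
console.log(broadcastAddress("192.168.56.101", "255.255.255.0"));  // 192.168.56.255
```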

<p>Reboot, and this interface should now show up when you type <code>ifconfig</code>.</p>

<h2>Step 4: Make Networking Easier with <code>/etc/hosts</code></h2>

<p>Remembering IP addresses is a pain; we&#8217;d much rather use machine names.  Fortunately, we don&#8217;t need to bother with a DNS server, since <code>/etc/hosts</code> makes this trivial.  Just edit the file (as <code>root</code>) adding lines like the following:</p>

<pre><code>192.168.56.101    myserver1
192.168.56.102    myserver2
</code></pre>

<p>&#8230; and so on.  You can do this on the host as well as on the guests.  This makes it really simple to access any machine on the host-only network.  Now you can just do something like <code>ssh myserver1</code> instead of <code>ssh 192.168.56.101</code>.</p>

<p>And that ought to do it.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Patches Accepted!]]></title>
    <link href="http://christophermaier.name/blog/2010/08/23/patches-accepted/"/>
    <updated>2010-08-23T00:00:00-04:00</updated>
    <id>http://christophermaier.name/blog/2010/08/23/patches-accepted</id>
    <content type="html"><![CDATA[<p>I just heard that <a href="http://github.com/somnium/congomongo/commit/df433fc11ab76c48dcfe8fa77c4bf19227161a92">my map-reduce patch</a> and <a href="http://github.com/somnium/congomongo/commit/c1e1417cb729eaa8e5642127e1682883e7d5968e">my GridFS InputStream patch</a> for Congomongo were both committed to the main Congomongo repository!  This marks my first official open source code contributions, and hopefully not the last.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[The Path to MapReduce with Congomongo]]></title>
    <link href="http://christophermaier.name/blog/2010/08/21/the-path-to-map-reduce-with-congomongo/"/>
    <updated>2010-08-21T00:00:00-04:00</updated>
    <id>http://christophermaier.name/blog/2010/08/21/the-path-to-map-reduce-with-congomongo</id>
    <content type="html"><![CDATA[<p>I&#8217;ve recently started a Clojure / <a href="http://www.mongodb.org">MongoDB</a> project at work to help us with our proteogenomic annotation work.  Naturally, I&#8217;m using <a href="http://www.github.com/somnium/congomongo">Congomongo</a> to interact with the database.  It&#8217;s a great wrapper for the <a href="http://github.com/mongodb/mongo-java-driver">MongoDB Java driver</a>, written in a very nice functional style.</p>

<p>Lately I&#8217;ve been looking into the <a href="http://www.mongodb.org/display/DOCS/MapReduce">map-reduce</a> capabilities of MongoDB and have been trying to figure out how to make it work from Clojure.  Looking at the Congomongo API, I came across the <a href="http://github.com/somnium/congomongo/blob/6fc8345a35fa1aa1ba27efa76a4363265b67cad2/src/somnium/congomongo.clj#L316">server-eval</a> function, which looked like a promising place to start.</p>

<p>I decided to kick the tires a bit:</p>

<div class="highlight"><pre><code class="clojure"><span class="nv">user&gt;</span> <span class="p">(</span><span class="nf">use</span> <span class="ss">&#39;somnium</span><span class="o">.</span><span class="nv">congomongo</span><span class="p">)</span>
<span class="nv">nil</span>
<span class="nv">user&gt;</span> <span class="p">(</span><span class="nf">server-eval</span> <span class="s">&quot;function(){return 3+3}&quot;</span><span class="p">)</span>
<span class="mf">6.0</span>
</code></pre>
</div>


<p>So far, so good.  <code>server-eval</code> takes a string of JavaScript code defining a function with no arguments.  This gets sent over to MongoDB, where it gets evaluated and run.</p>

<p>Under the hood, Congomongo is passing off to the MongoDB Java driver&#8217;s <a href="http://github.com/mongodb/mongo-java-driver/blob/r2.0/src/main/com/mongodb/DB.java#L145">com.mongodb.DB.doEval</a> method, which effectively runs this command (as typed into the MongoDB JavaScript console):</p>

<div class="highlight"><pre><code class="javascript"><span class="o">&gt;</span> <span class="nx">db</span><span class="p">.</span><span class="nx">$cmd</span><span class="p">.</span><span class="nx">findOne</span><span class="p">({</span><span class="nx">$eval</span><span class="o">:</span><span class="s2">&quot;function(){return 3+3}&quot;</span><span class="p">})</span>
<span class="p">{</span> <span class="s2">&quot;retval&quot;</span> <span class="o">:</span> <span class="mi">6</span><span class="p">,</span> <span class="s2">&quot;ok&quot;</span> <span class="o">:</span> <span class="mi">1</span> <span class="p">}</span>
</code></pre>
</div>


<p>It&#8217;s calling the special <a href="http://www.mongodb.org/display/DOCS/Server-side+Code+Execution">eval</a> command in MongoDB and passing the result back.  Check out the <a href="http://www.mongodb.org/display/DOCS/Commands">MongoDB Command Documentation</a> as well as the <a href="http://www.mongodb.org/display/DOCS/List+of+Database+Commands">List of Database Commands</a> for more on how this stuff works out.</p>

<p>That&#8217;s all well and good, but it doesn&#8217;t actually help for kicking off a map-reduce job from Clojure.  As the MongoDB documentation <a href="http://www.mongodb.org/display/DOCS/Server-side+Code+Execution#Server-sideCodeExecution-Using%7B%7Bdb.eval%28%29%7D%7D">says</a>:</p>

<blockquote><p>Use map/reduce instead of db.eval() for long running jobs. db.eval blocks other operations!</p></blockquote>

<p>That&#8217;s a bummer.  The only facility Congomongo currently provides for executing code server-side is the aforementioned <code>server-eval</code> function, which only uses the MongoDB <code>eval</code> command; <code>mapReduce</code> is a separate command.  Fortunately, it&#8217;s pretty straightforward to add support for map-reduce in Congomongo.  Although we could easily use <code>com.mongodb.DB.doEval</code> to perform our map-reduce job, the Java driver helpfully provides <a href="http://github.com/mongodb/mongo-java-driver/blob/r2.0/src/main/com/mongodb/DBCollection.java#L613">com.mongodb.DBCollection.mapReduce</a>, which adds a little bit of sugar for such things.  Studying the code for some other Congomongo functions leads to this solution:</p>

<p><a href="http://github.com/christophermaier/congomongo/commit/df433fc11ab76c48dcfe8fa77c4bf19227161a92">My Congomongo fork, now with map-reduce!</a></p>

<p>The nice thing about this function is that it fully exposes all the capabilities of the native MongoDB map-reduce framework.  Want to add a finalize function?  No problem!  Want sorted or limited query results?  Done!  Want results or just the collection?  You got it.  There&#8217;s lots of documentation for how it all works; the test cases will help, too.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Easy Clojure API Search with Google Chrome]]></title>
    <link href="http://christophermaier.name/blog/2010/08/08/easy-clojure-api-search/"/>
    <updated>2010-08-08T00:00:00-04:00</updated>
    <id>http://christophermaier.name/blog/2010/08/08/easy-clojure-api-search</id>
    <content type="html"><![CDATA[<p>Do you need an easy way to search <a href="http://www.clojure.org">Clojure</a>&#8217;s API?  Do you use Google&#8217;s <a href="http://www.google.com/chrome/">Chrome</a> browser?  If so, you&#8217;re in luck; it&#8217;s incredibly easy to add a custom engine to Chrome to search the API.  Here&#8217;s how.</p>

<h2>Setting Up a Custom Search Engine</h2>

<p>First, open your preferences and hit the &#8220;Manage&#8221; button down by &#8220;Default Search&#8221;.</p>

<p><img src="http://christophermaier.name/images/posts/2010-08-08-easy-clojure-api-search/chrome-preferences.png" title="[Google Chrome Preferences]" ></p>

<p>That will bring up a Search Engines box.  This is how Chrome knows to search, say, Google for whatever you type in the Address Bar (or <em>OmniBox</em>, as it&#8217;s also known).</p>

<p><img src="http://christophermaier.name/images/posts/2010-08-08-easy-clojure-api-search/chrome-search-engines.png" title="[Google Chrome Search Engines]" ></p>

<p>Hit the &#8220;+&#8221; button to set up a new search engine.  The resulting dialog asks you for a &#8220;Name&#8221;, a &#8220;Keyword&#8221;, and a &#8220;URL&#8221;.  Name and Keyword can be whatever you want.  You should probably keep Keyword short and pithy, however, as this is what you&#8217;ll be typing in the Address Bar to tell Chrome which Search Engine to use (I chose &#8220;clj&#8221;).  Finally, the URL should be a parameterized version of the site you want to search.  For example, when you search Google for &#8220;cheese&#8221;, the URL you end up at is this: <a href="http://www.google.com/search?q=cheese">http://www.google.com/search?q=cheese</a>.  To parameterize this for Chrome, just remove the search term and replace it with &#8220;%s&#8221; (without quotes).  Thus, the Google search URL would become <code>http://www.google.com/search?q=%s</code>.  Take a look at the other Search Engines you have for more examples.</p>

<p>The <a href="http://richhickey.github.com/clojure/clojure.core-api.html">Clojure API</a> isn&#8217;t a search engine (it&#8217;s just a big HTML page), but that doesn&#8217;t mean we&#8217;re out of luck.  Documentation entries for each function are accessible through HTML anchors; we&#8217;ll simply parameterize the anchors!  Here is the parameterized URL you&#8217;ll need to search the core Clojure API with Chrome:</p>

<pre><code>http://richhickey.github.com/clojure/clojure.core-api.html#clojure.core/%s
</code></pre>
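
<p>To make the substitution concrete, here is a rough shell sketch of filling in such a template.  The <code>search_url</code> helper is a made-up name for illustration only, and it does no URL-encoding of the search term:</p>

<pre><code># Hypothetical helper: fill the first "%s" in a URL template with a search term.
# Usage: search_url TEMPLATE TERM
search_url() {
  case "$1" in
    *'%s'*) ;;          # template has a placeholder, carry on
    *) return 1 ;;      # no placeholder, nothing to substitute
  esac
  prefix=${1%%'%s'*}    # everything before the first "%s"
  suffix=${1#*'%s'}     # everything after the first "%s"
  printf '%s\n' "${prefix}$2${suffix}"
}
</code></pre>

<p>Calling <code>search_url 'http://richhickey.github.com/clojure/clojure.core-api.html#clojure.core/%s' reduce</code> prints the URL with <code>reduce</code> in place of the placeholder, which is the anchor for that function&#8217;s documentation.</p>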

<p><img src="http://christophermaier.name/images/posts/2010-08-08-easy-clojure-api-search/add-clojure-search-engine.png" title="[Adding a Clojure API Search Engine]" ></p>

<p>When your new Search Engine is all set up, hit &#8220;OK&#8221;.</p>

<h2>Using a Custom Search Engine</h2>

<p>Now for the fun.  Go to the Address Bar and type your Search Engine Keyword &#8220;clj&#8221;, a space, and the Clojure function you need to know about.  Once you&#8217;ve hit the space after the keyword, Chrome will expand it to the name of your Search Engine, indicating that you&#8217;re no longer using your default Search Engine.  Hit Enter and you&#8217;re taken directly to the API docs for your function.</p>

<p><img src="http://christophermaier.name/images/posts/2010-08-08-easy-clojure-api-search/search-clojure-api.png" title="[Search the Clojure API]" ></p>

<p>Neat, eh?</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Comparing Settings From Different PostgreSQL Databases]]></title>
    <link href="http://christophermaier.name/blog/2010/04/27/compare-postgresql-configurations/"/>
    <updated>2010-04-27T00:00:00-04:00</updated>
    <id>http://christophermaier.name/blog/2010/04/27/compare-postgresql-configurations</id>
    <content type="html"><![CDATA[<p>I&#8217;m in the process of migrating from an older version of PostgreSQL to a newer version.  I&#8217;d like to see what the differences are between the configuration (<code>${PG_DATA}/postgresql.conf</code>) of both servers.  I couldn&#8217;t find an easy, ready-made solution, so I hacked up one using plain old SQL, which turns out to be particularly well suited to comparing sets of data :)</p>

<p>First, get the settings from the old server.  We&#8217;ll use <a href="http://www.postgresql.org/docs/current/interactive/app-psql.html"><code>psql</code></a> to execute the <a href="http://www.postgresql.org/docs/current/interactive/sql-show.html"><code>SHOW ALL</code></a> query and pipe the result (stripped of all extraneous formatting) to the file <code>old_settings.txt</code>.  (I&#8217;m using the long versions of the command flags, as well as adding in lots of newlines, to aid in readability and comprehensibility.)</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class='code'><pre><code class='bash'><span class='line'>psql --username postgres <span class="se">\</span>
</span><span class='line'>     --dbname postgres <span class="se">\</span>
</span><span class='line'>     --host OLD_SERVER_ADDRESS <span class="se">\</span>
</span><span class='line'>     --port OLD_SERVER_PORT <span class="se">\</span>
</span><span class='line'>     --output old_settings.txt <span class="se">\</span>
</span><span class='line'>     --no-align <span class="se">\</span>
</span><span class='line'>     --quiet <span class="se">\</span>
</span><span class='line'>     --tuples-only <span class="se">\</span>
</span><span class='line'>     --command <span class="s1">&#39;show all&#39;</span>
</span></code></pre></td></tr></table></div></figure>


<p>Now, we&#8217;ll need the settings from the new server.  We use the same trick, but pipe the output to the <code>new_settings.txt</code> file, instead.</p>

<div class="highlight"><pre><code class="bash">psql --username postgres <span class="se">\</span>
     --dbname postgres <span class="se">\</span>
     --host NEW_SERVER_ADDRESS <span class="se">\</span>
     --port NEW_SERVER_PORT <span class="se">\</span>
     --output new_settings.txt <span class="se">\</span>
     --no-align <span class="se">\</span>
     --quiet <span class="se">\</span>
     --tuples-only <span class="se">\</span>
     --command <span class="s1">&#39;show all&#39;</span>
</code></pre>
</div>
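
<p>Since the two invocations differ only in host, port, and output file, they could be wrapped in a small shell function.  What follows is a minimal sketch; <code>dump_settings</code> is a made-up name, and the flags are exactly the ones used above:</p>

<pre><code># Hypothetical helper: dump the output of SHOW ALL from one server into a file.
# Usage: dump_settings HOST PORT OUTPUT_FILE
dump_settings() {
  psql --username postgres \
       --dbname postgres \
       --host "$1" \
       --port "$2" \
       --output "$3" \
       --no-align \
       --quiet \
       --tuples-only \
       --command 'show all'
}
</code></pre>

<p>With that in place, <code>dump_settings OLD_SERVER_ADDRESS OLD_SERVER_PORT old_settings.txt</code> and <code>dump_settings NEW_SERVER_ADDRESS NEW_SERVER_PORT new_settings.txt</code> produce the same two files.</p>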


<p>So now we have the data in a format that is easily loaded into a PostgreSQL database!  On some other database, we create some simple tables to hold the information; their format is that of the output of the <code>SHOW ALL</code> command.</p>

<div class="highlight"><pre><code class="sql"><span class="k">CREATE</span> <span class="k">TABLE</span> <span class="n">old_server</span><span class="p">(</span><span class="k">parameter</span> <span class="nb">TEXT</span><span class="p">,</span> <span class="n">value</span> <span class="nb">TEXT</span><span class="p">,</span> <span class="n">description</span> <span class="nb">TEXT</span><span class="p">);</span>
<span class="k">CREATE</span> <span class="k">TABLE</span> <span class="n">new_server</span><span class="p">(</span><span class="k">parameter</span> <span class="nb">TEXT</span><span class="p">,</span> <span class="n">value</span> <span class="nb">TEXT</span><span class="p">,</span> <span class="n">description</span> <span class="nb">TEXT</span><span class="p">);</span>
</code></pre>
</div>


<p>Now copy the information into the tables using the <code>\copy</code> command:</p>

<div class="highlight"><pre><code class="sql"><span class="err">\</span><span class="k">copy</span> <span class="n">old_server</span> <span class="k">from</span> <span class="p">.</span><span class="o">/</span><span class="n">old_settings</span><span class="p">.</span><span class="n">txt</span> <span class="k">delimiter</span> <span class="k">as</span> <span class="s1">&#39;|&#39;</span>
<span class="err">\</span><span class="k">copy</span> <span class="n">new_server</span> <span class="k">from</span> <span class="p">.</span><span class="o">/</span><span class="n">new_settings</span><span class="p">.</span><span class="n">txt</span> <span class="k">delimiter</span> <span class="k">as</span> <span class="s1">&#39;|&#39;</span>
</code></pre>
</div>


<p>We&#8217;ll create a view to massage this data into a nice report in order to more easily see what&#8217;s different:</p>

<div class="highlight"><pre><code class="sql"><span class="k">CREATE</span> <span class="k">VIEW</span> <span class="n">configurations</span> <span class="k">AS</span>
<span class="k">SELECT</span>
    <span class="n">COALESCE</span><span class="p">(</span><span class="n">newer</span><span class="p">.</span><span class="k">parameter</span><span class="p">,</span> <span class="n">older</span><span class="p">.</span><span class="k">parameter</span><span class="p">)</span> <span class="k">AS</span> <span class="k">parameter</span><span class="p">,</span>
    <span class="n">older</span><span class="p">.</span><span class="n">value</span> <span class="k">AS</span> <span class="n">original_value</span><span class="p">,</span>
    <span class="n">newer</span><span class="p">.</span><span class="n">value</span> <span class="k">AS</span> <span class="n">current_value</span><span class="p">,</span>
    <span class="p">(</span><span class="n">older</span><span class="p">.</span><span class="n">value</span> <span class="o">!=</span> <span class="n">newer</span><span class="p">.</span><span class="n">value</span><span class="p">)</span> <span class="k">AS</span> <span class="ss">&quot;different?&quot;</span><span class="p">,</span>
    <span class="p">(</span><span class="n">newer</span><span class="p">.</span><span class="k">parameter</span> <span class="k">IS</span> <span class="k">NULL</span><span class="p">)</span> <span class="k">AS</span> <span class="ss">&quot;removed?&quot;</span><span class="p">,</span>
    <span class="p">(</span><span class="n">older</span><span class="p">.</span><span class="k">parameter</span> <span class="k">IS</span> <span class="k">NULL</span><span class="p">)</span> <span class="k">AS</span> <span class="ss">&quot;new?&quot;</span>
<span class="k">FROM</span> <span class="n">new_server</span> <span class="k">AS</span> <span class="n">newer</span>
<span class="k">FULL</span> <span class="k">OUTER</span> <span class="k">JOIN</span> <span class="n">old_server</span> <span class="k">AS</span> <span class="n">older</span>  <span class="c1">-- keep parameters that exist on only one of the two servers</span>
    <span class="k">ON</span> <span class="n">newer</span><span class="p">.</span><span class="k">parameter</span> <span class="o">=</span> <span class="n">older</span><span class="p">.</span><span class="k">parameter</span>
<span class="p">;</span>
</code></pre>
</div>


<p>Finally, we can execute some simple queries on this view to show us what&#8217;s going on:</p>

<div class="highlight"><pre><code class="sql"><span class="c1">-- What parameters are different?</span>
<span class="k">SELECT</span> <span class="k">parameter</span><span class="p">,</span> <span class="n">original_value</span><span class="p">,</span> <span class="n">current_value</span> <span class="k">FROM</span> <span class="n">configurations</span> <span class="k">WHERE</span> <span class="ss">&quot;different?&quot;</span><span class="p">;</span>

<span class="c1">-- What are the values of the parameters that are not present in the original configuration?</span>
<span class="k">SELECT</span> <span class="k">parameter</span><span class="p">,</span> <span class="n">current_value</span> <span class="k">FROM</span> <span class="n">configurations</span> <span class="k">WHERE</span> <span class="ss">&quot;new?&quot;</span><span class="p">;</span>

<span class="c1">-- What are the values of the parameters that have been removed since the original database version?</span>
<span class="k">SELECT</span> <span class="k">parameter</span><span class="p">,</span> <span class="n">original_value</span> <span class="k">FROM</span> <span class="n">configurations</span> <span class="k">WHERE</span> <span class="ss">&quot;removed?&quot;</span><span class="p">;</span>
</code></pre>
</div>
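
<p>As a rough cross-check that needs no scratch database, the same three questions can be asked of the two text files directly.  This is only a sketch: <code>compare_settings</code> is a made-up helper name, and it assumes that neither parameter names nor values contain the <code>|</code> delimiter:</p>

<pre><code># Hypothetical helper: compare two "parameter|value|description" files
# and print one tab-separated line per parameter that is different, new, or removed.
# Usage: compare_settings OLD_FILE NEW_FILE
compare_settings() {
  awk -F'|' '
    # First file: remember each old value by parameter name.
    NR == FNR { old[$1] = $2; next }
    # Second file: classify each parameter against the old values.
    {
      seen[$1] = 1
      if (!($1 in old))        print "new\t" $1 "\t" $2
      else if (old[$1] != $2)  print "different\t" $1 "\t" old[$1] "\t" $2
    }
    # Anything never seen in the second file has been removed.
    END { for (p in old) if (!(p in seen)) print "removed\t" p "\t" old[p] }
  ' "$1" "$2"
}
</code></pre>

<p>Running <code>compare_settings old_settings.txt new_settings.txt</code> prints lines tagged <code>different</code>, <code>new</code>, or <code>removed</code>, mirroring the three queries above.  The order of the <code>removed</code> lines is not guaranteed.</p>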


<p>Problem solved!</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Blogging on Github with Jekyll]]></title>
    <link href="http://christophermaier.name/blog/2010/04/25/blog-with-jekyll/"/>
    <updated>2010-04-25T00:00:00-04:00</updated>
    <id>http://christophermaier.name/blog/2010/04/25/blog-with-jekyll</id>
    <content type="html"><![CDATA[<p>First, install Jekyll as per the instructions on <a href="http://wiki.github.com/mojombo/jekyll/install">the Jekyll install page</a>:</p>

<pre><code>    gem install jekyll
</code></pre>

<p>Then, since I didn&#8217;t have this directory on my <code>PATH</code> already, I added the following line to my Bash <code>~/.profile</code> file:</p>

<pre><code>    export PATH=${PATH}:/Users/maier/.gem/ruby/1.8/bin
</code></pre>
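
<p>If the profile gets sourced more than once, a plain <code>export</code> like that appends the same directory again each time.  A small guard avoids the duplicates; this is just a sketch, <code>path_append</code> is a made-up name, and using <code>${HOME}</code> in place of the literal home directory is an assumption:</p>

<pre><code># Hypothetical helper: append a directory to PATH only if it is not there yet.
path_append() {
  case ":${PATH}:" in
    *":$1:"*) ;;                # already present, nothing to do
    *) PATH="${PATH}:$1" ;;     # missing, so add it at the end
  esac
}

path_append "${HOME}/.gem/ruby/1.8/bin"
export PATH
</code></pre>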

<p><a href="http://pygments.org/">Pygments</a> is cool for syntax highlighting.  On a Mac with MacPorts, it&#8217;s as easy as this:</p>

<pre><code>    sudo port install python25 py25-pygments
</code></pre>

<p>Running Jekyll with its standalone server is great for testing your site locally:</p>

<pre><code>    jekyll --auto --server
</code></pre>

<p>That&#8217;ll run an embedded web server at <a href="http://localhost:4000">http://localhost:4000</a> (by default); anytime you change any of your site files, Jekyll will reprocess them and make them available immediately.  All you have to do is refresh your browser.</p>
]]></content>
  </entry>
  
</feed>
