<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Jamie&#039;s Blog &#187; algorithms</title>
	<atom:link href="http://www.angelforge.org/wordpress/tags/algorithms/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.angelforge.org/wordpress</link>
	<description>My life is words.</description>
	<lastBuildDate>Sat, 28 Jan 2012 04:35:09 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>Contest Fail</title>
		<link>http://www.angelforge.org/wordpress/programming/contest-fail/</link>
		<comments>http://www.angelforge.org/wordpress/programming/contest-fail/#comments</comments>
		<pubDate>Mon, 16 Jan 2012 12:00:52 +0000</pubDate>
		<dc:creator>Jamie</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[algorithms]]></category>
		<category><![CDATA[haskell]]></category>

		<guid isPermaLink="false">http://www.angelforge.org/wordpress/?p=3104</guid>
		<description><![CDATA[Last weekend I had a few hours to spare, so I decided to work on one of the problems from CodeSprint2, InterviewStreet&#8217;s contest to find the best programming talent. Since I didn&#8217;t have long to work, I decided to try to complete the algorithm problem with the highest value. I chose to work on Count [...]]]></description>
			<content:encoded><![CDATA[<p>Last weekend I had a few hours to spare, so I decided to work on one of the problems from <a href="http://codesprint.interviewstreet.com/">CodeSprint2</a>, <a href="http://www.interviewstreet.com/">InterviewStreet&#8217;s</a> contest to find the best programming talent.</p>
<p>Since I didn&#8217;t have long to work, I decided to try to complete the algorithm problem with the highest value. I chose to work on <a href="http://codesprint.interviewstreet.com/recruit/challenges/solve/view/4eff8a6c7ac39/4effe61080466">Count Strings (closed now)</a>. Given a regular expression, and a string length N, count the number of distinct strings of length N which the regular expression can match.</p>
<p>I looked at the sample test case, spent some time thinking about the problem and decided to try it. Although bounds for test cases were provided, I gave them only a cursory glance. This would turn out to be a mistake.</p>
<p>In my daily work, I typically focus on getting things working rather than spending time on determining the optimal way to do things. It is with this mindset that I decided that I would actually enumerate every string that a regular expression could produce, and then count them. I failed even before I began.</p>
<p>The regular expression language was limited to the alphabet {a, b}, with operators star, union, and concatenation. </p>
<p>I decided to use Haskell, which has the terseness of Ruby or Python, but the performance and type checking of a compiled language. I created a data type to represent the regular expressions. Recursive data types made this representation very straightforward. </p>
<pre class="brush: haskell; title: ; notranslate">
data RegularExpression = Symbol Char
  | Concat RegularExpression RegularExpression
  | Union RegularExpression RegularExpression
  | Star RegularExpression
  deriving (Show, Eq)
</pre>
<p>I had the enumeration part complete fairly quickly. Pattern matching makes everything really intuitive and readable. The star operation required the most code.</p>
<pre class="brush: haskell; title: ; notranslate">
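-- Assumes: import qualified Data.List as List
-- | Enumerate the strings of length at most limit that the expression can produce.
listPossibilities :: RegularExpression -&gt; Int -&gt; [String]
listPossibilities (Symbol c) limit
  | limit &gt; 0 = [[c]]  -- one string holding the single character
  | otherwise = []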

listPossibilities (Concat r1 r2) limit = combos where
  o1 = listPossibilities r1 limit
  o2 = listPossibilities r2 limit
  combos = [ a ++ b | a &lt;- o1, b &lt;- o2, length (a++b) &lt;= limit ]

listPossibilities (Union r1 r2) limit = possibilities where
  o1 = listPossibilities r1 limit
  o2 = listPossibilities r2 limit
  possibilities = o1 ++ o2

listPossibilities (Star _) 0 = [&quot;&quot;]
listPossibilities (Star r1) limit = possibilities where
  opt = listPossibilities r1 limit
  possibilities = &quot;&quot; : whileLimit opt limit opt

  -- | Keeps prepending base strings to the accumulated strings until a pass adds nothing new.
  -- Uses nub right now, really inefficient. probably should use some sort of memoization
  whileLimit :: [String] -&gt; Int -&gt; [String] -&gt; [String]
  whileLimit base lim acc = pos where
    new = [a++b | a &lt;- base, b &lt;- acc, length (a++b) &lt;= lim]
    acc' = List.nub (acc ++ new)
    -- stop at a fixed point: no string within the limit was added in this pass
    pos = if length acc' == length acc then acc
          else whileLimit base lim acc'
</pre>
<p>The next component I needed was a parser to convert input into these regular expression objects. I used the Parsec parser module to construct the parser. This part actually took a bit longer, as I tried to refine the parser to accept deeply nested expressions.</p>
<p>The regular expression union parser below parses a parenthesized expression or symbol, followed by a pipe, and another parenthesized expression or symbol, returning a RegularExpression object.</p>
<pre class="brush: haskell; title: ; notranslate">
reUnion :: GenParser Char st RegularExpression
reUnion = do
  re1 &lt;- pexpr &lt;|&gt; reSymbol
  char '|'
  re2 &lt;- pexpr &lt;|&gt; reSymbol
  return (Union re1 re2)
</pre>
<p>Once I got the parser working, I glued it together with the enumeration portion, and submitted it to the contest. It immediately failed to compile due to a dependency on Test.HUnit I had in my code. I reorganized the code so that I could easily remove this dependency, and resubmitted.</p>
<p>This time the submission was accepted, but when I went back after a few minutes to check its status, it had failed all but 1 of the test cases. It had taken too much time! I went back to the submission guidelines and noted that there was a time limit of 5 minutes.</p>
<p>Originally, I had used an algorithm which checked distinctness during enumeration using a List type. I knew that this was very inefficient, but I had focused on finishing the implementation. I went back to this part and used a Set instead. I was optimistic that this would fix my problems and that I would be able to knock out at least another test case.</p>
<p>I resubmitted and I failed 10 tests again. Then, I reread the problem, and looked at the bounds for N. N could be up to 100,000 (IIRC)! Of course enumeration (especially the way I was doing it, which would also enumerate many strings shorter than N) would fail. I had taken a totally incorrect approach to this problem. I did not need to enumerate; I only needed to give the number of potential strings, and that could&#8217;ve been computed without enumeration. For example, given the expression a*, and a number N, there is only 1 possibility. For the expression that allows either symbol at every position, the expression and its count are <img src='http://s.wordpress.com/latex.php?latex=%28a%7Cb%29%5E%2A%2C%202%5EN&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='(a|b)^*, 2^N' title='(a|b)^*, 2^N' class='latex' />. One of the test cases was <img src='http://s.wordpress.com/latex.php?latex=%28a%7Cb%29%5E%2A%2C%20N%3D5&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='(a|b)^*, N=5' title='(a|b)^*, N=5' class='latex' />, which has 32 possibilities. Another of the expressions was <img src='http://s.wordpress.com/latex.php?latex=a%5E%2Aba%5E%2A&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='a^*ba^*' title='a^*ba^*' class='latex' />. Since b must appear exactly once, the possible strings consist of <img src='http://s.wordpress.com/latex.php?latex=%28N-1%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='(N-1)' title='(N-1)' class='latex' /> a&#8217;s and a single b. There are N such strings (the number of positions b can be in, in an N length string): baa&#8230;, aba&#8230;, aab&#8230;, &#8230; . This is the approach I should&#8217;ve taken.</p>
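<p>One way to count without enumerating is to walk a deterministic finite automaton (DFA) for the expression and keep only a count per state. The sketch below is in JavaScript rather than Haskell, and it is not the code from my submission: the function name <code>countAccepted</code> and the two hand-built DFAs are assumptions made for illustration, and converting an arbitrary expression into a DFA is left out. It also uses plain numbers, so for large N a real solution would need big integers or a modulus.</p>

```javascript
// Count the strings of length n accepted by a DFA, without listing them.
// dfa.start: start state, dfa.accept: list of accepting states,
// dfa.next[state][symbol]: the next state, or undefined if there is none.
// n is assumed to be a non-negative integer.
function countAccepted(dfa, n) {
  // counts[s] = number of distinct strings of the current length that end in state s
  let counts = { [dfa.start]: 1 };
  for (let step = 0; step !== n; step += 1) {
    const nextCounts = {};
    for (const state of Object.keys(counts)) {
      for (const symbol of ["a", "b"]) {
        const target = dfa.next[state][symbol];
        if (target === undefined) continue;
        nextCounts[target] = (nextCounts[target] || 0) + counts[state];
      }
    }
    counts = nextCounts;
  }
  return dfa.accept.reduce(function (sum, s) { return sum + (counts[s] || 0); }, 0);
}

// Hand-built DFA for a*ba*: state 0 = no b seen yet, state 1 = exactly one b seen.
const oneB = { start: 0, accept: [1], next: { 0: { a: 0, b: 1 }, 1: { a: 1 } } };
// Hand-built DFA for (a|b)*: a single accepting state that loops on both symbols.
const anyString = { start: 0, accept: [0], next: { 0: { a: 0, b: 0 } } };

console.log(countAccepted(oneB, 5));      // 5
console.log(countAccepted(anyString, 5)); // 32
```

<p>Because the automaton is deterministic, every string follows exactly one path, so the per-state counts never count a string twice. The work grows with N times the number of states instead of with the number of strings.</p>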
<p>I left the solution there, and acknowledged my defeat. Source here: <a href="https://github.com/jamiely/count-strings">https://github.com/jamiely/count-strings</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.angelforge.org/wordpress/programming/contest-fail/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Word Jumble Game: Part 5</title>
		<link>http://www.angelforge.org/wordpress/programming/software/word-jumble-game-part-5/</link>
		<comments>http://www.angelforge.org/wordpress/programming/software/word-jumble-game-part-5/#comments</comments>
		<pubDate>Tue, 23 Mar 2010 03:23:08 +0000</pubDate>
		<dc:creator>Jamie</dc:creator>
				<category><![CDATA[Software]]></category>
		<category><![CDATA[algorithms]]></category>
		<category><![CDATA[coldfusion]]></category>
		<category><![CDATA[css]]></category>
		<category><![CDATA[game]]></category>
		<category><![CDATA[game programming]]></category>
		<category><![CDATA[javascript]]></category>
		<category><![CDATA[jQuery]]></category>
		<category><![CDATA[puzzle]]></category>
		<category><![CDATA[regular expressions]]></category>
		<category><![CDATA[word game]]></category>
		<category><![CDATA[word jumble]]></category>

		<guid isPermaLink="false">http://www.angelforge.org/wordpress/?p=2394</guid>
		<description><![CDATA[I used jQuery for the UI. I am a recent convert to jQuery, having mostly used Prototype + Scriptaculous. The word list is embedded into the page script as a javascript array. On document ready, html is generated, which writes the first and last word to the page, and creates blank input boxes for the [...]]]></description>
			<content:encoded><![CDATA[<p>I used jQuery for the UI. I am a recent convert to jQuery, having mostly used Prototype + Scriptaculous.</p>
<p>The word list is embedded into the page script as a JavaScript array. On document ready, HTML is generated, which writes the first and last word to the page, and creates blank input boxes for the intermediate words.</p>
<p>There is a keyup event bound on each input box, which determines whether the word is correct. If it is, a CSS class is added which shows a green underline underneath the box. Otherwise, a red underline is shown.</p>
<p>Finally, there are buttons on the page which are created dynamically and provide hints or reveal all of the answers.</p>
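<p>The keyup check could look roughly like the sketch below. It is not the page&#8217;s actual script: the selector <code>input.jumble-word</code>, the <code>data-answer</code> attribute, and the class names <code>correct</code> and <code>incorrect</code> are hypothetical names chosen for illustration.</p>

```javascript
// Pure check: does the typed text match the expected word (ignoring case and outer spaces)?
function isCorrectWord(typed, answer) {
  return typed.trim().toLowerCase() === answer.toLowerCase();
}

// Wiring sketch; this part only runs when jQuery is loaded in a browser page.
if (typeof jQuery !== "undefined") {
  jQuery(function ($) {
    $("input.jumble-word").keyup(function () {
      const box = $(this);
      const ok = isCorrectWord(box.val(), box.attr("data-answer"));
      // One class draws the green underline, the other the red one.
      box.toggleClass("correct", ok).toggleClass("incorrect", !ok);
    });
  });
}
```

<p>Keeping the comparison in its own function means it can be exercised without a page or a browser.</p>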
<div id="attachment_2395" class="wp-caption aligncenter" style="width: 254px"><a href="http://www.angelforge.org/wordpress/wp-content/uploads/2010/03/wordjumble21.png"><img class="size-full wp-image-2395" title="Word Jumble Filled" src="http://www.angelforge.org/wordpress/wp-content/uploads/2010/03/wordjumble21.png" alt="Word Jumble Filled" width="244" height="425" /></a><p class="wp-caption-text">Word Jumble Filled</p></div>
]]></content:encoded>
			<wfw:commentRss>http://www.angelforge.org/wordpress/programming/software/word-jumble-game-part-5/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Word Jumble Game: Part 4</title>
		<link>http://www.angelforge.org/wordpress/programming/software/word-jumble-game-part-4/</link>
		<comments>http://www.angelforge.org/wordpress/programming/software/word-jumble-game-part-4/#comments</comments>
		<pubDate>Sun, 21 Mar 2010 02:23:07 +0000</pubDate>
		<dc:creator>Jamie</dc:creator>
				<category><![CDATA[Software]]></category>
		<category><![CDATA[algorithms]]></category>
		<category><![CDATA[coldfusion]]></category>
		<category><![CDATA[game]]></category>
		<category><![CDATA[game programming]]></category>
		<category><![CDATA[puzzle]]></category>
		<category><![CDATA[regular expressions]]></category>
		<category><![CDATA[word game]]></category>

		<guid isPermaLink="false">http://www.angelforge.org/wordpress/?p=2392</guid>
		<description><![CDATA[Search The problem of generating the chain of clues is a simple search problem. In this case, depth-first search was used, because the algorithm would attempt path depth-wise and only explore another branch if the generated chain was not long enough. Another tactic would have been to use a breadth first search. To use breadth-first [...]]]></description>
			<content:encoded><![CDATA[<h2><span>Search</span></h2>
<p>The problem of generating the chain of clues is a simple search problem. In this case, depth-first search was used, because the algorithm would attempt a path depth-wise and only explore another branch if the generated chain was not long enough.</p>
<p>Another tactic would have been to use a breadth-first search. To use breadth-first search, we could have modified the regex pattern to find all words that differed from the base word by just one letter.</p>
<p>Using water as the base word, that regular expression looks something like: <strong>/([^w]ater|w[^a]ter|wa[^t]er|wat[^e]r|wate[^r])/</strong>. This would find all words in the dictionary that differed from it by one letter (let&#8217;s call this word set B).</p>
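<p>A pattern like this can be built from any base word. The following is a minimal JavaScript sketch, assuming the word contains only plain letters; the function name <code>oneLetterOffRegex</code> and the sample dictionary are made up for the example.</p>

```javascript
// Build a pattern matching every word that differs from baseWord in exactly one position.
function oneLetterOffRegex(baseWord) {
  const alternatives = baseWord.split("").map(function (letter, i) {
    return baseWord.slice(0, i) + "[^" + letter + "]" + baseWord.slice(i + 1);
  });
  // The anchors reject longer words that merely contain a match.
  return new RegExp("^(" + alternatives.join("|") + ")$");
}

const pattern = oneLetterOffRegex("water");
const dictionary = ["water", "wader", "later", "waiter", "wafer", "hater"];
const wordSetB = dictionary.filter(function (w) { return pattern.test(w); });
console.log(wordSetB); // [ 'wader', 'later', 'wafer', 'hater' ]
```

<p>The base word itself never matches, because each alternative requires one position to hold a different letter.</p>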
<p>If we were using breadth-first  search, we would then repeat the process with all of the words we just  found (word set B).</p>
<p>If you were to visualize the difference  between breadth-first and depth-first search, breadth-first would look  like a tree with wide but shallow roots. Depth-first search would look  like a tree with few but deep roots.</p>
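<p>The difference between the two strategies can be sketched in JavaScript as follows. This is an illustration, not the ColdFusion code behind the puzzle: it compares letters position by position instead of using a regular expression, it tries every position rather than a random one, and the names <code>differsByOne</code>, <code>breadthFirstLevels</code> and <code>depthFirstChain</code> as well as the tiny dictionary are assumptions.</p>

```javascript
// True when two words have the same length and differ in exactly one position.
function differsByOne(a, b) {
  if (a.length !== b.length) return false;
  let diff = 0;
  for (let i = 0; i !== a.length; i += 1) {
    if (a[i] !== b[i]) diff += 1;
  }
  return diff === 1;
}

// Breadth-first: find every neighbour of the current level before going deeper.
function breadthFirstLevels(seed, dictionary, maxDepth) {
  const seen = new Set([seed]);
  const levels = [[seed]];
  while (levels.length !== maxDepth + 1) {
    const next = [];
    for (const base of levels[levels.length - 1]) {
      for (const word of dictionary) {
        if (seen.has(word) || !differsByOne(base, word)) continue;
        seen.add(word);
        next.push(word);
      }
    }
    if (next.length === 0) break; // nothing new to expand
    levels.push(next);
  }
  return levels;
}

// Depth-first: follow the first neighbour as far as it goes, backing up only when stuck.
function depthFirstChain(chain, dictionary, maxLength) {
  if (chain.length === maxLength) return chain;
  let best = chain;
  const base = chain[chain.length - 1];
  for (const word of dictionary) {
    if (chain.includes(word) || !differsByOne(base, word)) continue;
    const candidate = depthFirstChain(chain.concat([word]), dictionary, maxLength);
    if (candidate.length > best.length) best = candidate;
    if (best.length === maxLength) break;
  }
  return best;
}

const words = ["water", "wader", "later", "hater", "wades", "lager"];
console.log(breadthFirstLevels("water", words, 2));
// [ [ 'water' ], [ 'wader', 'later', 'hater' ], [ 'wades', 'lager' ] ]
console.log(depthFirstChain(["water"], words, 3));
// [ 'water', 'wader', 'wades' ]
```

<p>The breadth-first result is wide (three words at the first level), while the depth-first result commits to a single branch and stops as soon as the chain is long enough.</p>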
<h2><span>Query Params</span></h2>
<p>The flexibility of the puzzle is  enhanced by optional query parameters that may be applied. The <strong>word </strong>param  allows specification of the starting or seed word. The <strong>length </strong>param  specifies the maximum length of the puzzle.</p>
<h2><span>Recursion</span></h2>
<p>The program uses  recursion to perform the search. This almost goes without saying, for it  is difficult to do general search without recursion (although you could  do so with macros and similar programming constructs). Search may be  done using loop control structures but I can&#8217;t imagine an elegant  solution using loops.</p>
<p>The pseudocode for the recursion is  basically:</p>
<pre class="brush: jscript; title: ; notranslate">
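function build(baseWord, chainWords, maxLength)

    // chainWords already ends with baseWord; stop once the chain is long enough
    if Length(chainWords) &gt;= maxLength

        return chainWords

    // fall back to the current chain if no candidate extends it
    chain = chainWords
    regex = generateRandomRegex(baseWord)
    wordSetB = getPossibleWords(regex, notIn=chainWords)
    for(word in wordSetB)

        chain = build(word, chainWords+word, maxLength)
        if Length(chain) &gt;= maxLength

            break

    return chain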
</pre>
]]></content:encoded>
			<wfw:commentRss>http://www.angelforge.org/wordpress/programming/software/word-jumble-game-part-4/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Word Jumble Game: Part 3</title>
		<link>http://www.angelforge.org/wordpress/programming/software/word-jumble-game-part-3/</link>
		<comments>http://www.angelforge.org/wordpress/programming/software/word-jumble-game-part-3/#comments</comments>
		<pubDate>Thu, 18 Mar 2010 15:17:47 +0000</pubDate>
		<dc:creator>Jamie</dc:creator>
				<category><![CDATA[Software]]></category>
		<category><![CDATA[algorithms]]></category>
		<category><![CDATA[coldfusion]]></category>
		<category><![CDATA[game]]></category>
		<category><![CDATA[game programming]]></category>
		<category><![CDATA[puzzle]]></category>
		<category><![CDATA[regular expressions]]></category>
		<category><![CDATA[word game]]></category>
		<category><![CDATA[word jumble]]></category>

		<guid isPermaLink="false">http://www.angelforge.org/wordpress/?p=2365</guid>
		<description><![CDATA[The first thing I did was make sure that the word list would be cached on application start. This was as simple as creating an Application.cfc cfcomponent and implementing the onApplicationStart function. This function reads the dictionary in (described in the last entry) and caches the word list in a ColdFusion array. There are other [...]]]></description>
			<content:encoded><![CDATA[<p>The first thing I did was make sure that the word list would be cached on application start. This was as simple as creating an Application.cfc cfcomponent and implementing the onApplicationStart function. This function reads the dictionary in (described in the last entry) and caches the word list in a ColdFusion array. There are other options for storing this data, but this had the best mix of speed and function considering the method of search I wanted to use against it.</p>
<p>Although  the dictionary was only 52K, this caching probably helped performance a  great deal.</p>
<p>To generate the word list, I decided on the  following algorithm:</p>
<ol>
<li>Choose an initial starting word (at  random, or via user entry)</li>
<li>Use the word to generate a regular expression.<br />
Replace a random single letter with the regex pattern [^L] (where L is the letter you have replaced).<br />
Example:<br />
word: water<br />
regex: w[^a]ter</li>
<li>Next, iterate through all of the  words, testing each word against the regular expression. Store all  matches.</li>
<li>With each match, one-by-one, repeat Step 2 until we get  a chain of N words. (Where N is the maximum length of the chain.)</li>
<li>Obviously,  if we have no more matches, we stop. If we have at least a 3-word  chain, we can use it.</li>
</ol>
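<p>Steps 2 and 3 above can be sketched in JavaScript as follows. The real implementation is ColdFusion, so this is only an illustration: the function bodies are assumptions, the optional <code>position</code> argument is a hypothetical addition that makes the example repeatable, and the word is assumed to contain only plain letters.</p>

```javascript
// Step 2: replace the letter at one position with [^L]. The position is random
// unless the caller supplies one (handy for repeatable examples).
function generateRegex(word, position) {
  const i = position === undefined ? Math.floor(Math.random() * word.length) : position;
  return new RegExp("^" + word.slice(0, i) + "[^" + word[i] + "]" + word.slice(i + 1) + "$");
}

// Step 3: test every word against the expression and keep the matches,
// skipping words that are already part of the chain.
function getPossibleWords(regex, words, chainWords) {
  return words.filter(function (w) {
    return regex.test(w) ? !chainWords.includes(w) : false;
  });
}

const regex = generateRegex("water", 0);
console.log(regex.source); // ^[^w]ater$
const matches = getPossibleWords(regex, ["water", "later", "hater", "wader", "skater"], ["water"]);
console.log(matches); // [ 'later', 'hater' ]
```

<p>Step 4 then repeats the same two calls with each match as the new base word, passing the growing chain so that earlier words are not reused.</p>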
<p>There are a few considerations not  discussed above in generating the puzzle:</p>
<ul>
<li>If we match a word  that is already in the chain, we should ignore that word to avoid  duplicates.</li>
<li>Not implemented: we should not replace a letter in the same position twice. For example, if we replace the &#8220;w&#8221; in water, don&#8217;t replace the &#8220;h&#8221; in hater (if hater is the 2nd word).</li>
<li>Depth-first  versus Breadth-first searching&#8230;to be discussed</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.angelforge.org/wordpress/programming/software/word-jumble-game-part-3/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Gesture Recognition</title>
		<link>http://www.angelforge.org/wordpress/programming/libraries/gesture-recognition/</link>
		<comments>http://www.angelforge.org/wordpress/programming/libraries/gesture-recognition/#comments</comments>
		<pubDate>Tue, 24 Mar 2009 18:20:12 +0000</pubDate>
		<dc:creator>Jamie</dc:creator>
				<category><![CDATA[Libraries]]></category>
		<category><![CDATA[algorithms]]></category>
		<category><![CDATA[gestures]]></category>
		<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://www.angelforge.org/wordpress/?p=1167</guid>
		<description><![CDATA[I came across an awesome article on simple gesture recognition. This page implements a &#8220;$1 Gesture Recognizer&#8221; that is easy, cheap, and usable almost anywhere. It requires under 100 lines of easy code and achieves 97% recognition rates with only one template defined for each gesture below. With 3+ templates defined, accuracy exceeds 99%. http://depts.washington.edu/aimgroup/proj/dollar/]]></description>
			<content:encoded><![CDATA[<p>I came across an awesome article on simple gesture recognition.</p>
<blockquote><p><em>This page implements a &#8220;$1 Gesture Recognizer&#8221; that is easy, cheap, and usable almost anywhere. It requires under 100 lines of easy code and achieves 97% recognition rates with only one template defined for each gesture below. With 3+ templates defined, accuracy exceeds 99%.</em></p>
<p><a href="http://depts.washington.edu/aimgroup/proj/dollar/">http://depts.washington.edu/aimgroup/proj/dollar/</a></p></blockquote>
]]></content:encoded>
			<wfw:commentRss>http://www.angelforge.org/wordpress/programming/libraries/gesture-recognition/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

