<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Out Of What Box?</title>
	<atom:link href="http://www.outofwhatbox.com/blog/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.outofwhatbox.com/blog</link>
	<description>Ruminations on software and other impossible things</description>
	<lastBuildDate>Mon, 08 Feb 2010 01:37:40 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.1</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>SyntaxHighlighter: Easier to load; faster to boot?</title>
		<link>http://www.outofwhatbox.com/blog/2010/02/syntaxhighlighter-easier-to-load-faster-to-boot/</link>
		<comments>http://www.outofwhatbox.com/blog/2010/02/syntaxhighlighter-easier-to-load-faster-to-boot/#comments</comments>
		<pubDate>Mon, 08 Feb 2010 00:45:14 +0000</pubDate>
		<dc:creator>Dan Breslau</dc:creator>
				<category><![CDATA[JavaScript]]></category>
		<category><![CDATA[Web]]></category>
		<category><![CDATA[syntaxhighlighter]]></category>

		<guid isPermaLink="false">http://www.outofwhatbox.com/blog/?p=625</guid>
		<description><![CDATA[I've introduced a new method for loading SyntaxHighlighter onto a web page. The goal was simply to make it easier to integrate SyntaxHighlighter into a web site. So, I was pleasantly surprised to find that the new method often loads faster than the traditional method.]]></description>
			<content:encoded><![CDATA[<p>SyntaxHighlighter has a relatively high surface area, typically requiring two CSS files and at least two JavaScript files to be linked into a web page. Here&#8217;s a truncated example:</p>
<pre class="brush: xml; auto-links: false;">
&lt;html&gt;
&lt;head&gt;
	&lt;!-- Stylesheets for SyntaxHighlighter --&gt;
&lt;link type=&quot;text/css&quot; rel=&quot;stylesheet&quot; href=&quot;styles/shCore.css&quot;/&gt;
&lt;link type=&quot;text/css&quot; rel=&quot;stylesheet&quot; href=&quot;styles/shThemeDefault.css&quot;/&gt;
...
&lt;/head&gt;
&lt;body&gt;

&lt;em&gt;...Some kind of interesting page content usually goes here,
but we're not interested in that right now...&lt;/em&gt;

	&lt;!-- Load the SyntaxHighlighter Core and Brush scripts. A
               separate Brush script is required for each language. --&gt;
&lt;script type=&quot;text/javascript&quot; src=&quot;scripts/shCore.js&quot;&gt;&lt;/script&gt;
&lt;script type=&quot;text/javascript&quot; src=&quot;scripts/shBrushJava.js&quot;&gt;&lt;/script&gt;
&lt;script type=&quot;text/javascript&quot; src=&quot;scripts/shBrushJScript.js&quot;&gt;&lt;/script&gt;
	&lt;!-- Don't forget the XML brush if you're using html-script --&gt;
&lt;script type=&quot;text/javascript&quot; src=&quot;scripts/shBrushXml.js&quot;&gt;&lt;/script&gt;

	&lt;!-- Now that all the script files are loaded, begin highlighting. --&gt;
&lt;script type=&quot;text/javascript&quot;&gt;
SyntaxHighlighter.config.stripBrs = true;
// ... additional configuration, as needed ...

SyntaxHighlighter.all();
&lt;/script&gt;

&lt;/body&gt;
</pre>
<p>Some users have wanted a simpler way to integrate SyntaxHighlighter into their sites. Last year, <a href="http://davidchambersdesign.com/prototype-loader-for-syntaxhighlighter/">David Chambers</a> wrote a script that uses the <a href="http://www.prototypejs.org/">Prototype library</a> to load the required CSS and brush files. He ran into a troubling quirk in SyntaxHighlighter: It needs any required brush files to be loaded before it starts highlighting. I responded by updating the <a href="http://www.outofwhatbox.com/blog/2009/12/update-of-forked-syntaxhighlighter/">OOWB fork</a> of SyntaxHighlighter, so that <a href="http://www.outofwhatbox.com/blog/2009/06/occams-razor-and-brushes/">brushes could be loaded asynchronously</a>. But that&#8217;s as far as I&#8217;d gone towards automated loading, until now.</p>
<p>I recently looked into adding automated loading to SyntaxHighlighter itself, coming up with not one, but two automated solutions. I should really say one and a half, as one is still in the alpha stage and will likely stay there.</p>
<p>The &#8220;alpha-stage&#8221; approach comprises a new JavaScript file that loads SyntaxHighlighter and the necessary brush files through AJAX, by way of a pair of simple Java servlets. I&#8217;d hoped that this would lead to faster load times, by reducing the total number of downloads. And so it does—sometimes. For testing, I used <a href="http://code.google.com/appengine/">Google App Engine</a> as the servlet host (using the free quotas.) Performance was inconsistent, to say the least; this is probably due to the Google App Engine&#8217;s method of starting up and shutting down servlets. It seemed that only by making repeated hits on the servlets in rapid succession could I be sure of getting to a container that already had my servlets running, thus getting good load times.<sup><a href="#shBoot-1" name="shBoot-1-ref">1</a></sup> The servlets are still up there; if you&#8217;d like the Java source (to host them elsewhere), and/or the JavaScript for using them, <a href="http://www.outofwhatbox.com/blog/contact/">just let me know.</a></p>
<p>I also implemented a pure JavaScript approach, which I <em>hadn&#8217;t</em> expected to yield better load times. Yet I&#8217;ve seen a number of cases where this is faster than the original, static way of loading SyntaxHighlighter. I&#8217;m releasing a <a href="http://www.outofwhatbox.com/blog/syntaxhighlighter-downloads/">new download</a> so that you can try it out. I highly recommend that you test it thoroughly before putting it into real use.</p>
<p>To use this new method, the web page invokes <code>SyntaxHighlighter.boot</code> instead of <code>SyntaxHighlighter.all</code>. When <code>boot</code> is used as the entry point, SyntaxHighlighter determines which brushes are required by the web page, and issues HTTP requests for each brush file (as well as the appropriate CSS). These requests use the age-old method of adding a <code>&lt;script&gt;</code> tag for each brush, and a <code>&lt;link&gt;</code> tag for each CSS file.</p>
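<p>For readers who haven&#8217;t seen the tag-injection technique before, here&#8217;s a minimal sketch of it. The helper names <code>loadScript</code> and <code>loadStylesheet</code> are my own for illustration; they are not taken from <code>shCore.js</code>, and the real code may differ. The <code>doc</code> parameter is assumed to be the page&#8217;s <code>document</code> object.</p>

```javascript
// Sketch only: loadScript and loadStylesheet are hypothetical helper names,
// not functions from SyntaxHighlighter itself.

// Append a script element to the head. The browser fetches the file without
// blocking the HTML parser, so several such requests can run in parallel.
function loadScript(doc, url, onLoad) {
  var script = doc.createElement("script");
  script.type = "text/javascript";
  script.src = url;
  if (onLoad) {
    script.onload = onLoad; // older IE versions need onreadystatechange instead
  }
  doc.getElementsByTagName("head")[0].appendChild(script);
  return script;
}

// Append a stylesheet link element to the head.
function loadStylesheet(doc, url) {
  var link = doc.createElement("link");
  link.type = "text/css";
  link.rel = "stylesheet";
  link.href = url;
  doc.getElementsByTagName("head")[0].appendChild(link);
  return link;
}
```

<p>In a page, this would be called as, say, <code>loadScript(document, "scripts/shBrushJava.js")</code>, once for each brush the page needs.</p>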
<p>Using <code>SyntaxHighlighter.boot</code>, a web page only needs to add a couple of scripts, usually at the end of the <code>&lt;body&gt;</code>. No CSS link elements are required.</p>
<pre class="brush: xml; auto-links: false;">
&lt;!--  We just need to load shCore.js near the end of the &lt;body&gt; element,
	then call SyntaxHighlighter.boot() with our configuration options. --&gt;

&lt;script type=&quot;text/javascript&quot;
src=&quot;http://path/to/sh/directory/syntaxhighlighter/scripts/shCore.js&quot;&gt;&lt;/script&gt;

&lt;script type=&quot;text/javascript&quot;&gt;
	SyntaxHighlighter.boot(
		&quot;http://path/to/syntaxhighlighter/root/syntaxhighlighter/&quot;,
		{theme : &quot;Default&quot;}, // Configuration settings
		{stripBrs : true}      // Default settings
	);
&lt;/script&gt;

&lt;/body&gt;
</pre>
<p>That&#8217;s it. Every page on your site can use the same set-up, regardless of which brush files it needs. (For another example, look at the HTML source for this page.)</p>
<p>When an HTML page includes external scripts (that is, <code>&lt;script&gt;</code> elements with <code>src</code> attributes), the browser loads the script files synchronously, as it encounters them in the HTML. When the scripts are loaded dynamically, the browser can load them asynchronously; several scripts may then be loading simultaneously. This can be particularly helpful for pages that use multiple brushes. Here&#8217;s a Firebug sequence diagram of a web page with five brushes, loading in SyntaxHighlighter&#8217;s traditional way:</p>
<div class="oowbcenter"><a target="blank" href="http://img.outofwhatbox.com/shLoading/stdLoadTimeLarge.png"><img alt="" title="" src="http://img.outofwhatbox.com/shLoading/stdLoadTimeSmall.png" /></a>
<p class="oowb-caption-text">(Click on image for larger view)</p>
</div>
<p>Here&#8217;s the same page, modified to use the <code>boot</code> method. Notice that, unlike in the first picture, the <code>shBrush*.js</code> files are loaded in parallel:</p>
<div class="oowbcenter"><a target="blank" href="http://img.outofwhatbox.com/shLoading/bootLoadTimeLarge.png"><img alt="" title="" src="http://img.outofwhatbox.com/shLoading/bootLoadTimeSmall.png" /></a>
<p class="oowb-caption-text">(Click on image for larger view)</p>
</div>
<p>Of course, <a href="http://yuiblog.com/blog/2008/07/22/non-blocking-scripts/">this isn&#8217;t an entirely new discovery</a>, but it came as a pleasant surprise nonetheless.</p>
<h4>Availability</h4>
<p>In addition to SyntaxHighlighter itself, I also modified Viper007Bond&#8217;s <a href="http://wordpress.org/extend/plugins/syntaxhighlighter/">SyntaxHighlighter Evolved</a> plugin for WordPress to use the new <code>boot</code> method. This simplified the code quite a bit, though it may have introduced bugs.</p>
<p>SyntaxHighlighter and SyntaxHighlighter Evolved are each available on the <a href="http://www.outofwhatbox.com/blog/syntaxhighlighter-downloads/">downloads page</a>. <strong>Test before using.</strong> <em>Note: If you&#8217;ve been using SyntaxHighlighter Evolved, you will probably need to re-set your settings if you install this plugin.</em></p>
<h4>Implementation notes</h4>
<p>In the traditional SyntaxHighlighter, all brushes must be loaded before <code>SyntaxHighlighter.all</code> is invoked. In this fork, <a href="http://www.outofwhatbox.com/blog/2009/06/occams-razor-and-brushes/">brushes can load at any time</a> after <code>shCore.js</code> has been loaded. This flexibility was key to the design of the <code>boot</code> method: If a brush is loaded after <code>SyntaxHighlighter.all</code> or <code>SyntaxHighlighter.boot</code> is invoked, <code>SyntaxHighlighter</code> looks through the page to see if there&#8217;s any input requesting the newly loaded brush.</p>
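<p>The late-loading behavior can be pictured roughly as follows. This is a simplified sketch under my own assumptions, not the fork&#8217;s actual code: <code>createRegistry</code>, <code>requestHighlight</code>, and <code>registerBrush</code> are hypothetical names, and a &#8220;brush&#8221; is reduced to a function that transforms a string.</p>

```javascript
// Hypothetical sketch: createRegistry, requestHighlight and registerBrush are
// illustrative names, not part of SyntaxHighlighter's API.
function createRegistry() {
  var brushes = {};      // brush name -> brush function
  var pending = {};      // brush name -> code blocks still waiting for it
  var highlighted = [];  // output, in the order blocks were highlighted

  return {
    // Called for each code block found on the page.
    requestHighlight: function (brushName, block) {
      if (brushes.hasOwnProperty(brushName)) {
        highlighted.push(brushes[brushName](block));
      } else {
        (pending[brushName] = pending[brushName] || []).push(block);
      }
    },
    // Called by a brush script when it finishes loading, at any time.
    registerBrush: function (brushName, brush) {
      brushes[brushName] = brush;
      var waiting = pending[brushName] || [];
      delete pending[brushName];
      for (var i = 0; i < waiting.length; i++) {
        highlighted.push(brush(waiting[i]));
      }
    },
    results: function () { return highlighted.slice(); }
  };
}
```

<p>The point of the design is that the order of arrival stops mattering: a block that asks for a brush too early simply waits until that brush shows up.</p>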
<p>The <code>boot</code> method needs to map each brush name to the script file that implements that brush. I implemented this by modifying the Perl script that builds and stages SyntaxHighlighter. The script now parses the *Brush.js files in the source tree, pulling out the names and aliases supported by each file, and writes them into an array in <code>shCore.js</code>. The downside of this is that testing via the <code>boot</code> method requires re-running the script, which builds and stages the files into a new directory, from where I can run tests. Even though the script only takes about a second, it&#8217;s crossed the boundary between having <em>no</em> build step in the development cycle, and having <em>any</em> build step in the development cycle.</p>
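<p>To make the mapping concrete, here&#8217;s a small sketch of a name-to-file lookup. The table contents and the helper names (<code>buildAliasMap</code>, <code>brushFileFor</code>) are illustrative assumptions; the array that the build script actually writes into <code>shCore.js</code> may be shaped differently.</p>

```javascript
// Illustrative data only: the real table is generated by the build script,
// and its exact shape is not shown in this post.
var brushTable = [
  { file: "shBrushJava.js", aliases: ["java"] },
  { file: "shBrushJScript.js", aliases: ["js", "jscript", "javascript"] },
  { file: "shBrushXml.js", aliases: ["xml", "xhtml", "xslt", "html"] }
];

// Flatten the table into an alias -> file lookup.
function buildAliasMap(table) {
  var map = {};
  for (var i = 0; i < table.length; i++) {
    for (var j = 0; j < table[i].aliases.length; j++) {
      map[table[i].aliases[j].toLowerCase()] = table[i].file;
    }
  }
  return map;
}

// Return the script file for an alias, or null if no brush claims it.
function brushFileFor(map, alias) {
  var key = String(alias).toLowerCase();
  return map.hasOwnProperty(key) ? map[key] : null;
}
```

<p>With a lookup like this, a page that says <code>brush: html</code> resolves to <code>shBrushXml.js</code> without the page author needing to know which file implements which alias.</p>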
<h3> When You Come To A Fork In The Road, Take It.</h3>
<p>I had two motives for working on SyntaxHighlighter: One, to improve its display of code on my own blog and elsewhere. Two, as a training ground for learning JavaScript.</p>
<p>I was between jobs when I took on this project. As of this Monday, that will no longer be the case. That&#8217;s great news for me, but much as I&#8217;d like to keep the SyntaxHighlighter work moving forward, I don&#8217;t really know if or when I&#8217;ll have the time for it. On the other hand, I no longer feel like I need a private JavaScript training ground. So, I&#8217;ve pushed <a href="http://bitbucket.org/dbreslau/syntaxhighlighter/">my repository</a> to bitbucket.org as a fork of <a href="http://bitbucket.org/alexg/syntaxhighlighter/">Alex&#8217;s repository</a>. If he so chooses (and so far he hasn&#8217;t, which is fine), Alex is welcome to merge the fork into his branch. Perhaps more importantly, anyone else can now fork the code, and/or make contributions to this fork.</p>
<p>I do have some more goals in mind for SyntaxHighlighter. I&#8217;ll be writing about them in the not-too-distant future.</p>
<div>
<hr /></div>
<p><sup><a href="#shBoot-1-ref" name="shBoot-1">1</a></sup>Perhaps a faster host would lead to the faster load times I&#8217;d hoped for; but that would probably require payment, which makes this approach considerably less attractive.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.outofwhatbox.com/blog/2010/02/syntaxhighlighter-easier-to-load-faster-to-boot/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Update of Forked SyntaxHighlighter</title>
		<link>http://www.outofwhatbox.com/blog/2009/12/update-of-forked-syntaxhighlighter/</link>
		<comments>http://www.outofwhatbox.com/blog/2009/12/update-of-forked-syntaxhighlighter/#comments</comments>
		<pubDate>Fri, 18 Dec 2009 17:20:19 +0000</pubDate>
		<dc:creator>Dan Breslau</dc:creator>
				<category><![CDATA[Software Development]]></category>
		<category><![CDATA[Web]]></category>
		<category><![CDATA[syntaxhighlighter]]></category>

		<guid isPermaLink="false">http://www.outofwhatbox.com/blog/?p=613</guid>
		<description><![CDATA[I've updated this site's fork of the SyntaxHighlighter software by Alex Gorbatchev. This update features a partial merge from Alex's 2.1.364 release. I've also added a "Hide" button to the toolbar.]]></description>
			<content:encoded><![CDATA[<p>As promised, I&#8217;ve updated this site&#8217;s fork of the <a href="http://alexgorbatchev.com/wiki/SyntaxHighlighter">SyntaxHighlighter</a> software by Alex Gorbatchev. I&#8217;ve merged in many of the changes that Alex included in his 2.1.364 release, but with the following differences:</p>
<ul>
<li>In 2.1.364, the ruler functionality was removed. It&#8217;s still included here.</li>
<li>2.1.364 changed the way that wrapped lines are presented. I&#8217;ve kept the old presentation, but modified the icon that&#8217;s used to signify wrapping.</li>
<li>Internally, 2.1.364 uses a separate HTML table to represent each line of source.  I&#8217;ve maintained the older pattern of simply using a separate <code>&lt;div&gt;</code> to represent each line.</li>
</ul>
<p>Other new bits in this release:</p>
<ul>
<li>I&#8217;ve added a &#8220;Hide&#8221; button to the toolbar. Code blocks can now be expanded/collapsed at the user&#8217;s whim. (As before, an author can still choose to present a block in the hidden/collapsed state.)</li>
<li><del datetime="2009-12-18T19:16:34+00:00">When both a horizontal scrollbar and a gutter (the line number column) are displayed, the scrollbar no longer extends under the gutter. (I <em>think</em> this is an improvement; as always, feedback is welcome.)</del> <em>Update: I&#8217;ve backed out this change; it was causing problems with IE7.</em></li>
</ul>
<p>The ZIP file is available through the <a href="http://www.outofwhatbox.com/blog/syntaxhighlighter-downloads/">Download Page</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.outofwhatbox.com/blog/2009/12/update-of-forked-syntaxhighlighter/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Do not adjust your eyes; it&#8217;s just a new theme.</title>
		<link>http://www.outofwhatbox.com/blog/2009/12/do-not-adjust-your-eyes-its-just-a-new-theme/</link>
		<comments>http://www.outofwhatbox.com/blog/2009/12/do-not-adjust-your-eyes-its-just-a-new-theme/#comments</comments>
		<pubDate>Thu, 17 Dec 2009 05:43:33 +0000</pubDate>
		<dc:creator>Dan Breslau</dc:creator>
				<category><![CDATA[Software Development]]></category>
		<category><![CDATA[Web]]></category>
		<category><![CDATA[syntaxhighlighter]]></category>

		<guid isPermaLink="false">http://www.outofwhatbox.com/blog/?p=606</guid>
		<description><![CDATA[I&#8217;ve just gone live with a different WordPress theme. So, if you&#8217;re feeling a bit lost, don&#8217;t panic. This is a new theme, iTech, which was just released last month. I&#8217;ve tailored the CSS, fixed a bug or two in the php, and even managed to add a new customization option. It&#8217;s been a good [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve just gone live with a different <a href="http://codex.wordpress.org/Using_Themes">WordPress theme</a>. So, if you&#8217;re feeling a bit lost, don&#8217;t panic. This is a new theme, <a href="http://zacklive.com/itech-theme-free-wordpress-theme-for-gadgets-and-tech-blogs/690/">iTech</a>, which was just released last month. I&#8217;ve tailored the CSS, fixed a bug or two in the php, and even managed to add a new customization option. It&#8217;s been a good learning experience. Most importantly, I hope you&#8217;ll like the new look. Let me know what you think.</p>
<h4>New version of SyntaxHighlighter coming</h4>
<p>Some folks have asked me offline if I&#8217;m going to be merging Alex Gorbatchev&#8217;s 2.1.364 version of <a href="http://alexgorbatchev.com/wiki/SyntaxHighlighter">SyntaxHighlighter</a> with the <a href="http://www.outofwhatbox.com/blog/tag/syntaxhighlighter/">OOWB fork</a> that I&#8217;ve been supporting here. The answer to that is yes; if all goes well, I should be posting it in the next few days. As for creating a public repository, perhaps on <a href="http://bitbucket.org/">bitbucket</a>: Maybe. If you&#8217;d like to see this happen, please let me know. Actually, I&#8217;d like to hear from anyone who&#8217;s rolled out the OOWB fork: What site(s) are you using it on, and how is it working out? Please leave a comment, or <a href="http://www.outofwhatbox.com/blog/contact/">send email</a>.</p>
<p>Did I say &#8220;fork&#8221;? Uh, yeah. While 2.1.364 is solid, it doesn&#8217;t cover most of the improvements that I&#8217;ve made (check the <a href="http://www.outofwhatbox.com/blog/syntaxhighlighter-downloads/">Download Page</a> to see a summary.) Ultimately, I&#8217;d like to see these changes rolled back into Alex&#8217;s version, but that doesn&#8217;t appear to be happening anytime soon. There could be good reasons for that; certainly my version looks larger, just for starters. (I <em>think</em> that the <em>total</em> footprint may be smaller, but that&#8217;s a tough call.) All I can really say is that we&#8217;ll see how it goes. I plan to keep making improvements in the software, at least for my own sake; but I regard Alex as the primary author and will continue to do so.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.outofwhatbox.com/blog/2009/12/do-not-adjust-your-eyes-its-just-a-new-theme/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>JavaScript Array Performance, And Why It Matters</title>
		<link>http://www.outofwhatbox.com/blog/2009/12/javascript-array-performance-and-why-it-matters/</link>
		<comments>http://www.outofwhatbox.com/blog/2009/12/javascript-array-performance-and-why-it-matters/#comments</comments>
		<pubDate>Tue, 01 Dec 2009 15:21:13 +0000</pubDate>
		<dc:creator>Dan Breslau</dc:creator>
				<category><![CDATA[JavaScript]]></category>
		<category><![CDATA[Software Development]]></category>
		<category><![CDATA[Web]]></category>
		<category><![CDATA[JavaScript arrays]]></category>
		<category><![CDATA[JavaScript performance]]></category>

		<guid isPermaLink="false">http://www.outofwhatbox.com/blog/?p=588</guid>
		<description><![CDATA[Arrays can present an unexpected performance bottleneck in JavaScript. Here I show that array performance is influenced by some surprising details of the array and how it's used.]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.outofwhatbox.com/blog/2009/11/javascript-array-performance-initialize-to-optimize/">My last post</a> described how a JavaScript array had become a performance bottleneck. Here, I&#8217;ll delve further into how some JavaScript programs become array-bound, and how to break that bind when you need to.</p>
<p>Let&#8217;s start with this:</p>
<div class="oowbcenter" ><img title="JavaScript array elements can be much slower than scalar variables" src="http://img.outofwhatbox.com/JSArrayOptimizationII/ArraysVsScalars.png" alt="JavaScript array elements can be much slower than scalar variables" /></div>
<p>As the graph shows, JavaScript array performance ranges from OK (nearly equivalent to scalars) to awful. The reasons for poor performance vary, but most can be boiled down to this: The JavaScript interpreter is essentially left guessing about how an array will be used. And guess it does. When an array is <a href="http://www.outofwhatbox.com/blog/2009/11/javascript-array-performance-initialize-to-optimize/">constructed and initialized</a>, the interpreter can observe where data is stored into a new array. Certain kinds of patterns may nudge it to optimize for better speed, or lower memory consumption. So, performance suffers when the interpreter makes the wrong interpretation.</p>
<p>I looked into performance of JavaScript arrays with three popular Windows browsers: Internet Explorer 8, Google Chrome 3.0, and Firefox 3.5. Broadly speaking, these all seem to look for two or three types of arrays:</p>
<ul>
<li><em>Dense arrays</em>: These provide faster access to individual elements of the array.</li>
<li><em>Sparse arrays</em>: These are typically optimized for lower memory use.
</li>
<li><em>Sized arrays</em>: A variant of sparse arrays that are optimized for speed for certain cases.</li>
</ul>
<p>The interpreters optimize for dense arrays if the array&#8217;s elements are initialized in a continuous range, starting at index 0. This initialization can be done before there&#8217;s any data for the array, by using placeholders such as <code>0</code>, <code>null</code>, or even <code>undefined</code>. (There are other ways to get this optimization; this is simply the most reliable approach.)</p>
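<p>As a minimal sketch of that initialization pattern (the function name <code>makeClearedArray</code> is mine, and whether a given interpreter rewards the pattern is something to measure rather than assume):</p>

```javascript
// Fill every index from 0 up to length - 1 with a placeholder, in order, so
// the array is populated as one continuous range right after construction.
function makeClearedArray(length, placeholder) {
  var arr = [];
  for (var i = 0; i < length; i++) {
    arr[i] = placeholder;
  }
  return arr;
}
```

<p>For example, <code>makeClearedArray(1000, 0)</code> gives an array of 1,000 zeros that can then be overwritten with real data as it arrives.</p>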
<p>Passing the array&#8217;s size in the constructor may also bring better performance. In my testing, I saw this only with sparse arrays in IE8 and (perhaps) in Chrome. I&#8217;ll refer to these as <em>sized arrays</em>; see <a href="#jsarraysII-sized">below</a> for some additional notes.</p>
<p>If an array is created as neither a dense nor a sized array, it&#8217;s treated as a sparse array.</p>
<p>As a general rule, the array&#8217;s behavior is established shortly after construction. If you sparsely populate an array, then assign other values to the remaining elements, you&#8217;re left with a densely-populated sparse array.<a href="#jsarraysII-note-1" name="jsarraysII-ref-1"><sup>1</sup></a></p>
<div class="oowbsidebar">
<p>Hold it there. A <em>densely populated sparse array</em>? Holy <a href="http://www.joelonsoftware.com/articles/LeakyAbstractions.html">leaky abstractions</a>, Batman!</p>
<p>Perhaps we&#8217;re lacking some terms. If we&#8217;re trying to coax better performance from a <em>sparse</em> array by filling it up with placeholder values, that doesn&#8217;t make it a <em>dense</em> array, at least not from your program&#8217;s perspective. To describe it from the interpreter&#8217;s perspective, I&#8217;ll refer to an array that&#8217;s initialized after creation (with live or dead data) as a <em>cleared</em> array. An uninitialized array that is populated at random indices is a <em>default</em> array, regardless of how dense it eventually becomes. Finally, a <em>sized</em> array is, well, a sized array. So, that &#8220;densely populated sparse array&#8221; is now a &#8220;densely populated default array.&#8221; That&#8217;s still an awkward phrase, but at least it&#8217;s not an oxymoron.</p></div>
<p>Now that we can finally get to some data, let&#8217;s discuss scalability. This graph represents the time taken to read about 30,000,000 values from arrays of various sizes, using three Windows browsers. Seven data sets are presented: Cleared and default arrays for each of IE8, Google Chrome 3.0, and Firefox 3.5, plus sized arrays for IE8.</p>
<div class="oowbcenter" ><a href="http://img.outofwhatbox.com/JSArrayOptimizationII/ArraySizeVsRunTimeLarge.png" target="blank"><img title="As array size increases, run time may increase as well. (Click for larger graph with legend)" src="http://img.outofwhatbox.com/JSArrayOptimizationII/ArraySizeVsRunTimeSmall.png" alt="As array size increases, run time may increase as well. (Click for larger graph with legend)" /></a>
<p class="oowb-caption-text">Array Size Vs. Access Time. (Click for larger graph with legend)</p>
</div>
<p>Note that in three of the seven cases, time grows linearly as size increases<a href="#jsarraysII-note-2" name="jsarraysII-ref-2"><sup>2</sup></a>. This growth effectively multiplies the program&#8217;s <a href="http://en.wikipedia.org/wiki/Big_O_notation">complexity</a> by O(N). That is, an algorithm that might normally have O(N) performance instead shows O(N<sup>2</sup>) performance, and so on. This can have a sizable impact on scalability.</p>
<p>The graph makes it clear that the fastest arrays are cleared. However, making a very sparse cleared array would gobble up a large swath of memory for a small number of values. For such cases, you may want to stick with sparse or sized arrays. Here&#8217;s a closer look at their performance.</p>
<p>The following tests all used arrays of 180,000 elements. Each data point represents performance with a different &#8220;hit ratio&#8221;; that is, the ratio of defined values read from the array. While array density also varied, its effect appears in only one case, which I&#8217;ll show separately<a href="#jsarraysII-note-3" name="jsarraysII-ref-3"><sup>3</sup></a>. For reference, these graphs also show the performance with cleared arrays.</p>
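<p>To show how density and hit ratio can be varied independently, here&#8217;s a rough sketch of a test of this kind. It&#8217;s an assumed reconstruction for illustration, not the harness that produced these graphs; the names <code>makeSparseArray</code> and <code>measureReads</code> are hypothetical.</p>

```javascript
// Build a default (uninitialized) array and populate every step-th index.
// Density is therefore roughly 1 / step.
function makeSparseArray(length, step) {
  var arr = [];
  for (var i = 0; i < length; i += step) {
    arr[i] = i;
  }
  arr.length = length; // extend to the full length without adding elements
  return arr;
}

// Read `reads` elements, `stride` indices apart (wrapping around), and count
// how many of them were defined. Hit ratio is hits / reads.
function measureReads(arr, reads, stride) {
  var hits = 0;
  var start = Date.now();
  for (var n = 0, i = 0; n < reads; n++, i = (i + stride) % arr.length) {
    if (arr[i] !== undefined) {
      hits++;
    }
  }
  return { hits: hits, hitRatio: hits / reads, elapsedMs: Date.now() - start };
}
```

<p>With an array built by <code>makeSparseArray(180000, 10)</code>, a stride of 1 reads mostly undefined elements, while a stride of 10 reads only defined ones; the array&#8217;s density is the same in both runs.</p>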
<p>This graph shows the performance of sized, default, and cleared arrays in IE8. Notice that the &#8220;sized&#8221; trendline (in blue) begins significantly above the &#8220;default&#8221; trendline, and ends slightly above the &#8220;cleared&#8221; trendline. Clearly, IE optimizes sized arrays to work better when reading defined values.</p>
<div class="oowbcenter" ><a href="http://img.outofwhatbox.com/JSArrayOptimizationII/IEHitsLarge.png" target="blank"><img title="Performance of defined and undefined array elements in IE8. (Click for larger graph with legend)" src="http://img.outofwhatbox.com/JSArrayOptimizationII/IEHitsSmall.png" alt="Performance of defined and undefined array elements in IE8. (Click for larger graph with legend)" /></a>
<p class="oowb-caption-text">Array Hits Vs. Access Time in IE8. (Click for larger graph with legend)</p>
</div>
<p>Firefox 3.5 also looks slower when accessing undefined elements, but there&#8217;s no real gain from passing a size parameter to the array constructor:</p>
<div class="oowbcenter" ><a href="http://img.outofwhatbox.com/JSArrayOptimizationII/FFHitsLarge.png" target="blank"><img title="Performance of defined and undefined array elements in Firefox 3.5. (Click for larger graph with legend)" src="http://img.outofwhatbox.com/JSArrayOptimizationII/FFHitsSmall.png" alt="Performance of defined and undefined array elements in Firefox 3.5. (Click for larger graph with legend)" /></a>
<p class="oowb-caption-text">Array Hits Vs. Access Time in Firefox 3.5. (Click for larger graph with legend)</p>
</div>
<p>In Chrome, array access times are clustered into high and low ranges of values, based on array density. Chrome performs markedly better with arrays having a density of 10% or higher, vs. arrays of lower density. The data sets below have been split accordingly. Since there&#8217;s much more variation <em>between</em> these two ranges than there is <em>within</em> them, it&#8217;s a good bet that the 10% threshold is hard-coded somewhere in Chrome&#8217;s V8 JavaScript interpreter.</p>
<div class="oowbcenter" ><a href="http://img.outofwhatbox.com/JSArrayOptimizationII/ChromeSplitLarge.png" target="blank"><img title="Array Hits Vs. Access Time in Chrome, by array density. (Click for larger graph with legend)" src="http://img.outofwhatbox.com/JSArrayOptimizationII/ChromeSplitSmall.png" alt="Array Hits Vs. Access Time in Chrome, by array density. (Click for larger graph with legend)" /></a>
<p class="oowb-caption-text">Array Hits Vs. Access Time in Chrome, by array density. (Click for larger graph with legend)</p>
</div>
<p><a name="jsarraysII-sized"></a><br />
<h3>Notes on sized arrays</h3>
<p>It makes sense that <i>if</i> a size is passed to the array constructor, <i>and</i> the size value is accurate, then the interpreter can optimize the array for that size. The problem is that the size value won&#8217;t always be accurate; nor does it suggest how dense the array might become. Hence the interpreter may not be able to rely on the size value even when it&#8217;s present.</p>
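<p>A short illustration of why the size alone says so little: a sized array reports its length immediately, but it holds no elements until something is stored in it. The helper <code>countOwnElements</code> is just a name I&#8217;ve made up for this example.</p>

```javascript
var sized = new Array(5);      // length is 5, but no elements exist yet
var cleared = [0, 0, 0, 0, 0]; // length is 5, and all five elements exist

// Count the indices that actually hold an element.
function countOwnElements(arr) {
  var count = 0;
  for (var i = 0; i < arr.length; i++) {
    if (i in arr) {
      count++;
    }
  }
  return count;
}
```

<p>Here <code>countOwnElements(sized)</code> is 0 while <code>countOwnElements(cleared)</code> is 5, even though both arrays have the same <code>length</code>. The interpreter sees the same ambiguity.</p>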
<p>Sized array behavior in IE8 isn&#8217;t what I&#8217;d expected. I&#8217;d found a note written prior to IE8&#8217;s release by one of its developers, which I had taken to mean that <a href="http://blogs.msdn.com/jscript/archive/2008/03/25/performance-optimization-of-arrays-part-i.aspx">these arrays should perform like cleared arrays</a>, but that&#8217;s clearly not the case. Sized arrays in IE8 actually incur a small <em>penalty</em> when accessing undefined array elements, to the point where if you access <em>only</em> undefined elements, a sized array may be slower than a default array. On a closer reading, the note refers to &#8220;any <em>indexed</em> entry&#8221; within the array&#8217;s range <em>[emphasis mine.]</em> Of course, undefined entries wouldn&#8217;t be indexed. The note was accurate, but arguably didn&#8217;t go far enough.</p>
<p>In Chrome, using an explicit size for a sparse array does seem to make a small but measurable difference in some cases. But on the whole, there&#8217;s little reason to use this for improved performance, especially since Chrome is currently the browser <em>least in need</em> of a performance boost in these tests. </p>
<p>The performance of Firefox 3.5 suggests that it ignores the constructor&#8217;s size parameter. However, a quick trip through the <a href="http://hg.mozilla.org/releases/mozilla-1.9.1">browser&#8217;s source code</a> indicates that the size <i>should</i> make a difference. Perhaps other kinds of tests would be able to draw this out, or perhaps there&#8217;s an opportunity for improvement in the code.</p>
<h3>Memory Consumption</h3>
<p>Measuring the physical size of JavaScript data is a sketchy undertaking. About the best one can do is to compare the virtual memory size of the browser process before and after creating a large array. This isn&#8217;t going to be very accurate. All the same, where there&#8217;s a large difference between values, it&#8217;s probably significant.</p>
<table style="" border="0" cellspacing="0" cellpadding="2" >
<tbody>
<tr class="oowbfirstRow">
<th style="text-align: left;" ><strong><code>Browser</code></strong></th>
<th style="text-align: center;" ><strong><code>Size Estimate, Default / Sized Arrays (Bytes)</code></strong></th>
<th style="text-align: center;" ><strong><code>Size Estimate, Cleared Arrays (Bytes)</code></strong></th>
</tr>
<tr>
<td><strong>Internet Explorer 8</strong></td>
<td style="text-align: left;"><em>(# of values)</em> x (approximately 76)</td>
<td style="text-align: left;"><em>(Length of array)</em> x (approx. 46)</td>
</tr>
<tr>
<td><strong>Firefox 3.5</strong></td>
<td style="text-align: left;"><em>(# of values)</em> x (approx. 63)</td>
<td style="text-align: left;"><em>(Length of array)</em> x (approx. 5)</td>
</tr>
<tr>
<td><strong>Google Chrome 3.0</strong></td>
<td style="text-align: left;"><em>(Length of array)</em> x (30–70)</td>
<td style="text-align: left;"><em>(Length of array)</em> x (approx. 14)</td>
</tr>
</tbody>
</table>
<p>In my last post, I wrote that a sparse 12K array of integers, implemented as a hash table, would likely consume over 36K of memory. If these measurements are reasonable, that was a gross understatement; a sparse 12K array could actually consume more than 70 bytes per element, or upwards of 860K.</p>
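<p>The arithmetic behind that figure, spelled out. I&#8217;m assuming here that &#8220;12K&#8221; means 12 &times; 1024 elements and using 70 bytes as the per-element cost; <code>estimateBytes</code> is a throwaway name for the example.</p>

```javascript
// Rough estimate only: elements times an assumed per-element cost in bytes.
function estimateBytes(elementCount, bytesPerElement) {
  return elementCount * bytesPerElement;
}

var elements = 12 * 1024;                     // a "12K" array, taking K as 1024
var totalBytes = estimateBytes(elements, 70); // 860,160 bytes
```

<p>That&#8217;s 860,160 bytes, or about 840 KB if you count in units of 1,024.</p>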
<p>At the extremes, for two arrays with the same length, one that&#8217;s very dense may use <em>less</em> memory than one of lower density—even though the lower density array, by definition, contains fewer values. Chrome and Firefox seem to account for this internally as they organize the array structure, but I&#8217;m not sure whether IE8 does.</p>
<h3>Recommendations</h3>
<div class="oowbbtw">Remember that premature optimization is folly, and all optimization has its costs. If you use the hints that I&#8217;ve described, keep in mind that the interpreter isn&#8217;t obliged to obey your intentions. Consider creating a factory method for arrays, so that <del>if</del> <ins>as</ins> the rules change, you can most easily adapt your code to suit.</div>
<ul>
<li>Use cleared arrays if speed is critical, <strong>or</strong> if the array reaches around 50% density. But don&#8217;t use them habitually, especially not for sparse arrays.</li>
<li>Because of IE8&#8217;s quirks, you should think twice before creating sized arrays. They&#8217;re helpful <strong>only if</strong> you have sparse data <strong>and</strong> you won&#8217;t often access an undefined array element.</li>
</ul>
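<p>A minimal sketch of the factory idea. The name <code>createArray</code> and its options are hypothetical, and the sketch assumes that a &#8220;sized&#8221; array means one built with <code>new Array(length)</code> and a &#8220;cleared&#8221; array means one whose elements have all been written, in order, from index 0.</p>

```javascript
// Hypothetical factory: `createArray` and its option names are
// illustrative, not part of any library.
function createArray(length, options) {
    options = options || {};

    // "Sized": pass the expected length to the constructor.
    var arr = options.sized ? new Array(length) : [];

    if (options.cleared) {
        // "Cleared": write every element, in order, starting from 0.
        var hasFill = Object.prototype.hasOwnProperty.call(options, 'fill');
        var fill = hasFill ? options.fill : 0;
        for (var i = 0; i < length; i++) {
            arr[i] = fill;
        }
    }
    return arr;
}

var lookup = createArray(8, { cleared: true, fill: false });
var sizedOnly = createArray(8, { sized: true });
var plain = createArray(8);
```

<p>Keeping the choice in one place means that a change in strategy (say, dropping the sized constructor for one browser) touches only the factory, not every call site.</p>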
<h3>Looking ahead</h3>
<p>JavaScript&#8217;s arrays are pleasant enough in normal use; but like all abstractions, they have their leaks. As JavaScript is used for increasingly sophisticated applications, it might be worthwhile for its designers to take a fresh look at scalability. There are ways to get the desired results, but negotiation via secret handshake doesn&#8217;t scale terribly well.</p>
<p>Without marring JavaScript&#8217;s simplicity, it should be possible to extend the language so that, <em>when necessary</em>, the developer can make plain to the interpreter what it should expect for a particular array. For example, this could mean adding APIs or syntax that let the developer declare the array&#8217;s expected size, density, and so forth. Or the problem could be addressed from the other direction: Simply allow the developer to request a structure that favors higher speed, or more compactness, for a particular array.</p>
<p>And although read-only objects may be a special case, I started down this path by looking at an <a href="http://www.outofwhatbox.com/blog/2009/11/trimming-trim-via-razing-arrays-javascript/">array used as a lookup table</a> which is effectively read-only after initialization. It would be nice if I could let the interpreter know this. A version of <a href="http://ruby-doc.org/core/classes/Object.html#M000356">Ruby&#8217;s <code>freeze</code> method</a> would fit the bill. This would give the interpreter a hint to optimize the array for read-only access, though it wouldn&#8217;t be required to do so.</p>
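<p>ECMAScript 5 defines <code>Object.freeze</code>, which can be applied to arrays. The following is only a sketch, assuming an engine that supports it. Freezing guarantees that the array can&#8217;t change; whether an interpreter uses that guarantee to pick faster storage is entirely up to the implementation.</p>

```javascript
// A small lookup table, fully initialized and then frozen.
var whiteSpace = [];
for (var i = 0; i <= 0x20; i++) {
    whiteSpace[i] = false;
}
whiteSpace[0x09] = true;
whiteSpace[0x20] = true;

Object.freeze(whiteSpace);

// Later writes have no effect: they are ignored in non-strict code
// and throw a TypeError in strict mode.
try {
    whiteSpace[0x41] = true;
} catch (e) {
    // strict mode lands here
}

var frozen = Object.isFrozen(whiteSpace);
```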
<p>Dreaming further, I&#8217;m also holding out hope that closures can be better optimized. After initialization, the lookup table in my version of <code>String.trim()</code> was only accessible <a href="http://www.outofwhatbox.com/blog/2009/11/trimming-trim-via-razing-arrays-javascript/">to a single, read-only method</a>. If the interpreter can verify that nothing&#8217;s going to change the array, it could move it to faster storage. Yes, this brings us back around to secret handshakes. But since closures have their own merits, I see this more as the icing on the cake.</p>
<p />
<hr />
<div><p><a name="jsarraysII-note-1" href="#jsarraysII-ref-1"><sup>1</sup></a> Making a fresh copy using <code>slice(0)</code> should get you better performance. (Well, it&#8217;s worked for me, but I make no promises.)</p>
<p><a name="jsarraysII-note-2" href="#jsarraysII-ref-2"><sup>2</sup></a> There are really four cases, but the fourth one (cleared arrays in IE8) shows <em>very</em> slow growth in runtime. Also note that the numbers suggest that Chrome&#8217;s performance <em>improves</em> slightly as array size increases. This may be an artifact of the benchmark.</p>
<p><a name="jsarraysII-note-3" href="#jsarraysII-ref-3"><sup>3</sup></a> Density and hit ratio can be covariant, but here they aren&#8217;t. That is, these tests were designed so that, as long as the density was under 100%, the hit ratio could vary independently.</p></div>
]]></content:encoded>
			<wfw:commentRss>http://www.outofwhatbox.com/blog/2009/12/javascript-array-performance-and-why-it-matters/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>JavaScript Array Performance: Initialize to Optimize</title>
		<link>http://www.outofwhatbox.com/blog/2009/11/javascript-array-performance-initialize-to-optimize/</link>
		<comments>http://www.outofwhatbox.com/blog/2009/11/javascript-array-performance-initialize-to-optimize/#comments</comments>
		<pubDate>Tue, 17 Nov 2009 16:39:22 +0000</pubDate>
		<dc:creator>Dan Breslau</dc:creator>
				<category><![CDATA[JavaScript]]></category>
		<category><![CDATA[Web]]></category>
		<category><![CDATA[JavaScript arrays]]></category>
		<category><![CDATA[JavaScript performance]]></category>

		<guid isPermaLink="false">http://www.outofwhatbox.com/blog/?p=584</guid>
		<description><![CDATA[The JavaScript interpreters in the most popular browsers distinguish between sparse and dense arrays. Hence, you may get more  "array-like" performance if you initialize an array before using it. But this trades memory for speed, so it may not always be the best choice.]]></description>
			<content:encoded><![CDATA[<p>After delving into the issue of <a href="http://www.outofwhatbox.com/blog/2009/11/trimming-trim-via-razing-arrays-javascript/">JavaScript array performance</a>, I came upon a <a href="http://blogs.msdn.com/jscript/archive/2008/04/08/performance-optimization-of-arrays-part-ii.aspx">pair of</a> <a href="http://blogs.msdn.com/jscript/archive/2008/03/25/performance-optimization-of-arrays-part-i.aspx">blog entries</a> on MSDN addressing the topic. These posts describe situations where the then-forthcoming IE8 could provide more performant arrays. The blog also partially confirms, and partially refutes, my thoughts on how JavaScript interpreters handle arrays. The blog&#8217;s focus, of course, is the JScript engine in IE, but the advice that it offers seems to work well with other major browsers.</p>
<p>The blog confirms that IE&#8217;s JavaScript interpreter manages arrays using a nonlinear structure, essentially a hash table. I&#8217;d guessed at this after seeing lower performance in <code><a href="http://www.outofwhatbox.com/blog/2009/11/trimming-trim-via-razing-arrays-javascript/">trimOOWB</a></code> when its lookup table was implemented as a large, and sparse, array. But contrary to my guess, this isn&#8217;t about avoiding re-allocation bottlenecks. It&#8217;s for dealing with <a href="http://www.dragonthoughts.biz/technical/sparsearray.html">sparse arrays</a> without consuming unreasonable amounts of memory<sup><a href="#jsarray-init-note-1" name="jsarray-init-ref-1">1</a></sup>. (I&#8217;m feeling a little sheepish about this: Even though I&#8217;d noted the sparseness of the lookup table, I hadn&#8217;t made the connection. In fact, I&#8217;d been somewhat dismissive of that possibility. <a href="http://www.outofwhatbox.com/blog/2009/04/how_to_make_mistakes/">Live and learn</a>.)</p>
<p>The blog also says that as of IE8, the JScript engine has heuristics for determining if an array is &#8220;dense&#8221;. It implements a dense array with a linear, array-like index in addition to its standard hash-like data structure; this index can make random access into the array much faster. The blog indicates that IE8 considers an array to be dense if <strong>either</strong> of the following conditions is true:</p>
<ul>
<li>You construct the array with an explicit size, and you don&#8217;t grow the array past that initial size.</li>
<li>After creating the array, you initialize a continuous range of indices, starting from 0, up to and including the highest index that you expect to use. You must do this before writing into random indices within the array.<sup><a href="#jsarray-init-note-2" name="jsarray-init-ref-2">2</a></sup></li>
</ul>
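<p>The two conditions above can be sketched as follows. The size and values are arbitrary, and whether a given interpreter actually treats either array as dense is up to its heuristics.</p>

```javascript
var SIZE = 1000;

// Condition 1: construct with an explicit size, and never grow past it.
var sized = new Array(SIZE);
sized[10] = 'a';
sized[SIZE - 1] = 'z';        // highest index is still within the initial size

// Condition 2: initialize a continuous range starting from 0,
// and only then write to random indices.
var filled = [];
for (var i = 0; i < SIZE; i++) {
    filled[i] = 0;
}
filled[700] = 42;
```

<p>Note that the first array still has no values at most indices, while the second has a value at every index.</p>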
<p>As the writer says, using either of these two techniques should <em>ensure</em> that the array heuristics in IE 8 will treat your array as a dense array. The interpreter is free to decide to treat other arrays as dense, too. However, the testing I&#8217;ve done suggests that it&#8217;s not quite as simple as this. Contrary to the blog&#8217;s guidelines, I found that:</p>
<ul>
<li>Creating the array with an explicit size (whether or not followed by initializing the elements) <em><strong>had no measurable impact</strong></em> on the speed of the <code>trim</code> method.
</li>
<li>Initializing the lookup table&#8217;s full range of 12,289 elements (the vast majority of which are not whitespace) <em><strong>did</strong></em> improve performance.
</li>
</ul>
<p>This finding is specific to the trim method, and so these results <strong>should not be used as general performance guidelines.</strong> The usage pattern for a particular array can greatly affect its performance profile. I&#8217;ve found unrelated cases where specifying the array&#8217;s size could help <em><strong>or</strong></em> hurt performance. (I&#8217;ll write more about this in my next post.)</p>
<p>Following up on these hints led to a <a href="#jsarray-trim18-implementation">new version of </a><a href="#jsarray-trim18-implementation"><code>trim</code></a> that&#8217;s simpler and faster than my previous effort. This version is actually more closely related to the <code><a href="http://yesudeep.wordpress.com/2009/07/31/even-faster-string-prototype-trim-implementation-in-javascript/">trim17</a></code> method from Yesudeep Mangalapilly than it is to <code>trimOOWB</code>. Hence I&#8217;ve named this newer one <code>trim18</code> to reaffirm its heritage. (In Steve Levithan&#8217;s original post on <a href="http://blog.stevenlevithan.com/archives/faster-trim-javascript">JavaScript trim methods</a>, he assigned numbers to each of the <code>trim</code> implementations that he examined. Yesudeep took up that naming scheme, and now I&#8217;m doing so as well.)</p>
<div class="oowbbtw">
<h3>An Intermediary Thought</h3>
<p>Remember that premature optimization is folly, and all optimization has its costs. In this case, we&#8217;re buying performance by fully initializing a 12K JavaScript array. If the interpreter stores this using a hash table <em>as well as</em> an array, then the table will likely consume over 36K of memory. Is the benefit worth the cost? It could be, especially if we&#8217;re not running on a cell phone. But <strong>please</strong> don&#8217;t take away from this that you should initialize <strong>every</strong> array that you create. That would ultimately be self-defeating: One of the worst things you can do for performance is to consume more memory than you really need.</div>
<h2>Performance</h2>
<p>Since the last post, I&#8217;ve revised the benchmark for better precision. The structure is the same: both benchmarks call the three <code>trim</code> methods repeatedly with a fixed set of input data. In the newer benchmark, however, the input strings are much longer, and the number of iterations is lower. This should yield more precise measurements of execution time. All the same, these results should be used with care; as is often said in the U.S.A., your mileage will vary.</p>
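<p>For readers who want to try something similar, here is a minimal timing sketch. It is not the benchmark used for the tables in this post: the names <code>timeTrim</code>, <code>padded</code> and <code>regexTrim</code>, the input lengths and the iteration count are all made up for illustration.</p>

```javascript
// Hypothetical harness; not the benchmark behind the published numbers.
function timeTrim(trimFn, inputs, iterations) {
    var start = new Date().getTime();
    for (var n = 0; n < iterations; n++) {
        for (var j = 0; j < inputs.length; j++) {
            trimFn(inputs[j]);
        }
    }
    return new Date().getTime() - start;    // elapsed milliseconds
}

// Build one long input: `count` copies of `ws` on each side of a short body.
function padded(ws, count) {
    var run = new Array(count + 1).join(ws);
    return run + 'payload' + run;
}

// Stand-in trim so the sketch runs on its own; swap in the functions under test.
function regexTrim(str) {
    return str.replace(/^\s+/, '').replace(/\s+$/, '');
}

var inputs = [padded(' ', 5000), padded('\t', 5000)];
var elapsed = timeTrim(regexTrim, inputs, 20);
console.log('regexTrim: ' + elapsed + ' ms');
```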
<p>The benchmark results are shown below. Please remember that these tables are not directly comparable to the numbers in my previous post.</p>
<h3>ASCII data (only spaces and tabs used for whitespace)</h3>
<table style="" border="0" cellspacing="0" cellpadding="2" >
<tbody>
<tr class="oowbfirstRow">
<th style="text-align: right;"></th>
<th style="text-align: right;" ><strong><code>trim17</code></strong></th>
<th style="text-align: right;" ><strong><code>trimOOWB</code></strong></th>
<th style="text-align: right;" ><strong><code>trimOOWB: % saved vs. trim17</code></strong></th>
<th style="text-align: right;" ><strong><code>trim18</code></strong></th>
<th style="text-align: right;" ><strong><code>trim18: % saved vs. trim17</code></strong></th>
</tr>
<tr>
<td><strong>Internet Explorer 8</strong></td>
<td style="text-align: right;">74,453</td>
<td style="text-align: right;">69,609</td>
<td style="text-align: right;">6.5</td>
<td style="text-align: right;" >69,922</td>
<td style="text-align: right;" >6.1</td>
</tr>
<tr>
<td><strong>Firefox 3.5</strong></td>
<td style="text-align: right;">6,776</td>
<td style="text-align: right;">3,732</td>
<td style="text-align: right;">44.9</td>
<td style="text-align: right;" >3,003</td>
<td style="text-align: right;" >55.7</td>
</tr>
<tr>
<td><strong>Google Chrome 3.0</strong></td>
<td style="text-align: right;">2,530</td>
<td style="text-align: right;">824</td>
<td style="text-align: right;">67.4</td>
<td style="text-align: right;" >754</td>
<td style="text-align: right;" >70.2</td>
</tr>
</tbody>
</table>
<h3>Unicode data (using all ASCII and Unicode whitespace characters)</h3>
<table style="" border="0" cellspacing="0" cellpadding="2" >
<tbody>
<tr class="oowbfirstRow">
<th style="text-align: right;"></th>
<th style="text-align: right;" ><strong><code>trim17</code></strong></th>
<th style="text-align: right;" ><strong><code>trimOOWB</code></strong></th>
<th style="text-align: right;" ><strong><code>trimOOWB: % saved vs. trim17</code></strong></th>
<th style="text-align: right;" ><strong><code>trim18</code></strong></th>
<th style="text-align: right;" ><strong><code>trim18: % saved vs. trim17</code></strong></th>
</tr>
<tr>
<td><strong>Internet Explorer 8</strong></td>
<td style="text-align: right;">76,188</td>
<td style="text-align: right;">74,468</td>
<td style="text-align: right;">2.3</td>
<td style="text-align: right;" >72,484</td>
<td style="text-align: right;" >4.9</td>
</tr>
<tr>
<td><strong>Firefox 3.5</strong></td>
<td style="text-align: right;">6,779</td>
<td style="text-align: right;">5,025</td>
<td style="text-align: right;">25.6</td>
<td style="text-align: right;" >3,029</td>
<td style="text-align: right;" >55.3</td>
</tr>
<tr>
<td><strong>Google Chrome 3.0</strong></td>
<td style="text-align: right;">2,780</td>
<td style="text-align: right;">940</td>
<td style="text-align: right;">66.2</td>
<td style="text-align: right;" >810</td>
<td style="text-align: right;" >70.9</td>
</tr>
</tbody>
</table>
<h2><a name="jsarray-trim18-implementation"></a>trim18 implementation</h2>
<p>
<em>I left the explicit size in the array constructor call, even though I&#8217;d found no benefit from using it. It seems unlikely to cause any harm, and there may be environments where this method&#8217;s performance might benefit from it.</em></p>
<pre class="brush: jscript;">
var trim18 = (function() {

    var tableSize = 0x3000 + 1;
    var whiteSpace = new Array(tableSize);

    // Initialize the array elements before populating the data.
    // (This may help performance, by hinting to the interpreter that
    //  the array should not be managed as a sparse array.)

    for (var i = 0; i &lt; tableSize; i++) {
        whiteSpace[i] = false;
    }

    whiteSpace[0x0009] = true;  whiteSpace[0x000a] = true;
    whiteSpace[0x000b] = true;  whiteSpace[0x000c] = true;
    whiteSpace[0x000d] = true;  whiteSpace[0x0020] = true;
    whiteSpace[0x0085] = true;  whiteSpace[0x00a0] = true;
    whiteSpace[0x1680] = true;  whiteSpace[0x180e] = true;
    whiteSpace[0x2000] = true;  whiteSpace[0x2001] = true;
    whiteSpace[0x2002] = true;  whiteSpace[0x2003] = true;
    whiteSpace[0x2004] = true;  whiteSpace[0x2005] = true;
    whiteSpace[0x2006] = true;  whiteSpace[0x2007] = true;
    whiteSpace[0x2008] = true;  whiteSpace[0x2009] = true;
    whiteSpace[0x200a] = true;  whiteSpace[0x200b] = true;
    whiteSpace[0x2028] = true;  whiteSpace[0x2029] = true;
    whiteSpace[0x202f] = true;  whiteSpace[0x205f] = true;
    whiteSpace[0x3000] = true;

    function trim18(str) {
        var len = str.length, ws = whiteSpace, i = 0;
        while (ws[str.charCodeAt(--len)]);
        if (++len){
            while (ws[str.charCodeAt(i)]){ ++i; }
        }
        return str.substring(i, len);
    }

    return trim18;
})();
</pre>
<hr />
<div><sup><a name="jsarray-init-note-1" href="#jsarray-init-ref-1">1</a></sup> It&#8217;s also relevant that <em>every</em> JavaScript object needs a hash table to manage properties. Hence sparse arrays can be implemented easily by using this hash table for the same purpose. (Considering the design of JavaScript&#8217;s <code>for...in</code> loop statement, it looks as if the language designers intended this.)</div>
<div><sup><a name="jsarray-init-note-2" href="#jsarray-init-ref-2">2</a></sup> That is, you would need to initialize the array if you otherwise won&#8217;t be populating all of its elements, or if you won&#8217;t be populating them in strict order starting from 0. On the other hand, if your script would normally add data to the array starting from index 0 and working up from there, leaving no gaps, then you&#8217;re already squared away with IE8&#8217;s heuristics.</div>
]]></content:encoded>
			<wfw:commentRss>http://www.outofwhatbox.com/blog/2009/11/javascript-array-performance-initialize-to-optimize/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Trimming trim via razing arrays (JavaScript)</title>
		<link>http://www.outofwhatbox.com/blog/2009/11/trimming-trim-via-razing-arrays-javascript/</link>
		<comments>http://www.outofwhatbox.com/blog/2009/11/trimming-trim-via-razing-arrays-javascript/#comments</comments>
		<pubDate>Thu, 05 Nov 2009 01:08:53 +0000</pubDate>
		<dc:creator>Dan Breslau</dc:creator>
				<category><![CDATA[JavaScript]]></category>
		<category><![CDATA[Software Development]]></category>
		<category><![CDATA[Web]]></category>
		<category><![CDATA[JavaScript arrays]]></category>
		<category><![CDATA[JavaScript performance]]></category>

		<guid isPermaLink="false">http://www.outofwhatbox.com/blog/?p=576</guid>
		<description><![CDATA[It may well be that for better JavaScript performance, large arrays should be avoided where possible. That seems fairly clear in this case, where an already fast implementation of the missing String.trim() method was made even faster by using much smaller lookup table arrays.]]></description>
			<content:encoded><![CDATA[<p>Back in 2007, Steve Levithan <a href="http://blog.stevenlevithan.com/archives/faster-trim-javascript">compared the speed</a> of different implementations for the missing JavaScript <code>String.trim()</code> function. Steve&#8217;s blog post has launched The Comment Thread That Will Not Die, as a number of folks have been tempted to try their hand at writing their own implementation.</p>
<p>Count me in.</p>
<p>It started when <a href="http://yesudeep.wordpress.com/2009/07/31/even-faster-string-prototype-trim-implementation-in-javascript/">Yesudeep Mangalapilly&#8217;s version</a> caught my attention. Yesudeep, working with an idea from <a href="http://blog.stevenlevithan.com/archives/faster-trim-javascript#comment-25052">Michael Lee Finney</a>, had a fast implementation that didn&#8217;t use regular expressions. Instead of regexps, Yesudeep&#8217;s and Michael&#8217;s versions scan the string one character at a time, from the front and back ends, checking each character against a lookup table to determine if it&#8217;s whitespace.</p>
<p>However: The largest Unicode code point that&#8217;s counted as whitespace is <a href="http://unicode.org/charts/PDF/U3000.pdf">U+3000 <em>(pdf)</em></a> (12288 in decimal), the <a href="http://en.wikipedia.org/wiki/Space_%28punctuation%29">Ideographic Space</a> character. Hence, the lookup table array in Michael&#8217;s and Yesudeep&#8217;s implementations has a length of 12289, with most entries undefined. That&#8217;s a pretty large array, and a pretty sparse one.</p>
<p>Even though these were already among the fastest of the <code>trim</code>s, I wondered if using a large array as a lookup table might carry any performance penalty. My concern stemmed from the fact that <a href="https://developer.mozilla.org/en/Core_JavaScript_1.5_Reference/Objects/Array#Increasing_the_array_length_indirectly">JavaScript arrays grow dynamically</a>, adjusting in size to hold the highest index assigned into them. This poses a challenge to the interpreter: If it always places an array in a linear block of memory (as in C++), then accommodating array growth is likely to be a problem. So, to allow for reasonable performance as the array grows, the interpreter might not use a linear storage model for arrays. Non-linear models (trees or linked lists, for example) may make random access to the array slower, but would allow for reasonable performance when growing the array, while exhibiting reasonable memory consumption<sup><a name="trim-ref-mem" href="#trim-note-mem">1</a></sup>.</p>
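<p>The dynamic growth described above is easy to see in a short sketch (variable names are illustrative): a single assignment at index U+3000 gives the array a length of 12289, even though only one element has a value.</p>

```javascript
var table = [];
table[0x3000] = true;             // one assignment at index 12288

var length = table.length;        // the array grew to hold that index

// Count the elements that actually have a value.
var defined = 0;
for (var key in table) {
    if (Object.prototype.hasOwnProperty.call(table, key)) {
        defined++;
    }
}

console.log(length, defined);     // 12289 1
```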
<p>Through testing in three popular browsers, I found reason to be concerned about large arrays. I profiled the <code>trim17</code> method (Yesudeep&#8217;s implementation), using input strings that contained only spaces and tabs for whitespace. After this profiling run, <a name="ReducedArraySize">I trimmed <code>trim17</code>&#8217;s lookup table</a> (hacked it, really) by removing all entries above U+0020, and so limiting it to recognizing only ASCII whitespace chars. Then I profiled it again.</p>
<p>The table below shows the milliseconds spent within the two versions of <code>trim17</code>; the difference between their runtimes is most likely due to the change in size of the lookup table array.</p>
<table style="height: 77px;" border="0" cellspacing="0" cellpadding="0" width="517">
<tbody>
<tr class="oowbfirstRow">
<th></th>
<th style="text-align: right;" ><strong>Original <code>trim17</code></strong></th>
<th style="text-align: right;" ><strong>Reduced <code>trim17</code></strong></th>
<th style="text-align: right;" ><strong>% Saved</strong></th>
</tr>
<tr>
<td><strong>Internet Explorer 8</strong></td>
<td>
<p align="right">27,328</p>
</td>
<td>
<p align="right">20,281</p>
</td>
<td>
<p align="right">26</p>
</td>
</tr>
<tr>
<td><strong>Firefox 3.5</strong></td>
<td>
<p align="right">3,689</p>
</td>
<td>
<p align="right">2,978</p>
</td>
<td>
<p align="right">20</p>
</td>
</tr>
<tr>
<td><strong>Chrome 3.0</strong></td>
<td>
<p align="right">610</p>
</td>
<td>
<p align="right">191</p>
</td>
<td>
<p align="right">69</p>
</td>
</tr>
</tbody>
</table>
<p>These results confirm that, in JavaScript, accessing larger arrays can be slower than accessing smaller arrays.</p>
<hr style="margin:1em 0"/>
<p>But, perhaps applying these results to all arrays is an overgeneralization. Ideally, at least, it should be possible for an interpreter to recognize a &#8220;read-only&#8221; array, and use a more efficient layout for it. That is, if the interpreter can verify that an array isn&#8217;t modified after its initial construction, then perhaps it can safely flatten out the array into a linear block of memory.</p>
<p>The array in <code>trim17</code> was constructed as a property of the <code>String</code> prototype. An array couldn&#8217;t be more modifiable than that, and so I wouldn&#8217;t expect it to be flattened by the interpreter. But suppose the array were accessible only from within a single function (a <a href="https://developer.mozilla.org/en/Core_JavaScript_1.5_Guide/Working_with_Closures">closure</a>). In that case, if that single function isn&#8217;t modifying the array, we know that nothing will. Depending on the JavaScript interpreter, that might allow for better performance.</p>
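<p>The difference between the two placements can be sketched as follows. Here a plain object named <code>exposed</code> stands in for a property on the <code>String</code> prototype, and both function names are hypothetical.</p>

```javascript
// Exposed table: any code that can reach `exposed` can change it.
// (`exposed` stands in for a property hung on String.prototype.)
var exposed = { whiteSpace: [] };
exposed.whiteSpace[0x20] = true;

function isSpaceExposed(code) {
    return exposed.whiteSpace[code] === true;
}

// Closure-private table: only the returned function can see it.
var isSpacePrivate = (function () {
    var whiteSpace = [];
    whiteSpace[0x20] = true;

    return function (code) {
        return whiteSpace[code] === true;
    };
})();

// Unrelated code can break the exposed table...
exposed.whiteSpace[0x20] = false;

// ...but it has no way to reach the private one.
console.log(isSpaceExposed(0x20), isSpacePrivate(0x20));   // false true
```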
<p>I changed the code accordingly, but the actual improvement in speed was&#8230; unremarkable. Nonexistent, even. It may have eked out around a 5% gain in some tests, but there&#8217;s enough noise in the measurements that it&#8217;s hard to be sure. Still, I dislike globals (and, especially, modifiable globals), so I decided to stick with the closure. (Besides, maybe someday, somewhere, an optimizing interpreter will know how to make use of it.)</p>
<p>The next approach was to try using a smaller array. Or, rather, two smaller arrays: One to represent the whitespace characters at and below U+00A0, and another to represent the whitespace characters between U+2000 and U+205f. To keep the second array small, its indices are offset by <code>-0x2000</code>; this gives it a size of <code>0x0060</code> (96 decimal) entries. (There are three whitespace values that aren&#8217;t in either array: U+1680, U+180e, and U+3000. The new code checks for these explicitly.)</p>
<p>Even with smaller arrays, I found that random access into them is still slower than making a few comparisons on scalar variables. Hence the code is written so that, for any character value, it consults no more than one of the two lookup tables, and then only if the character is in a reasonable range for that table.</p>
<p>Here&#8217;s the new method:</p>
<pre class="brush: jscript;">
var trimOOWB = (function() {

    // First table: whitespace code points at or below U+00A0.
    var whiteSpace = new Array(0x00a0 + 1);
    whiteSpace[0x0009] = true;    whiteSpace[0x000a] = true;
    whiteSpace[0x000b] = true;    whiteSpace[0x000c] = true;
    whiteSpace[0x000d] = true;    whiteSpace[0x0020] = true;
    whiteSpace[0x0085] = true;    whiteSpace[0x00a0] = true;

    // Second table: whitespace from U+2000 through U+205F,
    // with indices offset by base to keep the array small.
    var whiteSpace2 = new Array(0x005f + 1);
    var base = 0x2000;
    whiteSpace2[0x2000 - base] = true;  whiteSpace2[0x2001 - base] = true;
    whiteSpace2[0x2002 - base] = true;  whiteSpace2[0x2003 - base] = true;
    whiteSpace2[0x2004 - base] = true;  whiteSpace2[0x2005 - base] = true;
    whiteSpace2[0x2006 - base] = true;  whiteSpace2[0x2007 - base] = true;
    whiteSpace2[0x2008 - base] = true;  whiteSpace2[0x2009 - base] = true;
    whiteSpace2[0x200a - base] = true;  whiteSpace2[0x200b - base] = true;
    whiteSpace2[0x2028 - base] = true;  whiteSpace2[0x2029 - base] = true;
    whiteSpace2[0x202f - base] = true;  whiteSpace2[0x205f - base] = true;

    function trimOOWB2(str) {
        var ws = whiteSpace, ws2 = whiteSpace2;
        var i=0, len=str.length, ch;

        // Scan backward past trailing whitespace. charCodeAt returns NaN
        // for an out-of-range index, which ends the loop.
        while ((ch = str.charCodeAt(--len)) &amp;&amp;
               (ch &lt;= 0x00A0 ? ws[ch] :
                (ch &gt;= 0x2000 ? (ch===0x3000 || ws2[ch - 0x2000])
                 : (ch===0x1680 || ch===0x180e ))))
            ;

        // If anything is left, scan forward past leading whitespace.
        if (++len) {
            while ((ch = str.charCodeAt(i)) &amp;&amp;
                   (ch &lt;= 0x00A0 ? ws[ch] :
                    (ch &gt;= 0x2000 ? (ch===0x3000 || ws2[ch - 0x2000])
                     : (ch===0x1680 || ch===0x180e )))) {
                ++i;
            }
        }
        return str.substring(i, len);
    }

    return trimOOWB2;
})();
</pre>
<p>I benchmarked the new function (<code>trimOOWB</code>) and Yesudeep&#8217;s <code>trim17</code> function, using two sets of input strings. In the first test, only legacy ASCII whitespace characters (e.g., spaces and tabs) were used, as in the test above. The second test data set used the full set of whitespace in the Unicode character set. The numbers shown are milliseconds spent within the trim functions; lower is better.</p>
<table style="" border="0" cellspacing="0" cellpadding="2" >
<tbody>
<tr class="oowbfirstRow">
<th style="text-align: right;"></th>
<th style="text-align: right;" ><strong><code>trim17 (ASCII)</code></strong></th>
<th style="text-align: right;" ><strong><code>trimOOWB (ASCII)</code></strong></th>
<th style="text-align: right;" ><strong><code>% saved</code></strong></th>
<th style="text-align: right;" ><strong><code>trim17 (Unicode)</code></strong></th>
<th style="text-align: right;" ><strong><code>trimOOWB (Unicode)</code></strong></th>
<th style="text-align: right;" ><strong><code>% saved</code></strong></th>
</tr>
<tr>
<td><strong>Internet Explorer 8</strong></td>
<td style="text-align: right;">25,953</td>
<td style="text-align: right;">22,359</td>
<td style="text-align: right;">14</td>
<td style="text-align: right;" >27,469</td>
<td style="text-align: right;" >26,500</td>
<td style="text-align: right;" >4</td>
</tr>
<tr>
<td><strong>Firefox 3.5</strong></td>
<td style="text-align: right;">3,706</td>
<td style="text-align: right;">3,089</td>
<td style="text-align: right;">17</td>
<td style="text-align: right;" >3,831</td>
<td style="text-align: right;" >3,439</td>
<td style="text-align: right;" >10</td>
</tr>
<tr>
<td><strong>Google Chrome 3.0</strong></td>
<td style="text-align: right;">604</td>
<td style="text-align: right;">139</td>
<td style="text-align: right;">77</td>
<td style="text-align: right;" >664</td>
<td style="text-align: right;" >166</td>
<td style="text-align: right;" >75</td>
</tr>
</tbody>
</table>
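<p>The &#8220;% saved&#8221; columns are consistent with the usual relative-savings formula, rounded to the nearest whole number. A small sketch, using the ASCII figures from the table above (the function name is illustrative, and the rounding is an assumption):</p>

```javascript
// Percentage of time saved relative to the baseline, rounded to a whole number.
function percentSaved(baseMs, newMs) {
    return Math.round((baseMs - newMs) / baseMs * 100);
}

var ieAscii = percentSaved(25953, 22359);       // 14
var firefoxAscii = percentSaved(3706, 3089);    // 17
var chromeAscii = percentSaved(604, 139);       // 77
```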
<h3>A final thought</h3>
<p>It&#8217;s interesting that there&#8217;s an inverse relationship between the base time for the browser to execute <code>trim17</code> and the percentage of time saved by <code>trimOOWB</code>. In other words, the <em>percentage</em> of performance boost from using smaller arrays increases as the browser&#8217;s speed increases: IE showed the highest time and lowest gain, followed by Firefox on both counts, and finishing with Chrome, which had the lowest base time and the highest percentage gain.</p>
<p>I&#8217;m guessing here, but I think the easiest way to explain this is that all three JavaScript interpreters are using roughly equivalent strategies for managing arrays. The percentages gained are different because the same <em>absolute</em> time savings in Chrome results in a higher <em>relative</em> performance boost when compared to IE or Firefox.</p>
<p>That raises a question, though: Is it possible that the structure of <code>trimOOWB</code> gives it any performance advantage over <code>trim17</code> <em>aside from</em> the savings generated through reducing the array size? I&#8217;ve looked for such artifacts in the tests. In short, and skipping the details for now, I think it&#8217;s likely that such artifacts couldn&#8217;t account for more than one quarter of the overall speed boost; it&#8217;s probably much less than that. It&#8217;s at least as likely that the overhead added by <code>trimOOWB</code> is <em>obscuring</em> part of the overall performance boost.</p>
<h3>Another final thought</h3>
<p>The multiple comparisons made in <code>trimOOWB</code> might raise the question: Why not try using a <code>switch</code> statement instead of all those conditionals? Well, I <em>did</em> try this, with mixed results. On one hand, Firefox showed a significant speedup, around 25%. On the other hand, IE may have been a little slower, and Chrome was more than twice as slow. (Besides, it was a <code>switch</code> statement. We&#8217;re looking for speed, but a fellow&#8217;s got to have <em>some</em> standards.)</p>
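<p>For the curious, a <code>switch</code>-based variant might look something like the sketch below. This is not the code that was benchmarked; the names <code>isWhiteSpaceSwitch</code> and <code>trimSwitch</code> are hypothetical, and the case list simply mirrors the code points in the lookup tables above.</p>

```javascript
// Hypothetical switch-based whitespace test; not the benchmarked code.
function isWhiteSpaceSwitch(ch) {
    switch (ch) {
        case 0x0009: case 0x000a: case 0x000b: case 0x000c:
        case 0x000d: case 0x0020: case 0x0085: case 0x00a0:
        case 0x1680: case 0x180e:
        case 0x2000: case 0x2001: case 0x2002: case 0x2003:
        case 0x2004: case 0x2005: case 0x2006: case 0x2007:
        case 0x2008: case 0x2009: case 0x200a: case 0x200b:
        case 0x2028: case 0x2029: case 0x202f: case 0x205f:
        case 0x3000:
            return true;
        default:
            return false;
    }
}

function trimSwitch(str) {
    var i = 0, len = str.length;
    // Walk in from the end, then from the start.
    while (len > 0 && isWhiteSpaceSwitch(str.charCodeAt(len - 1))) { len--; }
    while (i < len && isWhiteSpaceSwitch(str.charCodeAt(i))) { i++; }
    return str.substring(i, len);
}

var trimmed = trimSwitch(' \t a b \u3000');
```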
<h3>A final final thought</h3>
<p>Using a closure offers another advantage: It&#8217;s simple to redirect calls to <code>trim</code> to the native <code>String.trim()</code> function, if the browser supports it. All that&#8217;s required is to change the <code>return</code> in the outer (anonymous) function from this:</p>
<pre class="brush: jscript;">
    return trimOOWB2;
</pre>
<p>to:</p>
<pre class="brush: jscript;">
    return String.trim || trimOOWB2;
</pre>
<p><a href="http://ejohn.org/blog/ecmascript-5-strict-mode-json-and-more/">ECMAScript 5.0 is slated</a> to include a native <code>trim</code> (specified there as the method <code>String.prototype.trim</code>), so it&#8217;s probably worth thinking ahead for this. In Firefox 3.5, the only current browser that I know of that supports <code>String.trim</code>, calls to the native <code>String.trim</code> run in a fraction of the time of any of the JavaScript implementations.</p>
<div class="oowbbtw"><strong>Note</strong>: Firefox&#8217;s native implementation of <code>String.trim</code> does not count U+1680 or U+180E as whitespace. It does treat U+3000 as whitespace.</div>
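<p>The redirect above relies on a native function that takes the string as an argument. ECMAScript 5 specifies <code>trim</code> as a method on <code>String.prototype</code>, so a more portable version of the same idea would wrap that method to keep the call shape. A minimal sketch, in which <code>fallbackTrim</code> is a simplified stand-in for <code>trimOOWB2</code>:</p>

```javascript
var trim = (function () {
    // Simplified stand-in for trimOOWB2 so the sketch is self-contained;
    // it handles only the whitespace that \s matches.
    function fallbackTrim(str) {
        return str.replace(/^\s+/, '').replace(/\s+$/, '');
    }

    if (typeof String.prototype.trim === 'function') {
        // Wrap the native method so callers still pass the string as an argument.
        return function (str) {
            return String.prototype.trim.call(str);
        };
    }
    return fallbackTrim;
})();

var result = trim('  hello world \t');
```

<p>Given the note above about U+1680 and U+180E, a native implementation and a hand-written one may not agree on every character, so it&#8217;s worth testing with the whitespace that matters to you.</p>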
<hr />
<div><sup><a name="trim-note-mem" href="#trim-ref-mem">1</a></sup>Yes, I wrote &#8220;JavaScript&#8221; and &#8220;reasonable memory consumption&#8221; in the same article. Go ahead and snicker.</div>
<p />
]]></content:encoded>
			<wfw:commentRss>http://www.outofwhatbox.com/blog/2009/11/trimming-trim-via-razing-arrays-javascript/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Working Consciously In The Dish Room</title>
		<link>http://www.outofwhatbox.com/blog/2009/09/working-consciously-in-the-dish-room/</link>
		<comments>http://www.outofwhatbox.com/blog/2009/09/working-consciously-in-the-dish-room/#comments</comments>
		<pubDate>Mon, 21 Sep 2009 16:41:34 +0000</pubDate>
		<dc:creator>Dan Breslau</dc:creator>
				<category><![CDATA[Life]]></category>
		<category><![CDATA[working consciously]]></category>

		<guid isPermaLink="false">http://www.outofwhatbox.com/blog/?p=565</guid>
		<description><![CDATA[To the naive observer, washing dishes doesn't look like a very glamorous occupation. A closer inspection reveals just how very correct this naive observation is. But even a dish room can offer interesting challenges, if you remember to look for them.]]></description>
			<content:encoded><![CDATA[<p><i>Rack. Rinse. Wash. Stack.</i></p>
<p>I recently spent some time working as a dishwasher at a camp. No, I wasn&#8217;t sampling a possible new career. Truth be told, I was volunteering, and I enjoyed the work.</p>
<p>If that raised the brow over your mind&#8217;s eye, you&#8217;re not alone. Telling folks that I enjoyed this work raised any number of brows. To the naive observer, washing dishes doesn&#8217;t look like a very glamorous occupation. A closer inspection reveals just how very correct this naive observation is. Yet I drew myself into the task through <i>working consciously</i>—a term that I previously used in <a href="http://www.outofwhatbox.com/blog/2009/04/how_to_make_mistakes/">How To Make Mistakes</a>, but never fully defined.</p>
<p>Working consciously makes mindful work of the mindless. It involves constantly challenging oneself to be more effective. It requires continuous awareness of the work, an ongoing search for better approaches to the task, and the willingness to try different methods and compare the results. It means, in short, attaining satisfaction by not being satisfied.</p>
<p>But I still haven&#8217;t defined it very well. Perhaps this example will help.</p>
<p><i>Rack. Rinse. Wash. Stack.</i></p>
<p>At its most basic, the dishwashing process comprises four tasks:</p>
<ul>
<li>Dirty dishes are placed into racks.</li>
<li>The racks are sprayed, removing any excess bits of food.</li>
<li>The racks are run through an industrial-style dishwashing machine (the Hobart).</li>
<li>After coming out of the Hobart, the newly clean dishes are stacked on their shelves.</li>
</ul>
<p>When dirty dishes are coming at a fast and furious pace, we need to keep the pipeline flowing, or we&#8217;ll get overrun with dirty dishes. <i>Rack. Rinse. Wash. Stack.</i> The racks with pegs are for plates and bowls; there&#8217;s another kind for beverage cups; flats are for utensils and various other implements of mass consumption.</p>
<p>Loading each rack to its capacity allows for fewer runs of the Hobart, saving water and energy. But fully loading the racks requires time and attention; too much of either creates a bottleneck. So there&#8217;s a bit of a Tetris-like challenge in loading the racks, except that the game doesn&#8217;t end just because you&#8217;re too stacked up.</p>
<p>The Hobart cycle takes two minutes. If the racks aren&#8217;t packed well, then the queue of racks going into the Hobart gets stretched out, making the Hobart another potential bottleneck. Yet the Hobart runs for <i>only</i> two minutes, not enough time to thoroughly scour dirty dishes. Hence we pre-rinse so that the dishes don&#8217;t come out of the Hobart needing another run through it (or maybe two). However, more thorough rinsing may require more time and more hot water. And sometimes a very quick scrub works better than a rinse, saving water—but scrubbing can be slower than rinsing.</p>
<p>Working consciously helped me to recognize these constraints; it also helped me consider and test various ways to manage them. As one example, I tried varying the orientation of the dishes in the racks; I saw that an edge-on angle let me rinse them more effectively. In the same vein, I sought other ways to minimize how much water I used when rinsing, and to recognize when the better tool was the scrub sponge and not the sprayer.</p>
<p>Through working consciously, I turned what could have been dull work into a win for all: Our diners had clean dishes; the task was done more efficiently; and I gained the satisfaction of seeing my work improving (along with my Tetris game). Plus, I turned a dishwashing gig into a blog post.</p>
<p>Even a dish room can offer interesting challenges, if you remember to look for them.</p>
<p><i>Rack. Rinse. Wash. Stack.</i></p>
]]></content:encoded>
			<wfw:commentRss>http://www.outofwhatbox.com/blog/2009/09/working-consciously-in-the-dish-room/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Filling The Cracks</title>
		<link>http://www.outofwhatbox.com/blog/2009/07/filling-the-cracks/</link>
		<comments>http://www.outofwhatbox.com/blog/2009/07/filling-the-cracks/#comments</comments>
		<pubDate>Wed, 22 Jul 2009 22:19:37 +0000</pubDate>
		<dc:creator>Dan Breslau</dc:creator>
				<category><![CDATA[Web]]></category>
		<category><![CDATA[CSS]]></category>
		<category><![CDATA[HTML]]></category>
		<category><![CDATA[JavaScript]]></category>
		<category><![CDATA[syntaxhighlighter]]></category>

		<guid isPermaLink="false">http://www.outofwhatbox.com/blog/?p=515</guid>
		<description><![CDATA[Sometimes the best way to solve a problem is to make it no longer <em>be</em> a problem. This was the case with a problem that I'd created -- and have now fixed -- in my version of SyntaxHighlighter.]]></description>
			<content:encoded><![CDATA[<div class="oowbbtw">The <strong>latest</strong> posted version of SyntaxHighlighter is <a href="http://static.outofwhatbox.com/downloads/SyntaxHighlighter/syntaxhighlighter.zip">here</a>. For the change history, or to download older versions from this site, see the <a href="http://www.outofwhatbox.com/blog/syntaxhighlighter-downloads/">downloads page</a>.</div>
<div class="alignleft" style="width: 190px"><img alt="Mind The Gap!" src="http://img.outofwhatbox.com/shGutter/gap.jpg" title="Photo from http://www.flickr.com/photos/jvk/387951549/" width="180" height="135" /></div>
<p><a href="http://alexgorbatchev.com/wiki/SyntaxHighlighter">SyntaxHighlighter</a> can display line numbers on the left side of the code block, in an area called the <em>gutter</em>. As a result of other changes that I&#8217;d made, the gutter sometimes showed gaps between the line number elements. I added some JavaScript to resize the number elements so that they&#8217;d be contiguous, closing any gaps. However, that code needs the <code>offsetHeight</code> for each line in the code block. As I then discovered, when JavaScript so much as <strong>reads</strong> the <code>offsetHeight</code> of an element, it&#8217;s likely that the browser will need to <a href="http://www.stubbornella.org/content/2009/03/27/reflows-repaints-css-performance-making-your-javascript-slow/">reflow the page</a> to calculate the value. Repeat that for enough lines, and you&#8217;ve got a bit of a performance issue (or I do, depending on how you look at it.)</p>
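<p>To illustrate why the reads are the expensive part, here&#8217;s a rough sketch (not the code from my SyntaxHighlighter changes) of one common way to soften the cost: read every <code>offsetHeight</code> first, then do all of the writes. With no style changes between the reads, the browser can typically satisfy them from a single layout pass. The function name and its two parameters are assumptions made for the example, and it relies on the ES5 array methods <code>map</code> and <code>forEach</code>.</p>
<pre class="brush: jscript;">
// Sketch only: lineEls and numberEls are assumed to be parallel arrays of
// DOM elements (the code lines and their gutter number elements).
function syncGutterHeights(lineEls, numberEls) {
    // Phase 1: read all of the heights before changing anything.
    var heights = lineEls.map(function (el) {
        return el.offsetHeight;
    });
    // Phase 2: write the heights; no layout reads happen in this loop.
    numberEls.forEach(function (el, i) {
        el.style.height = heights[i] + 'px';
    });
    return heights;
}
</pre>
<p>Interleaving the two phases, reading one line and then resizing one number element, is the pattern most likely to trigger a reflow per line.</p>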
<p>At one point, I&#8217;d thought that I could bypass the height adjustment for the most common cases (e.g., text lines that are short enough not to wrap.) But life and HTML come with few guarantees; testing showed that some browsers, in some environments, would still produce some 1-pixel gaps like so:</p>
<div id="attachment_179" class="wp-caption alignnone" style="width: 483px"><img class="size-full wp-image-179" title="BrokenGutter" src="http://img.outofwhatbox.com/shGutter/BrokenGutter.png" alt="Gutter, on the left, showing unwanted white lines." width="473" height="107" /><p class="wp-caption-text">Gutter, left, showing unwanted white lines.  <em>(The text on the right is blurred; do not adjust your eyes.)</em></p></div>
<p>(The gaps appeared below lines that had keywords highlighted in boldface fonts.  Depending on the fonts that the browser is using, the boldface can sometimes add a pixel to the height of the text line.)</p>
<p>Sometimes the best way to solve a problem is to make it no longer <em>be</em> a problem. I&#8217;d caused the gaps to appear by changing how backgrounds are rendered. In the original SyntaxHighlighter, the number elements are transparent; the gutter&#8217;s background is the background of the top-level <code>&lt;div&gt;</code> for the entire code block. There may be gaps between the number elements, but those gaps <em>can&#8217;t be seen</em>. I couldn&#8217;t easily go back to that approach, but thinking about it inspired a new one: By stretching the first gutter element vertically, it could be used to provide the background for the gutter. The remaining gutter elements would then be laid out over that first one. Here&#8217;s an illustration:</p>
<div id="attachment_179" class="wp-caption alignnone" style="width: 483px"><img class="size-full wp-image-179" title="BrokenGutter" src="http://img.outofwhatbox.com/shGutter/GutterHow.png" alt="The first number element fills the gutter vertically; the remaining elements are placed on top of it, leaving no visible gaps." width="473" height="107" /><p class="wp-caption-text">The first number element fills the gutter vertically; the remaining elements are placed on top of it, leaving no <em>visible</em> gaps.</p></div>
<p>The gaps are invisible once more.  The associated JavaScript still causes some reflow, but no more than five times per code block, a much smaller overhead in most cases.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.outofwhatbox.com/blog/2009/07/filling-the-cracks/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Dumping Hack Flash, It&#8217;s a Class, Class, Class</title>
		<link>http://www.outofwhatbox.com/blog/2009/07/dumping-hack-flash-its-a-class-class-class/</link>
		<comments>http://www.outofwhatbox.com/blog/2009/07/dumping-hack-flash-its-a-class-class-class/#comments</comments>
		<pubDate>Fri, 10 Jul 2009 23:37:49 +0000</pubDate>
		<dc:creator>Dan Breslau</dc:creator>
				<category><![CDATA[Software Development]]></category>
		<category><![CDATA[Web]]></category>
		<category><![CDATA[CSS]]></category>
		<category><![CDATA[HTML]]></category>
		<category><![CDATA[JavaScript]]></category>
		<category><![CDATA[syntaxhighlighter]]></category>

		<guid isPermaLink="false">http://www.outofwhatbox.com/blog/?p=499</guid>
		<description><![CDATA[SyntaxHighlighter now preserves whitespace. Kind of. It's not pure WYSIWYG, but it's YCAGWYWBIYTSYJMFYGWYN. It's good enough that we can stop using Flash to access the clipboard.]]></description>
			<content:encoded><![CDATA[<div class="oowbbtw">The <strong>latest</strong> posted version of SyntaxHighlighter is <a href="http://static.outofwhatbox.com/downloads/SyntaxHighlighter/syntaxhighlighter.zip">here</a>. For the change history, or to download older versions from this site, see the <a href="http://www.outofwhatbox.com/blog/syntaxhighlighter-downloads/">downloads page</a>.</div>
<p>Well, that was&#8230; interesting.</p>
<p>I&#8217;ve been working on improvements in <a href="http://alexgorbatchev.com/wiki/SyntaxHighlighter">SyntaxHighlighter</a>&#8217;s whitespace handling. My goal was to have whitespace presented exactly as it&#8217;s supplied within the original HTML. I thought this would be helpful on two fronts: First, it might fix a few minor display formatting issues that I&#8217;d found. Second, and of greater concern to me, I figured that if SyntaxHighlighter preserved whitespace well enough, then it would be OK to remove the Flash-based &#8220;Copy&#8221; applet from the toolbar. (I&#8217;m not sure, but my guess is that the Flash app was added <em>because</em> copying the code to the clipboard has been so problematic.)</p>
<p>I&#8217;ll admit that its removal is a mixed blessing: It&#8217;s not like it&#8217;s a bad thing, to be able to copy the code to the clipboard with one click of the mouse. But the Flash applet carries some baggage, too: It requires Flash (well, yeah); it adds to the overall footprint consumed by SyntaxHighlighter (especially when using Internet Explorer); and it doesn&#8217;t always work.</p>
<p>I don&#8217;t have a lot of details on that last one; then again, I haven&#8217;t sought details. But the problem could simply be that <a href="http://www.adobe.com/devnet/flashplayer/articles/fplayer10_uia_requirements.html">Flash 10 added new restrictions</a> on the application&#8217;s access to the Clipboard, thereby breaking it for SyntaxHighlighter.</p>
<p>Maybe, maybe not. But if the traditional select-and-copy operation worked as expected, without loss of newlines or other whitespace, would there really be any further <strong>need</strong> to keep the Flash script? I didn&#8217;t think so either.</p>
<p>I think I can hear some of you snickering. No, I hadn&#8217;t ever gone down this road before. So, I&#8217;ve discovered that perfect whitespace preservation in HTML is not easy. It may not even be <strong>possible</strong>, even setting tab characters aside.</p>
<p>I&#8217;d hoped to come up with a broadly cross-browser-compatible scheme for preserving whitespace in the display and in the paste buffer. What I have is a combination of JavaScript and CSS classes that fake it pretty well (I think). The key, of course, lies in using the CSS <code>white-space: pre</code> and <code>white-space: pre-wrap</code> declarations in the appropriate classes. JavaScript is needed, among other things, to ensure that line breaks come through correctly. And yes, I did have to break into browser-specific code at times.</p>
<p>Surprising things can happen to space characters &#8217;twixt the HTTP stack and the paste buffer. Throw in newlines and tabs <em>(On second thought: no, don&#8217;t throw in tabs; <a href="http://www.w3.org/TR/html401/struct/text.html#idx-text-1">throw them out</a>)</em>, and life gets very interesting indeed.</p>
<p>While the resulting release doesn&#8217;t quite fit the criteria for being <a href="http://en.wikipedia.org/wiki/WYSIWYG">WYSIWYG</a>, I&#8217;d say it does at least satisfy <a href="http://en.wikipedia.org/wiki/You_can%27t_always_get_what_you_want">YCAGWYWBIYTSYJMFYGWYN</a>. Tab characters are converted to spaces, and there are often one or two spaces added at the ends of lines. I&#8217;m guessing that this will be close enough for most uses. For those cases where exactness is required, there&#8217;s still the toolbar command for viewing the original source as plain text. The popped-up text preserves the original whitespace, and a &#8220;copy&#8221; works just fine there. Putting this all together, it did seem to me that the Flash app could now be dropped, and so I&#8217;ve done so.</p>
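<p>As an illustration of the tab conversion, here&#8217;s a minimal sketch of expanding tabs to the next tab stop, rather than replacing each tab with a fixed number of spaces. It isn&#8217;t lifted from SyntaxHighlighter; the function name and the default tab size of 4 are assumptions, and it expects a single line of text with no newlines.</p>
<pre class="brush: jscript;">
// Expand each tab in one line of text to the next tab stop.
// tabSize is optional; the default of 4 is an assumption for this sketch.
function expandTabs(line, tabSize) {
    var size = tabSize || 4;
    var out = '';
    line.split('').forEach(function (ch) {
        if (ch === '\t') {
            // Pad with 1..size spaces so the output length reaches a tab stop.
            var pad = size - (out.length % size);
            out += new Array(pad + 1).join(' ');
        } else {
            out += ch;
        }
    });
    return out;
}
</pre>
<p>Padding to a tab stop keeps columns aligned the way they were in the original source, which a fixed-width replacement wouldn&#8217;t do for tabs that appear mid-line.</p>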
<p>I&#8217;ve also added some tentative support for titles. Unfortunately, I can&#8217;t show an example in this post, because the <a href="http://www.viper007bond.com/category/wordpress/my-wp-plugins/syntaxhighlighter/">WordPress Plugin</a> doesn&#8217;t support the option yet; but you can see an example <a href="http://static.outofwhatbox.com/shWhitespace/testTitle.html">here</a>. (I&#8217;d thought about adding this previously, then discarded it as creeping featurism, and then re-considered it when I realized that it might be of use for the visually disabled.)</p>
<p>And there are a lot of changes under the hood; I&#8217;ll write about these in separate posts. </p>
]]></content:encoded>
			<wfw:commentRss>http://www.outofwhatbox.com/blog/2009/07/dumping-hack-flash-its-a-class-class-class/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Occam&#8217;s Razor and Brushes</title>
		<link>http://www.outofwhatbox.com/blog/2009/06/occams-razor-and-brushes/</link>
		<comments>http://www.outofwhatbox.com/blog/2009/06/occams-razor-and-brushes/#comments</comments>
		<pubDate>Mon, 29 Jun 2009 17:22:12 +0000</pubDate>
		<dc:creator>Dan Breslau</dc:creator>
				<category><![CDATA[JavaScript]]></category>
		<category><![CDATA[Programming Languages]]></category>
		<category><![CDATA[Software Development]]></category>
		<category><![CDATA[syntaxhighlighter]]></category>

		<guid isPermaLink="false">http://www.outofwhatbox.com/blog/?p=486</guid>
		<description><![CDATA[In software engineering, Occam's Razor has a corollary: When entities are multiplying, find out how they became necessary. An issue in SyntaxHighlighter illustrates this principle.]]></description>
			<content:encoded><![CDATA[<div class="oowbbtw">There&#8217;s no new ZIP file for this post.  I have yet another set of changes in the works, which should be ready shortly. I&#8217;ll post a ZIP file then, with both sets of updates.</div>
<p><img class="oowbleft" src="http://img.outofwhatbox.com/OccamsBrush/razorbrush.png" alt="Razor and brush, from amazingshaving.com/Merchant2/graphics/00000001/MPS281A820.jpg" /><br />
Alex Gorbatchev&#8217;s <a href="http://alexgorbatchev.com/wiki/SyntaxHighlighter">SyntaxHighlighter</a> uses separate modules (<em>Brushes</em>) to process the syntax for individual programming languages. A web page that uses SyntaxHighlighter must load the main JavaScript file (shCore.js), and load the required brush files. After the brushes are loaded, the page sets any desired configuration parameters for the highlighter, then invokes <code>SyntaxHighlighter.all</code>, which performs the formatting and highlighting.</p>
<p>SyntaxHighlighter currently includes 21 brushes, each in its own JavaScript file. A web page author may wish to load only those brush files that are needed for a given page. <a href="http://davidchambersdesign.com/">David Chambers</a> addressed this by using JavaScript to request the needed brushes from the server.  The first time that I browsed his site, using Firefox, this worked well. But the second time, using Internet Explorer, an error message popped up in my browser window:</p>
<div class="oowbcenter"><img src="http://img.outofwhatbox.com/OccamsBrush/errorExample.png" alt="SyntaxHighlighter: Can't find brush for: plain" /></div>
<p>David&#8217;s script worked by extending the page&#8217;s <code>&lt;head&gt;</code> with a new <code>&lt;script&gt;</code> element to load the brushes. Firefox executed the new script as soon as it was added, but IE delayed execution of the new script until the first one was finished. That first script went on to call <code>SyntaxHighlighter.all()</code>. Because the new script hadn&#8217;t been run, the brushes hadn&#8217;t been loaded yet; this resulted in the error that I saw.</p>
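<p>To make the ordering problem concrete, here&#8217;s a rough sketch of script injection that waits for a completion callback before doing anything that depends on the new script. This is not David&#8217;s code, and I&#8217;m not claiming it matches his fix. The name <code>loadScript</code> and its parameters are invented for the example, and the <code>document</code> object is passed in only to keep the sketch self-contained.</p>
<pre class="brush: jscript;">
// Sketch only: append a script element to the head and call onReady(url)
// once the browser reports that the script has loaded.
function loadScript(doc, url, onReady) {
    var script = doc.createElement('script');
    var done = false;

    function finish() {
        if (done) { return; }   // make sure the callback runs only once
        done = true;
        onReady(url);
    }

    script.onload = finish;
    // Older versions of IE signal completion through readyState instead.
    script.onreadystatechange = function () {
        if (script.readyState === 'loaded' || script.readyState === 'complete') {
            finish();
        }
    };
    script.src = url;
    doc.getElementsByTagName('head')[0].appendChild(script);
    return script;
}
</pre>
<p>With something like this, the page would call <code>SyntaxHighlighter.all()</code> from inside the callback rather than immediately after adding the element, so it no longer matters when each browser chooses to run the injected script.</p>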
<p>David <a href="http://davidchambersdesign.com/prototype-loader-for-syntaxhighlighter/">resolved this problem</a> by modifying his scripts, but I got to wondering: Was there a way to remove that ordering dependency, so that the brushes didn&#8217;t <em>need</em> to be loaded first? I aimed to make this work by splitting the highlighting into two phases:</p>
<ul>
<li>When the web page calls <code>SyntaxHighlighter.all()</code>, SyntaxHighlighter formats any code whose brushes have already been loaded. This is like its previous design, but when it can&#8217;t find a brush, it no longer reports that failure as an error.</li>
<li>If a brush is loaded <em>after</em> the call to <code>SyntaxHighlighter.all()</code>, then SyntaxHighlighter makes another pass through the document, looking for any code that needs to be formatted using the newly-loaded brush.</li>
</ul>
<p>If SyntaxHighlighter can make use of each brush as it&#8217;s loaded, then there&#8217;s no need for the web page to synchronize the brush load operations. This would have simplified David&#8217;s task considerably. Unfortunately, SyntaxHighlighter had no way of <em>knowing</em> when a new brush had been loaded.</p>
<p>The structure of a SyntaxHighlighter brush, as <a href="http://alexgorbatchev.com/wiki/SyntaxHighlighter:Brushes:Custom">currently written</a>, corresponds to the boilerplate code shown below. (The <code>/*</code> <em>comments</em> <code>*/</code> indicate where the contents differ for each brush.)</p>
<pre class="brush: jscript;">
SyntaxHighlighter.brushes./*languageName*/ = function() {
    /* This function determines the styles for a particular syntax */
};
SyntaxHighlighter.brushes./*languageName*/.prototype =
    new SyntaxHighlighter.Highlighter();
SyntaxHighlighter.brushes./*languageName*/.aliases =
    [/* An array of alternate names for this language */ ];
</pre>
<p>Here&#8217;s the brush structure as I&#8217;ve redesigned it, with most of the boilerplate now encapsulated in a new registration method, <code>SyntaxHighlighter.registerBrush</code>:</p>
<pre class="brush: jscript;">
SyntaxHighlighter.registerBrush(
    &quot;/*languageName*/&quot;,
    [/* An array of alternate names for this language */ ],
    function()
    {
        /* This function determines the styles for a particular syntax */
    });
</pre>
<p>The new method, <code>SyntaxHighlighter.registerBrush</code>, attempts to apply the brush as part of the registration process, through an internal call to the method that performs the highlighting. This call returns immediately if the web page hasn&#8217;t yet invoked <code>SyntaxHighlighter.all()</code>, because it can&#8217;t assume that the configuration is ready. (Any pending highlighting will be performed when <code>SyntaxHighlighter.all()</code> is called.) Otherwise, the newly-registered brush is applied to any code blocks that require that brush.</p>
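<p>The two-phase behavior can be modeled in a few lines. What follows is a toy model, not SyntaxHighlighter&#8217;s actual code: <code>createHighlighter</code>, the <code>pending</code> list, and the idea of passing code blocks to <code>all</code> as plain objects with a <code>language</code> property are all assumptions made so that the sketch can run on its own. (The real <code>SyntaxHighlighter.all()</code> finds the code blocks in the document.)</p>
<pre class="brush: jscript;">
// Toy model of two-phase highlighting; names and shapes are hypothetical.
function createHighlighter() {
    var brushes = {};       // maps each name or alias to its brush function
    var pending = [];       // code blocks still waiting for a brush
    var allCalled = false;  // has the page asked for highlighting yet?

    function highlightPending() {
        pending = pending.filter(function (block) {
            if (!Object.prototype.hasOwnProperty.call(brushes, block.language)) {
                return true;    // no brush yet: keep waiting, not an error
            }
            block.highlighted = true;   // stand-in for the real formatting
            return false;
        });
    }

    return {
        registerBrush: function (name, aliases, brush) {
            [name].concat(aliases).forEach(function (alias) {
                brushes[alias] = brush;
            });
            // Before all() is called, the configuration may not be ready,
            // so registration alone doesn't trigger any highlighting.
            if (allCalled) { highlightPending(); }
        },
        all: function (blocks) {
            allCalled = true;
            pending = pending.concat(blocks);
            highlightPending();
        }
    };
}
</pre>
<p>In this model, a block whose brush arrives late simply stays in the pending list until <code>registerBrush</code> is called for it, which is the ordering independence that I was after.</p>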
<p>This encapsulation benefits everyone involved: The brush code now needs less knowledge of the highlighter&#8217;s internals, and assumes less responsibility for them. SyntaxHighlighter is given knowledge that it didn&#8217;t previously have: namely, it gets notified as brushes become available. And the web page doesn&#8217;t need any additional logic to orchestrate the brush loading.</p>
<p><a href="http://en.wikipedia.org/wiki/Occam%27s_Razor">Occam&#8217;s Razor</a> is often paraphrased as &#8220;The simplest explanation is best.&#8221; Ironically, that&#8217;s a little too simplistic. The original Latin can be translated into English as, <em>entities must not be multiplied beyond necessity.</em> In software engineering, Occam&#8217;s Razor has a corollary: <em>When entities are multiplying, find out how they became necessary.</em> Doing so may lead to a better simplicity.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.outofwhatbox.com/blog/2009/06/occams-razor-and-brushes/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>
