Update of Forked SyntaxHighlighter

As promised, I’ve updated this site’s fork of the SyntaxHighlighter software by Alex Gorbatchev. I’ve merged in many of the changes that Alex included in his 2.1.364 release, but with the following differences:

  • In 2.1.364, the ruler functionality was removed. It’s still included here.
  • 2.1.364 changed the way that wrapped lines are presented. I’ve kept the old presentation, but modified the icon that’s used to signify wrapping.
  • Internally, 2.1.364 uses a separate HTML table to represent each line of source. I’ve maintained the older pattern of simply using a separate <div> to represent each line.

Other new bits in this release:

  • I’ve added a “Hide” button to the toolbar. Code blocks can now be expanded/collapsed at the user’s whim. (As before, an author can still choose to present a block in the hidden/collapsed state.)
  • When both a horizontal scrollbar and a gutter (the line number column) are displayed, the scrollbar no longer extends under the gutter. (I think this is an improvement; as always, feedback is welcome.) Update: I’ve backed out this change; it was causing problems with IE7.

The ZIP file is available through the Download Page.

Do not adjust your eyes; it’s just a new theme.

I’ve just gone live with a different WordPress theme. So, if you’re feeling a bit lost, don’t panic. This is a new theme, iTech, which was just released last month. I’ve tailored the CSS, fixed a bug or two in the PHP, and even managed to add a new customization option. It’s been a good learning experience. Most importantly, I hope you’ll like the new look. Let me know what you think.

New version of SyntaxHighlighter coming

Some folks have asked me offline if I’m going to be merging Alex Gorbatchev’s 2.1.364 version of SyntaxHighlighter with the OOWB fork that I’ve been supporting here. The answer to that is yes; if all goes well, I should be posting it in the next few days. As for creating a public repository, perhaps on Bitbucket: Maybe. If you’d like to see this happen, please let me know. Actually, I’d like to hear from anyone who’s rolled out the OOWB fork: What site(s) are you using it on, and how is it working out? Please leave a comment, or send email.

Did I say “fork”? Uh, yeah. While 2.1.364 is solid, it doesn’t cover most of the improvements that I’ve made (check the Download Page for a summary). Ultimately, I’d like to see these changes rolled back into Alex’s version, but that doesn’t appear to be happening anytime soon. There could be good reasons for that; certainly my version looks larger, just for starters. (I think that the total footprint may be smaller, but that’s a tough call.) All I can really say is that we’ll see how it goes. I plan to keep making improvements to the software, at least for my own sake; but I regard Alex as the primary author and will continue to do so.

JavaScript Array Performance, And Why It Matters

My last post described how a JavaScript array had become a performance bottleneck. Here, I’ll delve further into how some JavaScript programs become array-bound, and how to break that bind when you need to.

Let’s start with this:

JavaScript array elements can be much slower than scalar variables

As the graph shows, JavaScript array performance ranges from OK (nearly equivalent to scalars) to awful. The reasons for poor performance vary, but most can be boiled down to this: The JavaScript interpreter is essentially left guessing about how an array will be used. And guess it does. As an array is constructed and initialized, the interpreter observes where data is stored into it. Certain patterns may nudge it to optimize for better speed, or for lower memory consumption. So, performance suffers when the interpreter guesses wrong.

I looked into performance of JavaScript arrays with three popular Windows browsers: Internet Explorer 8, Google Chrome 3.0, and Firefox 3.5. Broadly speaking, these all seem to look for two or three types of arrays:

  • Dense arrays: These provide faster access to individual elements of the array.
  • Sparse arrays: These are typically optimized for lower memory use.
  • Sized arrays: A variant of sparse arrays that is optimized for speed in certain cases.

The interpreters optimize for dense arrays if the array’s elements are initialized in a contiguous range, starting at index 0. This initialization can be done before there’s any data for the array, by using placeholders such as 0, null, or even undefined. (There are other ways to get this optimization; this is simply the most reliable approach.)
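
For instance, here’s one way to set up an array that the interpreters are likely to treat as dense (a minimal sketch; the size and placeholder value are arbitrary):

var size = 1000;
var lookup = new Array(size);

// Fill a contiguous range, starting at index 0, with placeholder values.
// This hints to the interpreter that the array should be kept dense.
for (var i = 0; i < size; i++) {
    lookup[i] = 0;
}

// Real data can now be written at arbitrary indices.
lookup[123] = 1;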

Passing the array’s size in the constructor may also bring better performance. In my testing, I saw this only with sparse arrays in IE8 and (perhaps) in Chrome. I’ll refer to these as sized arrays; see below for some additional notes.

If an array is created as neither a dense nor a sized array, it’s treated as a sparse array.
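
By way of contrast, here are rough sketches of the other two creation patterns (the sizes and indices are placeholders, not recommendations):

// A "sized" array: the constructor is told how many elements to expect,
// but nothing is initialized up front.
var sized = new Array(180000);
sized[12345] = true;

// A sparse array: no size hint, populated at scattered indices.
var sparse = [];
sparse[12345] = true;
sparse[170000] = true;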

As a general rule, the array’s behavior is established shortly after construction. If you sparsely populate an array, then assign other values to the remaining elements, you’re left with a densely-populated sparse array.1

Hold it there. A densely populated sparse array? Holy leaky abstractions, Batman!

Perhaps we’re lacking some terms. If we’re trying to coax better performance from a sparse array by filling it up with placeholder values, that doesn’t make it a dense array—not from your program’s perspective, at least. To describe it from the interpreter’s perspective, I’ll refer to an array that’s initialized after creation—with live or dead data—as a cleared array. An uninitialized array that is populated at random indices is a default array, regardless of how dense it eventually becomes. Finally, a sized array is, well, a sized array. So, that “densely populated sparse array” is now a “densely populated default array.” That’s still an awkward phrase, but at least it’s not an oxymoron.

Now that we have our terms straight, we can finally get to some data; let’s start with scalability. This graph represents the time taken to read about 30,000,000 values from arrays of various sizes, using three Windows browsers. Seven data sets are presented: cleared and default arrays for each of IE8, Google Chrome 3.0, and Firefox 3.5, plus sized arrays for IE8.

Array Size Vs. Access Time: as array size increases, run time may increase as well. (Click for larger graph with legend)

Note that in three of the seven cases, time grows linearly as size increases2. This growth effectively multiplies the program’s complexity by O(N). That is, an algorithm that might normally have O(N) performance instead shows O(N²) performance, and so on. This can have a sizable impact on scalability.

The graph makes it clear that the fastest arrays are cleared. However, making a very sparse cleared array would gobble up a large swath of memory for a small number of values. For such cases, you may want to stick with sparse or sized arrays. Here’s a closer look at their performance.

The following tests all used arrays of 180,000 elements. Each data point represents performance with a different “hit ratio”; that is, the fraction of reads that land on defined values in the array. While array density also varied, its effect appears in only one case, which I’ll show separately3. For reference, these graphs also show the performance with cleared arrays.
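
To be concrete about what the hit ratio means, here’s a simplified sketch of the kind of read loop involved (the function and parameter names are just for illustration; the real benchmark differs in its details):

// Simplified read loop: "indices" is a pre-built list of indices to read.
// The hit ratio is the fraction of those reads that land on defined
// elements of "arr".
function readAll(arr, indices) {
    var sum = 0;
    for (var i = 0; i < indices.length; i++) {
        var value = arr[indices[i]];
        if (value !== undefined) {
            sum += value;   // a "hit"
        }
    }
    return sum;
}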

This graph shows the performance of sized, default, and cleared arrays in IE8. Notice that the “sized” trendline (in blue) begins significantly above the “default” trendline, and ends slightly above the “cleared” trendline. Clearly, IE optimizes sized arrays to work better when reading defined values.

Array Hits Vs. Access Time in IE8: performance of defined and undefined array elements. (Click for larger graph with legend)

Firefox 3.5 also looks slower when accessing undefined elements, but there’s no real gain from passing a size parameter to the array constructor:

Array Hits Vs. Access Time in Firefox 3.5: performance of defined and undefined array elements. (Click for larger graph with legend)

In Chrome, array access times are clustered into high and low ranges of values, based on array density. Chrome performs markedly better with arrays having a density of 10% or higher, vs. arrays of lower density. The data sets below have been split accordingly. Since there’s much more variation between these two ranges than there is within them, it’s a good bet that the 10% threshold is hard-coded somewhere in Chrome’s V8 JavaScript interpreter.

Array Hits Vs. Access Time in Chrome, by array density. (Click for larger graph with legend)

Notes on sized arrays

It makes sense that if a size is passed to the array constructor, and the size value is accurate, then the interpreter can optimize the array for that size. The problem is that the size value won’t always be accurate; nor does it suggest how dense the array might become. Hence the interpreter may not be able to rely on the size value even when it’s present.

Sized array behavior in IE8 isn’t what I’d expected. I’d found a note written prior to IE8’s release by one of its developers, which I had taken to mean that these arrays should perform like cleared arrays, but that’s clearly not the case. Sized arrays in IE8 actually incur a small penalty when accessing undefined array elements, to the point where if you access only undefined elements, a sized array may be slower than a default array. On a closer reading, the note refers to “any indexed entry” within the array’s range [emphasis mine]. Of course, undefined entries wouldn’t be indexed. The note was accurate, but arguably didn’t go far enough.

In Chrome, using an explicit size for a sparse array does seem to make a small but measurable difference in some cases. But on the whole, there’s little reason to use this for improved performance, especially since Chrome is currently the browser least in need of a performance boost in these tests.

The performance of Firefox 3.5 suggests that it ignores the constructor’s size parameter. However, a quick trip through the browser’s source code indicates that the size should make a difference. Perhaps other kinds of tests would be able to draw this out, or perhaps there’s an opportunity for improvement in the code.

Memory Consumption

Measuring the physical size of JavaScript data is a sketchy undertaking. About the best one can do is to compare the virtual memory size of the browser process before and after creating a large array. This isn’t going to be very accurate. All the same, where there’s a large difference between values, it’s probably significant.

Browser             | Size Estimate, Default / Sized Arrays (Bytes) | Size Estimate, Cleared Arrays (Bytes)
Internet Explorer 8 | (# of values) x (approximately 76)            | (Length of array) x (approx. 46)
Firefox 3.5         | (# of values) x (approx. 63)                  | (Length of array) x (approx. 5)
Google Chrome 3.0   | (Length of array) x (30–70)                   | (Length of array) x (approx. 14)

In my last post, I wrote that a sparse 12K array of integers, implemented as a hash table, would likely consume over 36K of memory. If these measurements are reasonable, that was a gross understatement; a sparse 12K array could actually consume more than 70 bytes per element, or upwards of 860K.

At the extremes, for two arrays with the same length, one that’s very dense may use less memory than one of lower density—even though the lower density array, by definition, contains fewer values. Chrome and Firefox seem to account for this internally as they organize the array structure, but I’m not sure whether IE8 does.

Recommendations

Remember that premature optimization is folly, and all optimization has its costs. If you use the hints that I’ve described, keep in mind that the interpreter isn’t obliged to obey your intentions. Consider creating a factory method for arrays, so that as the rules change, you can more easily adapt your code to suit (there’s a sketch after this list).
  • Use cleared arrays if speed is critical, or if the array reaches around 50% density. But don’t use them habitually, especially not for sparse arrays.
  • Because of IE8’s quirks, you should think twice before creating sized arrays. They’re helpful only if you have sparse data and you won’t often access an undefined array element.
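
Here’s the sort of factory I have in mind; this is only a sketch, and the names (makeArray, expectedSize, expectedDensity) and the 50% threshold are illustrative rather than prescriptive:

// Sketch of an array factory. Centralizing creation means that when the
// interpreters' heuristics change, only this function needs to change.
function makeArray(expectedSize, expectedDensity) {
    if (expectedDensity >= 0.5) {
        // Dense enough to justify the memory: pre-fill a contiguous
        // range so the interpreter can use its fast path.
        var arr = new Array(expectedSize);
        for (var i = 0; i < expectedSize; i++) {
            arr[i] = null;
        }
        return arr;
    }
    // Otherwise, leave the array sparse and accept slower access.
    return [];
}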

Looking ahead

JavaScript’s arrays are pleasant enough in normal use; but like all abstractions, they have their leaks. As JavaScript is used for increasingly sophisticated applications, it might be worthwhile for its designers to take a fresh look at scalability. There are ways to get the desired results, but negotiation via secret handshake doesn’t scale terribly well.

Without marring JavaScript’s simplicity, it should be possible to extend the language so that, when necessary, the developer can make plain to the interpreter what it should expect for a particular array. For example, this could mean adding APIs or syntax that let the developer declare the array’s expected size, density, and so forth. Or the problem could be addressed from the other direction: Simply allow the developer to request a structure that favors higher speed, or more compactness, for a particular array.

And although read-only objects may be a special case, I started down this path by looking at an array used as a lookup table which is effectively read-only after initialization. It would be nice if I could let the interpreter know this. A version of Ruby’s freeze method would fit the bill. This would give the interpreter a hint to optimize the array for read-only access, though it wouldn’t be required to do so.

Dreaming further, I’m also holding out hope that closures can be better optimized. After initialization, the lookup table in my version of String.trim() was only accessible to a single, read-only method. If the interpreter can verify that nothing’s going to change the array, it could move it to faster storage. Yes, this brings us back around to secret handshakes. But since closures have their own merits, I see this more as the icing on the cake.


1 Making a fresh copy using slice(0) should get you better performance. (Well, it’s worked for me, but I make no promises.)

2 There are really four cases, but the fourth one (cleared arrays in IE8) shows very slow growth in runtime. Also note that the numbers suggest that Chrome’s performance improves slightly as array size increases. This may be an artifact of the benchmark.

3 Density and hit ratio can be covariant, but here they aren’t. That is, these tests were designed so that, as long as the density was under 100%, the hit ratio could vary independently.

JavaScript Array Performance: Initialize to Optimize

After delving into the issue of JavaScript array performance, I came upon a pair of blog entries on MSDN addressing the topic. These posts describe situations where the then-forthcoming IE8 could provide more performant arrays. The blog also partially confirms, and partially refutes, my thoughts on how JavaScript interpreters handle arrays. The blog’s focus, of course, is the JScript engine in IE, but the advice that it offers seems to work well with other major browsers.

The blog confirms that IE’s JavaScript interpreter manages arrays using a nonlinear structure, essentially a hash table. I’d guessed at this after seeing lower performance in trimOOWB when its lookup table was implemented as a large, and sparse, array. But contrary to my guess, this isn’t about avoiding re-allocation bottlenecks. It’s for dealing with sparse arrays without consuming unreasonable amounts of memory1. (I’m feeling a little sheepish about this: Even though I’d noted the sparseness of the lookup table, I hadn’t made the connection. In fact, I’d been somewhat dismissive of that possibility. Live and learn.)

The blog also says that as of IE8, the JScript engine has heuristics for determining if an array is “dense”. It implements a dense array with a linear, array-like index in addition to its standard hash-like data structure; this index can make random access into the array much faster. The blog indicates that IE8 considers an array to be dense if either of the following conditions is true:

  • You construct the array with an explicit size, and you don’t grow the array past that initial size.
  • After creating the array, you initialize a contiguous range of indices, starting from 0, up to and including the highest index that you expect to use. You must do this before writing into random indices within the array.2
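
In code, the two techniques look roughly like this (the sizes and indices are placeholders):

// Technique 1: construct with an explicit size, and never grow past it.
var byConstructor = new Array(5000);

// Technique 2: initialize a contiguous range, from index 0 up through the
// highest index you expect to use, before any random-index writes.
var byInitialization = [];
for (var i = 0; i < 5000; i++) {
    byInitialization[i] = 0;
}
byInitialization[1234] = 42;   // random-index writes come afterward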

As the writer says, using either of these two techniques should ensure that the array heuristics in IE 8 will treat your array as a dense array. The interpreter is free to decide to treat other arrays as dense, too. However, the testing I’ve done suggests that it’s not quite as simple as this. Contrary to the blog’s guidelines, I found that:

  • Creating the array with an explicit size (whether or not followed by initializing the elements) had no measurable impact on the speed of the trim method.
  • Initializing the lookup table’s full range of 12,289 elements (the vast majority of which are not whitespace) did improve performance.

This finding is specific to the trim method, and so these results should not be used as general performance guidelines. The usage pattern for a particular array can greatly affect its performance profile. I’ve found unrelated cases where specifying the array’s size could help or hurt performance. (I’ll write more about this in my next post.)

Following up on these hints led to a new version of trim that’s simpler and faster than my previous effort. This version is actually more closely related to the trim17 method from Yesudeep Mangalapilly than it is to trimOOWB. Hence I’ve named this newer one trim18 to reaffirm its heritage. (In Steve Levithan’s original post on JavaScript trim methods, he assigned numbers to each of the trim implementations that he examined. Yesudeep took up that naming scheme, and now I’m doing so as well.)

An Intermediary Thought

Remember that premature optimization is folly, and all optimization has its costs. In this case, we’re buying performance by fully initializing a 12K JavaScript array. If the interpreter stores this using a hash table as well as an array, then the table will likely consume over 36K of memory. Is the benefit worth the cost? It could be, especially if we’re not running on a cell phone. But please don’t take away from this that you should initialize every array that you create. That would ultimately be self-defeating: One of the worst things you can do for performance is to consume more memory than you really need.

Performance

Since the last post, I’ve revised the benchmark for better precision. The structure is the same—both benchmarks call the three trim methods repeatedly with a fixed set of input data—but in the newer benchmark, the input strings are much longer, and the number of iterations is lower. This should yield more precise measurements of execution time. All the same, these results should be used with care; as is often said in the U.S.A, your mileage will vary.
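
For reference, the overall shape of the benchmark is roughly as follows; this is only a sketch, with illustrative names, and the actual input strings, iteration counts, and timing details differ:

// Sketch of the benchmark structure: time repeated calls to a single trim
// implementation over a fixed set of input strings.
function timeTrim(trimFn, inputs, iterations) {
    var start = new Date().getTime();
    for (var i = 0; i < iterations; i++) {
        for (var j = 0; j < inputs.length; j++) {
            trimFn(inputs[j]);
        }
    }
    return new Date().getTime() - start;   // elapsed time in milliseconds
}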

The benchmark results are shown below. Please remember that these tables are not directly comparable to the numbers in my previous post.

ASCII data (only spaces and tabs used for whitespace)

Browser             | trim17 | trimOOWB | % saved vs. trim17 | trim18 | % saved vs. trim17
Internet Explorer 8 | 74,453 | 69,609   | 6.5                | 69,922 | 6.1
Firefox 3.5         |  6,776 |  3,732   | 44.9               |  3,003 | 55.7
Google Chrome 3.0   |  2,530 |    824   | 67.4               |    754 | 70.2

Unicode data (using all ASCII and Unicode whitespace characters)

Browser             | trim17 | trimOOWB | % saved vs. trim17 | trim18 | % saved vs. trim17
Internet Explorer 8 | 76,188 | 74,468   | 2.3                | 72,484 | 4.9
Firefox 3.5         |  6,779 |  5,025   | 25.6               |  3,029 | 55.3
Google Chrome 3.0   |  2,780 |    940   | 66.2               |    810 | 70.9

trim18 implementation


I left the explicit size in the array constructor call, even though I’d found no benefit from using it. It seems unlikely to cause any harm, and there may be environments where this method’s performance might benefit from it.

var trim18 = (function() {

    var tableSize = 0x3000 + 1;
    var whiteSpace = new Array(tableSize);

    // Initialize the array elements before populating the data.
    // (This may help performance, by hinting to the interpreter that
    //  the array should not be managed as a sparse array.)

    for (var i = 0; i < tableSize; i++) {
        whiteSpace[i] = false;
    }

    whiteSpace[0x0009] = true;  whiteSpace[0x000a] = true;
    whiteSpace[0x000b] = true;  whiteSpace[0x000c] = true;
    whiteSpace[0x000d] = true;  whiteSpace[0x0020] = true;
    whiteSpace[0x0085] = true;  whiteSpace[0x00a0] = true;
    whiteSpace[0x1680] = true;  whiteSpace[0x180e] = true;
    whiteSpace[0x2000] = true;  whiteSpace[0x2001] = true;
    whiteSpace[0x2002] = true;  whiteSpace[0x2003] = true;
    whiteSpace[0x2004] = true;  whiteSpace[0x2005] = true;
    whiteSpace[0x2006] = true;  whiteSpace[0x2007] = true;
    whiteSpace[0x2008] = true;  whiteSpace[0x2009] = true;
    whiteSpace[0x200a] = true;  whiteSpace[0x200b] = true;
    whiteSpace[0x2028] = true;  whiteSpace[0x2029] = true;
    whiteSpace[0x202f] = true;  whiteSpace[0x205f] = true;
    whiteSpace[0x3000] = true;

    function trim18(str) {
        var len = str.length, ws = whiteSpace, i = 0;
        // Scan backward past any trailing whitespace. charCodeAt returns
        // NaN for out-of-range indices, so the loop also stops at index -1.
        while (ws[str.charCodeAt(--len)]);
        // len is now the index of the last non-whitespace character;
        // ++len turns it into an exclusive end index. If nothing is left,
        // skip the forward scan entirely.
        if (++len){
            // Scan forward past any leading whitespace.
            while (ws[str.charCodeAt(i)]){ ++i; }
        }
        return str.substring(i, len);
    }

    return trim18;
})();

1 It’s also relevant that every JavaScript object needs a hash table to manage properties. Hence sparse arrays can be implemented easily by using this hash table for the same purpose. (Considering the design of JavaScript’s for...in loop statement, it looks as if the language designers intended this.)
2 That is, you would need to initialize the array if you otherwise won’t be populating all of its elements, or if you won’t be populating them in strict order starting from 0. On the other hand, if your script would normally add data to the array starting from index 0 and working up from there, leaving no gaps, then you’re already squared away with IE8’s heuristics.