Out Of What Box?

JavaScript Array Performance, And Why It Matters

December 1 2009 by Dan Breslau in JavaScript, Software Development, Web |1 Comment

My last post described how a JavaScript array had become a performance bottleneck. Here, I’ll delve further into how some JavaScript programs become array-bound, and how to break that bind when you need to.

Let’s start with this:

JavaScript array elements can be much slower than scalar variables

As the graph shows, JavaScript array performance ranges from OK (nearly equivalent to scalars) to awful. The reasons for poor performance vary, but most can be boiled down to this: The JavaScript interpreter is essentially left guessing about how an array will be used. And guess it does. When an array is constructed and initialized, the interpreter can observe where data is stored into a new array. Certain kinds of patterns may nudge it to optimize for better speed, or lower memory consumption. So, performance suffers when the interpreter makes the wrong interpretation.

I looked into performance of JavaScript arrays with three popular Windows browsers: Internet Explorer 8, Google Chrome 3.0, and Firefox 3.5. Broadly speaking, these all seem to look for two or three types of arrays:

Dense arrays: These provide faster access to individual elements of the array.
Sparse arrays: These are typically optimized for lower memory use.
Sized arrays: A variant of sparse arrays that are optimized for speed for certain cases.

The interpreters optimize for dense arrays if the array’s elements are initialized in a continuous range, starting at index 0. This initialization can be done before there’s any data for the array, by using placeholders such as 0, null, or even undefined. (There are other ways to get this optimization; this is simply the most reliable approach.)

Passing the array’s size in the constructor may also bring better performance. In my testing, I saw this only with sparse arrays in IE8 and (perhaps) in Chrome. I’ll refer to these as sized arrays; see below for some additional notes.

If an array is created as neither a dense nor a sized array, it’s treated as a sparse array.

As a general rule, the array’s behavior is established shortly after construction. If you sparsely populate an array, then assign other values to the remaining elements, you’re left with a densely-populated sparse array.¹

Hold it there. A densely populated sparse array? Holy leaky abstractions, Batman!

Perhaps we’re lacking some terms. If we’re trying to coax better performance from a sparse array by filling it up with placeholder values, that doesn’t make it a dense arrayâ€”not from your program’s perspective, at least. To describe it from the interpreter’s perspective, Iâ€™ll refer to an array thatâ€™s initialized after creationâ€”with live or dead dataâ€”as a cleared array. An uninitialized array that is populated at random indices is a default array, regardless of how dense it eventually becomes. Finally, a sized array is, well, a sized array. So, that “densely populated sparse array” is now a “densely populated default array.” That’s still an awkward phrase, but at least it’s not an oxymoron.

Now that we can finally get to some data, let’s discuss scalability. This graph represents the time taken to read about 30,000,000 values from arrays of various sizes, using three Windows browsers. Seven data sets are presented: Cleared and default arrays for each of IE8, Google Chrome 3.0, and Firefox 3.5, plus sized arrays for IE8.

As array size increases, run time may increase as well. (Click for larger graph with legend)

Array Size Vs. Access Time. (Click for larger graph with legend)

Note that in three of the seven cases, time grows linearly as size increases². This growth effectively multiplies the program’s complexity by O(N). That is, an algorithm that might normally have O(N) performance instead shows O(N²) performance, and so on. This can have a sizable impact on scalability.

The graph makes it clear that the fastest arrays are cleared. However, making a very sparse cleared array would gobble up a large swath of memory for a small number of values. For such cases, you may want to stick with sparse or sized arrays. Here’s a closer look at their performance.

The following tests all used arrays of 180,000 elements. Each data point represents performance with a different “hit ratio”; that is, the ratio of defined values read from the array. While array density also varied, its effect appears in only one case, which I’ll show separately³. For reference, these graphs also show the performance with cleared arrays.

This graph shows the performance of sized, default, and cleared arrays in IE8. Notice that the “sized” trendline (in blue) begins significantly above the “default” trendline, and ends slightly above the “cleared” trendline. Clearly, IE optimizes sized arrays to work better when reading defined values.

Performance of defined and undefined array elements in IE8. (Click for larger graph with legend)

Array Hits Vs. Access Time in IE8. (Click for larger graph with legend)

Firefox 3.5 also looks slower when accessing undefined elements, but there’s no real gain from passing a size parameter to the array constructor:

Performance of defined and undefined array elements in Firefox 3.5. (Click for larger graph with legend)

Array Hits Vs. Access Time in Firefox 3.5. (Click for larger graph with legend)

In Chrome, array access times are clustered into high and low ranges of values, based on array density. Chrome performs markedly better with arrays having a density of 10% or higher, vs. arrays of lower density. The data sets below have been split accordingly. Since there’s much more variation between these two ranges than here is among them, it’s a good bet that the 10% threshold is hard-coded somewhere in Chrome’s V8 JavaScript interpreter.

Array Hits Vs. Access Time in Chrome, by array density. (Click for larger graph with legend)

Notes on sized arrays

It makes sense that if a size is passed to the array constructor, and the size value is accurate, then the interpreter can optimize the array for that size. The problem is that the size value won’t always be accurate; nor does it suggest how dense the array might become. Hence the interpreter may not be able to rely on the size value even when it’s present.

Sized array behavior in IE8 isn’t what I’d expected. I’d found a note written prior to IE8’s release by one of its developers, which I had taken to mean that these arrays should perform like cleared arrays, but that’s clearly not the case. Sized arrays in IE8 actually incur a small penalty when accessing undefined array elements, to the point where if you access only undefined elements, a sized array may be slower than a default array. On a closer reading, the note refers to “any indexed entry” within the array’s range [emphasis mine.] Of course, undefined entries wouldn’t be indexed. The note was accurate, but arguably didn’t go far enough.

In Chrome, using an explicit size for a sparse array does seem to make a small but measurable difference in some cases. But on the whole, there’s little reason to use this for improved performance, especially since Chrome is currently the browser least in need of a performance boost in these tests.

The performance of Firefox 3.5 suggests that it ignores the constructor’s size parameter. However, a quick trip through the browser’s source code indicates that the size should make a difference. Perhaps other kinds of tests would be able to draw this out, or perhaps there’s an opportunity for improvement in the code.

Memory Consumption

Measuring the physical size of JavaScript data is a sketchy undertaking. About the best one can do is to compare the virtual memory size of the browser process before and after creating a large array. This isn’t going to be very accurate. All the same, where there’s a large difference between values, it’s probably significant.

`Browser`	`Size Estimate, Default / Sized Arrays (Bytes)`	`Size Estimate, Cleared Arrays (Bytes)`
Internet Explorer 8	(# of values) x (approximately 76)	(Length of array) x (approx. 46)
Firefox 3.5	(# of values) x (approx. 63)	(Length of array) x (approx. 5)
Google Chrome 3.0	(Length of array) x (30â€“70)	(Length of array) x (approx. 14)

In my last post, I wrote that a sparse 12K array of integers, implemented as a hash table, would likely consume over 36K of memory. If these measurements are reasonable, that was a gross understatement; a sparse 12K array could actually consume more than 70 bytes per element, or upwards of 860K.

At the extremes, for two arrays with the same length, one that’s very dense may use less memory than one of lower densityâ€”even though the lower density array, by definition, contains fewer values. Chrome and Firefox seem to account for this internally as they organize the array structure, but I’m not sure whether IE8 does.

Recommendations

Remember that premature optimization is folly, and all optimization has its costs. If you use the hints that I’ve described, keep in mind that the interpreter isn’t obliged to obey your intentions. Consider creating a factory method for arrays, so that if as the rules change, you can most easily adapt your code to suit.

Use cleared arrays if speed is critical, or if the array reaches around 50% density. But don’t use them habitually, especially not for sparse arrays.
Because of IE8’s quirks, you should think twice before creating sized arrays. They’re helpful only if you have sparse data and you won’t often access an undefined array element.

Looking ahead

JavaScript’s arrays are pleasant enough in normal use; but like all abstractions, they have their leaks. As JavaScript is used for increasingly sophisticated applications, it might be worthwhile for its designers to take a fresh look at scalability. There are ways to get the desired results, but negotiation via secret handshake doesn’t scale terribly well.

Without marring JavaScript’s simplicity, it should be possible to extend the language so that, when necessary, the developer can make plain to the interpreter what it should expect for a particular array. For example, this could mean adding APIs or syntax that let the developer declare the array’s expected size, density, and so forth. Or the problem could be addressed from the other direction: Simply allow the developer to request a structure that favors higher speed, or more compactness, for a particular array.

And although read-only objects may be a special case, I started down this path by looking at an array used as a lookup table which is effectively read-only after initialization. It would be nice if I could let the interpreter know this. A version of Ruby’s freeze method would fit the bill. This would give the interpreter a hint to optimize the array for read-only access, though it wouldn’t be required to do so.

Dreaming further, I’m also holding out hope that closures can be better optimized. After initialization, the lookup table in my version of String.trim() was only accessible to a single, read-only method. If the interpreter can verify that nothing’s going to change the array, it could move it to faster storage. Yes, this brings us back around to secret handshakes. But since closures have their own merits, I see this more as the icing on the cake.

¹ Making a fresh copy using slice(0) should get you better performance. (Well, it’s worked for me, but I make no promises.)

² There are really four cases, but the fourth one (cleared arrays in IE8) shows very slow growth in runtime. Also note that the numbers suggest that Chrome’s performance improves slightly as array size increases. This may be an artifact of the benchmark.

³ Density and hit ratio can be covariant, but here they aren’t. That is, these tests were designed so that, as long as the density was under 100%, the hit ratio could vary independently.

Tags: JavaScript, JavaScript arrays, JavaScript performance

One Comment

Tweets that mention JavaScript Array Performance, And Why It Matters | Out Of What Box? -- Topsy.com says:

December 3, 2009 at 5:57 pm

[…] This post was mentioned on Twitter by Alexandre Morgaut, Christian Blanquera. Christian Blanquera said: He said what I was thinking. http://icio.us/1ecruf […]

Reply

S	M	T	W	T	F	S
		1	2	3	4	5
6	7	8	9	10	11	12
13	14	15	16	17	18	19
20	21	22	23	24	25	26
27	28	29	30	31

JavaScript Array Performance, And Why It Matters

Notes on sized arrays

Memory Consumption

Recommendations

Looking ahead

One Comment

Leave a Reply

Recent Posts

Blogroll

WordPress