Back in 2007, Steve Levithan compared the speed of different implementations for the missing JavaScript String.trim()
function. Steve’s blog post has launched The Comment Thread That Will Not Die, as a number of folks have been tempted to try their hand at writing their own implementation.
Count me in.
It started when Yesudeep Mangalapilly’s version caught my attention. Yesudeep, working with an idea from Michael Lee Finney, had a fast implementation that didn’t use regular expressions. Instead of regexps, Yesudeep’s and Michael’s versions scan the string one character at a time, from the front and back ends, checking each character against a lookup table to determine if it’s whitespace.
However: The largest Unicode code point that’s counted as whitespace is U+3000 (pdf) (12288 in decimal), the Ideographic Space character. Hence, the lookup table array in Michael’s and Yasudeep’s implementations has a length of 12289, with most entries undefined. That’s a pretty large array, and a pretty sparse one.
Even though these were already among the fastest of the trim
s, I wondered if using a large array as a lookup table might carry any performance penalty. My concern stemmed from the fact that JavaScript arrays grow dynamically, adjusting in size to hold the highest index assigned into them. This poses a challenge to the interpreter: If it always places an array in a linear block of memory (as in C++), then accommodating array growth is likely to be a problem. So, to allow for reasonable performance at the array grows, the interpreter might not use a linear storage model for arrays. Non-linear models (trees or linked lists, for example) may make random access to the array slower, but would allow for reasonable performance when growing the array, while exhibiting reasonable memory consumption1.
Through testing in three popular browsers, I found reason to be concerned about large arrays. I profiled the trim17
method (Yasudeep’s implementation), using input strings that contained only spaces and tabs for whitespace. After this profiling run, I trimmed trim17
‘s lookup table—hacked it, really—by removing all entries above U+0020, and so limiting it to recognizing only ASCII whitespace chars. Then I profiled it again.
The table below shows the milliseconds spent within the two versions of trim17
; the difference between their runtimes is most likely due to the change in size of the lookup table array.
Original trim17 |
Reduced trim17 |
% Saved | |
---|---|---|---|
Internet Explorer 8 |
27,328 |
20,281 |
26 |
Firefox 3.5 |
3,689 |
2,978 |
20 |
Chrome 3.0 |
610 |
191 |
69 |
These results confirm that, in JavaScript, accessing larger arrays can be slower than accessing smaller arrays.
But, perhaps applying these results to all arrays is an overgeneralization. Ideally, at least, it should be possible for an interpreter to recognize a “read-only” array, and use a more efficient layout for it. That is, if the interpreter can verify that an array isn’t modified after its initial construction, then perhaps it can safely flatten out the array into a linear block of memory.
The array in trim17
was constructed as a property of the String
prototype. An array couldn’t be more modifiable than that, and so I wouldn’t expect it to be flattened by the interpreter. But suppose the array were accessible only from within a single function (a closure). In that case, if that single function isn’t modifying the array, we know that nothing will. Depending on the JavaScript interpreter, that might allow for better performance.
I changed the code accordingly, but the actual improvement in speed was… unremarkable. Nonexistent, even. It may have eked out around a 5% gain in some tests, but there’s enough noise in the measurements that it’s hard to be sure. Still, I dislike globals (and, especially, modifiable globals), so I decided to stick with the closure. (Besides, maybe someday, somewhere, an optimizing interpreter will know how to make use of it.)
The next approach was to try using a smaller array. Or, rather, two smaller arrays: One to represent the whitespace characters at and below U+0020, and another to represent the whitespace characters between U+2000 and U+205f. To keep the second array small, its indices are offset by –0x2000
; this gives it a size of 0x0060
(96 decimal) entries. (There are three whitespace values that aren’t in either array: U+1680, U+180e, and U+3000. The new code checks for these explicitly.)
Even with smaller arrays, I found that random access into them is still slower than making a few comparisons on scalar variables. Hence the code is written so that, for any character value, it consults no more than one of the two lookup tables, and then only if the character is in a reasonable range for that table.
Here’s the new method:
var trimOOWB = (function() { var whiteSpace = new Array(0x00a0 + 1); whiteSpace[0x0009] = true; whiteSpace[0x000a] = true; whiteSpace[0x000b] = true; whiteSpace[0x000c] = true; whiteSpace[0x000d] = true; whiteSpace[0x0020] = true; whiteSpace[0x0085] = true; whiteSpace[0x00a0] = true; var whiteSpace2 = new Array(0x005f + 1); var base = 0x2000; whiteSpace2[0x2000 - base] = true; whiteSpace2[0x2001 - base] = true; whiteSpace2[0x2002 - base] = true; whiteSpace2[0x2003 - base] = true; whiteSpace2[0x2004 - base] = true; whiteSpace2[0x2005 - base] = true; whiteSpace2[0x2006 - base] = true; whiteSpace2[0x2007 - base] = true; whiteSpace2[0x2008 - base] = true; whiteSpace2[0x2009 - base] = true; whiteSpace2[0x200a - base] = true; whiteSpace2[0x200b - base] = true; whiteSpace2[0x2028 - base] = true; whiteSpace2[0x2029 - base] = true; whiteSpace2[0x202f - base] = true; whiteSpace2[0x205f - base] = true; function trimOOWB2(str) { var ws = whiteSpace, ws2 = whiteSpace2; var i=0, len=str.length, ch; while ((ch = str.charCodeAt(--len)) && (ch <= 0x00A0 ? ws[ch] : (ch >= 0x2000 ? (ch===0x3000 || ws2[ch - 0x2000]) : (ch===0x1680 || ch===0x180e )))) ; if (++len) { while ((ch = str.charCodeAt(i)) && (ch <= 0x00A0 ? ws[ch] : (ch >= 0x2000 ? (ch===0x3000 || ws2[ch - 0x2000]) : (ch===0x1680 || ch===0x180e )))) { ++i; } } return str.substring(i, len); } return trimOOWB2; })();
I benchmarked the new function (trimOOWB
) and Yesudeep’s trim17
function, using two sets of input strings. In the first test, only legacy ASCII whitespace characters (e.g., spaces and tabs) were used, as in the test above. The second test data set used the full set of whitespace in the Unicode character set. The numbers shown are milliseconds spent within the trim functions; lower is better.
trim17 (ASCII) |
trimOOWB (ASCII) |
% saved |
trim17 (Unicode) |
trimOOWB (Unicode) |
% saved |
|
---|---|---|---|---|---|---|
Internet Explorer 8 | 25,953 | 22,359 | 14 | 27,469 | 26,500 | 4 |
Firefox 3.5 | 3,706 | 3,089 | 17 | 3,831 | 3,439 | 10 |
Google Chrome 3.0 | 604 | 139 | 77 | 664 | 166 | 75 |
A final thought
It’s interesting that there’s a direct correlation between the base time for the browser to execute trim17
, and the percentage of time saved by trimOOWB
. In other words, the percentage of performance boost from using smaller arrays increases as the browser’s speed increases: IE showed the highest time and lowest gain, followed by Firefox on both counts, and finishing with Chrome, which had the lowest base time and the highest percentage gain.
I’m guessing here, but I think the easiest way to explain this is that all three JavaScript interpreters are using roughly equivalent strategies for managing arrays. The percentages gained are different because the same absolute time savings in Chrome results in a higher relative performance boost when compared to IE or Firefox.
That raises a question, though: Is it possible that the structure of trimOOWB
gives it any performance advantage over trim17
aside from the savings generated through reducing the array size? I’ve looked for such artifacts in the tests. In short, and skipping the details for now, I think it’s likely that such artifacts couldn’t account for more than one quarter of the overall speed boost; it’s probably much less than that. It’s at least as likely that the overhead added by trimOOWB
is obscuring part of the overall performance boost.
Another final thought
The multiple comparisons made in trimOOWB
might raise the question: Why not try using a switch
statement instead of all those conditionals? Well, I did try this, with mixed results. On one hand, Firefox showed a significant speedup, around 25%. On the other hand, IE may have been a little slower, and Chrome was more than twice as slow. (Besides, it was a switch
statement. We’re looking for speed, but a fellow’s got to have some standards.)
A final final thought
Using a closure offers another advantage: It’s simple to redirect calls to trim
to the native String.trim() function, if the browser supports it. All that’s required is to change the return
in the outer (anonymous) function from this:
return trimOOWB2;
to:
return String.trim || trimOOWB2;
ECMAScript 5.0 is slated to include String.trim
, so it’s probably worth thinking ahead for this. In Firefox 3.5—the only current browser that I know of that supports String.trim
—calls to the native String.trim
run in a fraction of the time of any of the JavaScript implementations.
String.trim
does not count U+1680 or U+180E as whitespace. It does treat U+3000 as whitespace.