Trimming trim via razing arrays (JavaScript)

Back in 2007, Steve Levithan compared the speed of different implementations for the missing JavaScript String.trim() function. Steve’s blog post has launched The Comment Thread That Will Not Die, as a number of folks have been tempted to try their hand at writing their own implementation.

Count me in.

It started when Yesudeep Mangalapilly’s version caught my attention. Yesudeep, working with an idea from Michael Lee Finney, had a fast implementation that didn’t use regular expressions. Instead of regexps, Yesudeep’s and Michael’s versions scan the string one character at a time, from the front and back ends, checking each character against a lookup table to determine if it’s whitespace.

However: The largest Unicode code point that’s counted as whitespace is U+3000 (12288 in decimal), the Ideographic Space character. Hence, the lookup table array in Michael’s and Yesudeep’s implementations has a length of 12289, with most entries undefined. That’s a pretty large array, and a pretty sparse one.
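
Roughly, the single-table approach works along these lines (a simplified sketch of the idea; the names isWhite and trimByTable are mine, and the real trim17 is structured differently):

    // A sparse lookup table, indexed by character code: every whitespace code
    // point up to U+3000 gets an entry, and everything else stays undefined.
    var isWhite = new Array(0x3000 + 1);
    var whiteCodes = [0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x20, 0x85, 0xa0,
                      0x1680, 0x180e,
                      0x2000, 0x2001, 0x2002, 0x2003, 0x2004, 0x2005, 0x2006,
                      0x2007, 0x2008, 0x2009, 0x200a, 0x200b,
                      0x2028, 0x2029, 0x202f, 0x205f, 0x3000];
    for (var k = 0; k < whiteCodes.length; k++) isWhite[whiteCodes[k]] = true;

    // Scan inward from both ends, consulting the table for each character.
    function trimByTable(str) {
        var start = 0, end = str.length;
        while (start < end && isWhite[str.charCodeAt(start)]) ++start;
        while (end > start && isWhite[str.charCodeAt(end - 1)]) --end;
        return str.substring(start, end);
    }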

Even though these were already among the fastest of the trims, I wondered if using a large array as a lookup table might carry a performance penalty. My concern stemmed from the fact that JavaScript arrays grow dynamically, adjusting in size to hold the highest index assigned into them. This poses a challenge to the interpreter: If it always places an array in a linear block of memory (as in C++), then accommodating array growth is likely to be a problem. So, to allow for reasonable performance as the array grows, the interpreter might not use a linear storage model for arrays. Non-linear models (trees or linked lists, for example) may make random access to the array slower, but would allow for reasonable performance when growing the array, while exhibiting reasonable memory consumption¹.

Through testing in three popular browsers, I found reason to be concerned about large arrays. I profiled the trim17 method (Yesudeep’s implementation), using input strings that contained only spaces and tabs for whitespace. After this profiling run, I trimmed trim17’s lookup table—hacked it, really—by removing all entries above U+0020, limiting it to recognizing only ASCII whitespace characters. Then I profiled it again.

The table below shows the milliseconds spent within the two versions of trim17; the difference between their runtimes is most likely due to the change in size of the lookup table array.

                          Original trim17   Reduced trim17   % Saved
    Internet Explorer 8            27,328           20,281        26
    Firefox 3.5                     3,689            2,978        20
    Chrome 3.0                        610              191        69

These results confirm that, in JavaScript, accessing larger arrays can be slower than accessing smaller arrays.


But, perhaps applying these results to all arrays is an overgeneralization. Ideally, at least, it should be possible for an interpreter to recognize a “read-only” array, and use a more efficient layout for it. That is, if the interpreter can verify that an array isn’t modified after its initial construction, then perhaps it can safely flatten out the array into a linear block of memory.

The array in trim17 was constructed as a property of the String prototype. An array couldn’t be more modifiable than that, and so I wouldn’t expect it to be flattened by the interpreter. But suppose the array were accessible only from within a single function (a closure). In that case, if that single function isn’t modifying the array, we know that nothing will. Depending on the JavaScript interpreter, that might allow for better performance.

I changed the code accordingly, but the actual improvement in speed was… unremarkable. Nonexistent, even. It may have eked out around a 5% gain in some tests, but there’s enough noise in the measurements that it’s hard to be sure. Still, I dislike globals (and, especially, modifiable globals), so I decided to stick with the closure. (Besides, maybe someday, somewhere, an optimizing interpreter will know how to make use of it.)

The next approach was to try using a smaller array. Or, rather, two smaller arrays: One to represent the whitespace characters at and below U+00A0, and another to represent the whitespace characters between U+2000 and U+205F. To keep the second array small, its indices are offset by -0x2000; this gives it a size of 0x0060 (96 decimal) entries. (There are three whitespace values that aren’t in either array: U+1680, U+180E, and U+3000. The new code checks for these explicitly.)

Even with smaller arrays, I found that random access into them is still slower than making a few comparisons on scalar variables. Hence the code is written so that, for any character value, it consults no more than one of the two lookup tables, and then only if the character is in a reasonable range for that table.

Here’s the new method:

var trimOOWB = (function() {
    // Whitespace at and below U+00A0.
    var whiteSpace = new Array(0x00a0 + 1);
    whiteSpace[0x0009] = true;    whiteSpace[0x000a] = true;
    whiteSpace[0x000b] = true;    whiteSpace[0x000c] = true;
    whiteSpace[0x000d] = true;    whiteSpace[0x0020] = true;
    whiteSpace[0x0085] = true;    whiteSpace[0x00a0] = true;

    // Whitespace between U+2000 and U+205F, indexed with an offset of -0x2000
    // to keep the array small.
    var whiteSpace2 = new Array(0x005f + 1);
    var base = 0x2000;
    whiteSpace2[0x2000 - base] = true;  whiteSpace2[0x2001 - base] = true;
    whiteSpace2[0x2002 - base] = true;  whiteSpace2[0x2003 - base] = true;
    whiteSpace2[0x2004 - base] = true;  whiteSpace2[0x2005 - base] = true;
    whiteSpace2[0x2006 - base] = true;  whiteSpace2[0x2007 - base] = true;
    whiteSpace2[0x2008 - base] = true;  whiteSpace2[0x2009 - base] = true;
    whiteSpace2[0x200a - base] = true;  whiteSpace2[0x200b - base] = true;
    whiteSpace2[0x2028 - base] = true;  whiteSpace2[0x2029 - base] = true;
    whiteSpace2[0x202f - base] = true;  whiteSpace2[0x205f - base] = true;

    function trimOOWB2(str) {
        var ws = whiteSpace, ws2 = whiteSpace2;
        var i = 0, len = str.length, ch;

        // Scan backward from the end while the character is whitespace.
        // U+1680, U+180E, and U+3000 aren't in either table, so they're checked
        // explicitly. charCodeAt returns NaN past the start of the string,
        // which ends the loop.
        while ((ch = str.charCodeAt(--len)) &&
               (ch <= 0x00A0 ? ws[ch] :
                (ch >= 0x2000 ? (ch === 0x3000 || ws2[ch - 0x2000])
                 : (ch === 0x1680 || ch === 0x180e))))
            ;

        // If anything is left, scan forward from the start in the same way.
        if (++len) {
            while ((ch = str.charCodeAt(i)) &&
                   (ch <= 0x00A0 ? ws[ch] :
                    (ch >= 0x2000 ? (ch === 0x3000 || ws2[ch - 0x2000])
                     : (ch === 0x1680 || ch === 0x180e)))) {
                ++i;
            }
        }
        return str.substring(i, len);
    }

    return trimOOWB2;
})();
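
A few quick sanity checks, with the results I’d expect (illustrative, not captured output):

    trimOOWB("   hello\u3000\u2028  ");    // "hello"
    trimOOWB("\u00a0\u2009\u1680");        // "" (all whitespace)
    trimOOWB("no surrounding whitespace"); // unchanged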

I benchmarked the new function (trimOOWB) and Yesudeep’s trim17 function, using two sets of input strings. In the first test, only legacy ASCII whitespace characters (e.g., spaces and tabs) were used, as in the test above. The second test data set used the full set of whitespace in the Unicode character set. The numbers shown are milliseconds spent within the trim functions; lower is better.

                          trim17     trimOOWB   % saved   trim17      trimOOWB    % saved
                          (ASCII)    (ASCII)              (Unicode)   (Unicode)
    Internet Explorer 8   25,953     22,359     14        27,469      26,500       4
    Firefox 3.5            3,706      3,089     17         3,831       3,439      10
    Google Chrome 3.0        604        139     77           664         166      75

A final thought

It’s interesting that there’s a clear inverse relationship between the base time a browser takes to execute trim17 and the percentage of time saved by trimOOWB. In other words, the percentage of performance boost from using smaller arrays increases as the browser’s speed increases: IE showed the highest base time and the lowest gain, followed by Firefox on both counts, and finishing with Chrome, which had the lowest base time and the highest percentage gain.

I’m guessing here, but I think the easiest way to explain this is that all three JavaScript interpreters are using roughly equivalent strategies for managing arrays. The percentages gained are different because the same absolute time savings in Chrome results in a higher relative performance boost when compared to IE or Firefox.

That raises a question, though: Is it possible that the structure of trimOOWB gives it any performance advantage over trim17 aside from the savings generated through reducing the array size? I’ve looked for such artifacts in the tests. In short, and skipping the details for now, I think it’s likely that such artifacts couldn’t account for more than one quarter of the overall speed boost; it’s probably much less than that. It’s at least as likely that the overhead added by trimOOWB is obscuring part of the overall performance boost.

Another final thought

The multiple comparisons made in trimOOWB might raise the question: Why not try using a switch statement instead of all those conditionals? Well, I did try this, with mixed results. On one hand, Firefox showed a significant speedup, around 25%. On the other hand, IE may have been a little slower, and Chrome was more than twice as slow. (Besides, it was a switch statement. We’re looking for speed, but a fellow’s got to have some standards.)
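
For reference, that kind of switch-based test looks roughly like this (a sketch of the idea, not the exact code from the benchmark; the name isWhitespaceSwitch is made up here):

    // Every whitespace code point becomes a case label; the interpreter decides
    // how to dispatch (jump table, binary search, or a chain of comparisons).
    function isWhitespaceSwitch(ch) {
        switch (ch) {
            case 0x0009: case 0x000a: case 0x000b: case 0x000c: case 0x000d:
            case 0x0020: case 0x0085: case 0x00a0: case 0x1680: case 0x180e:
            case 0x2000: case 0x2001: case 0x2002: case 0x2003: case 0x2004:
            case 0x2005: case 0x2006: case 0x2007: case 0x2008: case 0x2009:
            case 0x200a: case 0x200b: case 0x2028: case 0x2029: case 0x202f:
            case 0x205f: case 0x3000:
                return true;
            default:
                return false;
        }
    }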

A final final thought

Using a closure offers another advantage: It’s simple to redirect calls to trim to the native String.trim() function, if the browser supports it. All that’s required is to change the return in the outer (anonymous) function from this:

    return trimOOWB2;

to:

    return String.trim || trimOOWB2;

ECMAScript 5.0 is slated to include String.trim, so it’s probably worth thinking ahead for this. In Firefox 3.5—the only current browser that I know of that supports String.trim—calls to the native String.trim run in a fraction of the time of any of the JavaScript implementations.

Note: Firefox’s native implementation of String.trim does not count U+1680 or U+180E as whitespace. It does treat U+3000 as whitespace.

¹ Yes, I wrote “JavaScript” and “reasonable memory consumption” in the same article. Go ahead and snicker.

Working Consciously In The Dish Room

Rack. Rinse. Wash. Stack.

I recently spent some time working as a dishwasher at a camp. No, I wasn’t sampling a possible new career. Truth be told, I was volunteering, and I enjoyed the work.

If that raised the brow over your mind’s eye, you’re not alone. Telling folks that I enjoyed this work raised any number of brows. To the naive observer, washing dishes doesn’t look like a very glamorous occupation. A closer inspection reveals just how very correct this naive observation is. Yet I drew myself into the task through working consciously—a term that I previously used in How To Make Mistakes, but never fully defined.

Working consciously makes mindful work of the mindless. It involves constantly challenging oneself to be more effective. It requires continuous awareness of the work, an ongoing search for better approaches to the task, and the willingness to try different methods and compare the results. It means, in short, attaining satisfaction by not being satisfied.

But I still haven’t defined it very well. Perhaps this example will help.

Rack. Rinse. Wash. Stack.

At its most basic, the dishwashing process comprises four tasks:

  • Dirty dishes are placed into racks.
  • The racks are sprayed, removing any excess bits of food.
  • The racks are run through an industrial-style dishwashing machine (the Hobart).
  • After coming out of the Hobart, the newly clean dishes are stacked on their shelves.

When dirty dishes are coming at a fast and furious pace, we need to keep the pipeline flowing, or we’ll get overrun. Rack. Rinse. Wash. Stack. The racks with pegs are for plates and bowls; there’s another kind for beverage cups; flats are for utensils and various other implements of mass consumption.

Loading each rack to its capacity allows for fewer runs of the Hobart, saving water and energy. But fully loading the racks requires time and attention; too much of either creates a bottleneck. So there’s a bit of a Tetris-like challenge in loading the racks, except that the game doesn’t end just because you’re too stacked up.

The Hobart cycle takes two minutes. If the racks aren’t packed well, then the queue of racks going into the Hobart gets stretched out, making the Hobart another potential bottleneck. Yet the Hobart runs for only two minutes, not enough time to thoroughly scour dirty dishes. Hence we pre-rinse so that the dishes don’t come out of the Hobart needing another run through it (or maybe two). However, more thorough rinsing may require more time and more hot water. And sometimes a very quick scrub works better than a rinse, saving water—but scrubbing can be slower than rinsing.

Working consciously helped me to recognize these constraints; it also helped me consider and test various ways to manage them. As one example, I tried varying the orientation of the dishes in the racks; I saw that an edge-on angle let me rinse them more effectively. In the same vein, I sought other ways to minimize how much water I used when rinsing, and to recognize when the better tool was the scrub sponge and not the sprayer.

Through working consciously, I turned what could have been dull work into a win for all: Our diners had clean dishes; the task was done more efficiently; and I gained the satisfaction of seeing my work improving (along with my Tetris game). Plus, I turned a dishwashing gig into a blog post.

Even a dish room can offer interesting challenges, if you remember to look for them.

Rack. Rinse. Wash. Stack.

Filling The Cracks

The latest posted version of SyntaxHighlighter is here. For the change history, or to download older versions from this site, see the downloads page.

Mind The Gap!

SyntaxHighlighter can display line numbers on the left side of the code block, in an area called the gutter. As a result of other changes that I’d made, the gutter sometimes showed gaps between the line number elements. I added some JavaScript to resize the number elements so that they’d be contiguous, closing any gaps. However, that code needs the offsetHeight for each line in the code block. As I then discovered, when JavaScript so much as reads the offsetHeight of an element, it’s likely that the browser will need to reflow the page to calculate the value. Repeat that for enough lines, and you’ve got a bit of a performance issue (or I do, depending on how you look at it.)
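
To make the problem concrete, the gap-closing code was doing something along these lines (a hypothetical reconstruction, not the actual SyntaxHighlighter source; matchGutterHeights and the two element lists are stand-ins):

    // Hypothetical sketch: force each gutter number to the height of its code
    // line. Each offsetHeight read can make the browser recalculate layout,
    // and interleaving those reads with style writes repeats that work per line.
    function matchGutterHeights(codeLines, numberElements) {
        for (var i = 0; i < codeLines.length; i++) {
            var h = codeLines[i].offsetHeight;          // layout read: possible reflow
            numberElements[i].style.height = h + "px";  // layout write
        }
    }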

At one point, I’d thought that I could bypass the height adjustment for the most common cases (e.g., text lines that are short enough not to wrap.) But life and HTML come with few guarantees; testing showed that some browsers, in some environments, would still produce some 1-pixel gaps like so:

Gutter, on the left, showing unwanted white lines. (The text on the right is blurred; do not adjust your eyes.)

(The gaps appeared below lines that had keywords highlighted in boldface fonts. Depending on the fonts that the browser is using, the boldface can sometimes add a pixel to the height of the text line.)

Sometimes the best way to solve a problem is to make it no longer be a problem. I’d caused the gaps to appear by changing how backgrounds are rendered. In the original SyntaxHighlighter, the number elements are transparent; the gutter’s background is the background of the top-level <div> for the entire code block. There may be gaps between the number elements, but those gaps can’t be seen. I couldn’t easily go back to that approach, but thinking about it inspired a new one: By stretching the first gutter element vertically, it could be used to provide the background for the gutter. The remaining gutter elements would then be laid out over that first one. Here’s an illustration:

The first number element fills the gutter vertically; the remaining elements are placed on top of it, leaving no visible gaps.

The gaps are invisible once more. The associated JavaScript still causes some reflow, but no more than five times per code block, a much smaller overhead in most cases.
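
In code, the idea might be wired up something like this (a rough sketch of the approach, not the code that shipped; fillGutterBackground and the DOM structure it assumes are stand-ins):

    // Hypothetical sketch: let the first line-number element paint the gutter
    // background for the whole block, with the other numbers drawn above it.
    function fillGutterBackground(gutter) {
        var numbers = gutter.getElementsByTagName("div");
        if (numbers.length < 2) return;

        var first = numbers[0];
        var firstHeight = first.offsetHeight;   // one layout read, done up front

        gutter.style.position = "relative";
        first.style.position = "absolute";      // stretch the first element...
        first.style.top = "0";
        first.style.bottom = "0";               // ...from the top of the gutter
        first.style.left = "0";                 //    to the bottom
        first.style.right = "0";

        // The first element left normal flow, so push the second one down by
        // one line height; then paint the rest above the stretched background.
        numbers[1].style.marginTop = firstHeight + "px";
        for (var i = 1; i < numbers.length; i++) {
            numbers[i].style.position = "relative";
        }
    }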

Dumping Hack Flash, It’s a Class, Class, Class

The latest posted version of SyntaxHighlighter is here. For the change history, or to download older versions from this site, see the downloads page.

Well, that was… interesting.

I’ve been working on improvements in SyntaxHighlighter’s whitespace handling. My goal was to have whitespace presented exactly as it’s supplied within the original HTML. I thought this would be helpful on two fronts: First, it might fix a few minor display formatting issues that I’d found. Second, and of greater concern to me, I figured that if SyntaxHighlighter preserved whitespace well enough, then it would be OK to remove the Flash-based “Copy” applet from the toolbar. (I’m not sure, but my guess is that the Flash app was added because copying the code to the clipboard has been so problematic.)

I’ll admit that its removal is a mixed blessing: It’s not like it’s a bad thing, to be able to copy the code to the clipboard with one click of the mouse. But the Flash applet carries some baggage, too: It requires Flash (well, yeah); it adds to the overall footprint consumed by SyntaxHighlighter (especially when using Internet Explorer); and it doesn’t always work.

I don’t have a lot of details on that last one; then again, I haven’t sought details. But the problem could simply be that Flash 10 added new restrictions on the application’s access to the Clipboard, thereby breaking it for SyntaxHighlighter.

Maybe, maybe not. But if the traditional select-and-copy operation worked as expected, without loss of newlines or other whitespace, would there really be any further need to keep the Flash script? I didn’t think so either.

I think I can hear some of you snickering. No, I hadn’t ever gone down this road before. So, I’ve discovered that perfect whitespace preservation in HTML is not easy. It may not even be possible, even setting tab characters aside.

I’d hoped to come up with a broadly cross-browser-compatible scheme for preserving whitespace in the display and in the paste buffer. What I have is a combination of JavaScript and CSS classes that fake it pretty well (I think). The key, of course, lies in using the CSS white-space: pre and white-space: pre-wrap properties in the appropriate classes. JavaScript is needed, among other things, to ensure that line breaks come through correctly. And yes, I did have to break into browser-specific code at times.
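
As a toy illustration of the general idea (not SyntaxHighlighter’s actual code; makeLineCopyable and its arguments are made up here):

    // Toy sketch: render one line of code so that its spaces survive on screen
    // and a select-and-copy picks up a real line break at the end.
    function makeLineCopyable(lineElement, lineText) {
        var text = lineText.replace(/\t/g, "    ");   // tabs become spaces
        lineElement.style.whiteSpace = "pre-wrap";    // keep space runs, allow wrapping
        lineElement.appendChild(document.createTextNode(text + "\n"));
    }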

Surprising things can happen to space characters ’twixt the HTTP stack and the paste buffer. Throw in newlines and tabs (On second thought: no, don’t throw in tabs; throw them out), and life gets very interesting indeed.

While the resulting release doesn’t quite fit the criteria for being WYSIWYG, I’d say it does at least satisfy YCAGWYWBIYTSYJMFYGWYN. Tab characters are converted to spaces, and there are often one or two spaces added at the ends of lines. I’m guessing that this will be close enough for most uses. For those cases where exactness is required, there’s still the toolbar command for viewing the original source as plain text. The popped-up text preserves the original whitespace, and a “copy” works just fine there. Putting this all together, it seemed to me that the Flash app could now be dropped, and so that’s what I’ve done.

I’ve also added some tentative support for titles. Unfortunately, I can’t show an example in this post, because the WordPress Plugin doesn’t support the option yet; but you can see an example here. (I’d thought about adding this previously, then discarded it as creeping featurism, and then re-considered it when I realized that it might be of use for the visually disabled.)

And there are a lot of changes under the hood; I’ll write about these in separate posts.