Orange is my favorite color

A number of people have identified ColdFusion’s string handling (based on immutable Java strings) as somewhat slow.

I build large strings for use in custom reports for my app, and I wanted to know how CFSAVECONTENT stacks up against other string handling in CF. Here are three ways (ordered by speed, slow to fast) provided by Harel Malka (which were actually created by Niklas Richardson):

<cfscript>
s = "I am a bad string ";
for (i=1; i lte 10000; i=i+1) {
s = s & "from a bad family";
}
</cfscript>

<cfscript>
s = createObject("java", "java.lang.StringBuffer");
for (i=1; i lte 10000; i=i+1) {
s.append("I am a good string from a fast family");
}
</cfscript>

<cfscript>
s = arrayNew(1);
for (i=1; i lte 10000; i=i+1) {
arrayAppend(s, "I am faster than a speeding mullet");
}
newString = arrayToList(s, "");
</cfscript>
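
The first version is presumably slow because Java strings are immutable, so every `&` has to build a brand new string that copies everything accumulated so far. The StringBuffer version above also never converts the buffer back into an ordinary string. A minimal sketch of that last step, assuming a JVM recent enough (Java 5 or later) to provide `java.lang.StringBuilder`, the unsynchronized sibling of StringBuffer; this class is not part of the original tests:

```cfml
<cfscript>
// Assumption: java.lang.StringBuilder is available (Java 5 or later).
sb = createObject("java", "java.lang.StringBuilder").init();
for (i = 1; i lte 10000; i = i + 1) {
    // Appends into the same growing buffer instead of creating a new string each time.
    sb.append("I am a good string from a fast family");
}
// Convert the buffer into an ordinary string once, after the loop.
newString = sb.toString();
</cfscript>
```

Whether this is measurably faster than StringBuffer in a single-threaded page request is something to measure rather than assume.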

Many of the links above are focused on building CSV files, which won’t tolerate additional whitespace, but I think there may be cases where people concatenate strings when CFSAVECONTENT would work just fine. HTML output is certainly one of them, and I wanted to know if switching to one of the above techniques would remove a bottleneck in my app. I tried adding this to the above tests:

<cfsavecontent variable="s">
I am a bad string
<cfloop from="1" to="10000" index="i"> from a bad family</cfloop>
</cfsavecontent>
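
Written this way, the block also captures the line breaks between the tags, which is the whitespace concern mentioned above. A small sketch to see the difference, using a shortened loop of 3 iterations and the hypothetical variable names `padded` and `tight`:

```cfml
<!--- Same layout as the test above: line breaks inside the tag are captured. --->
<cfsavecontent variable="padded">
I am a bad string
<cfloop from="1" to="3" index="i"> from a bad family</cfloop>
</cfsavecontent>

<!--- Everything on one line: only the intended characters are captured. --->
<cfsavecontent variable="tight">I am a bad string<cfloop from="1" to="3" index="i"> from a bad family</cfloop></cfsavecontent>

<cfoutput>
padded: #len(padded)# characters<br>
tight: #len(tight)# characters<br>
trimmed: #len(trim(padded))# characters
</cfoutput>
```

Note that `trim()` only removes the leading and trailing whitespace; the line break between the first line and the loop output stays in the string.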

Here are the results of the first run, in milliseconds. Note that the absolute numbers are meaningless because they are particular to my laptop. What we are looking for is the relative difference between these methods:

string & time: 5401
StringBuffer time: 512
ArrayToList time: 60
CFSaveContent time: 231

On the second run:

string & time: 4367
StringBuffer time: 131
ArrayToList time: 30
CFSaveContent time: 20

After the first run (which likely includes compilation overhead), the numbers pretty much stay like this, with CFSAVECONTENT as fast as the arrayToList approach. The numbers vary by about 10ms in either direction, so my title may be slightly misleading, but the last two approaches are essentially equal. My guess is that behind the scenes CF is using a StringBuffer or a similar approach to grow the string.
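
The timing code itself is not shown in this post. A minimal sketch of how such numbers could be collected, assuming `getTickCount()` (which returns milliseconds) as the clock; the variable names `iterations`, `startTick`, `concatTime` and `arrayTime` are made up for illustration:

```cfml
<cfscript>
iterations = 10000;

// Time plain string concatenation with the & operator.
startTick = getTickCount();
s = "";
for (i = 1; i lte iterations; i = i + 1) {
    s = s & "from a bad family";
}
concatTime = getTickCount() - startTick;

// Time the arrayAppend + arrayToList approach on the same data.
startTick = getTickCount();
parts = arrayNew(1);
for (i = 1; i lte iterations; i = i + 1) {
    arrayAppend(parts, "from a bad family");
}
joined = arrayToList(parts, "");
arrayTime = getTickCount() - startTick;

writeOutput("string & time: " & concatTime & "<br>");
writeOutput("ArrayToList time: " & arrayTime & "<br>");
</cfscript>
```

Running the page several times and ignoring the first request keeps template compilation out of the numbers.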

This is good news and bad news for me… because that means CFDOCUMENT is the source of my bottleneck and that’s more difficult to fix!

*Update*
Just after publishing this, it occurred to me that because the page was static, the additional runs might not be very representative of the kind of work that would actually happen in these cases. So I changed the tests to create a random string at the beginning of the page:

<cfset string = "this is some randomized #randrange(1,100)# content">

And then use that string in place of the others, e.g.:

s = s & string;
s.append(string);
arrayAppend(s, string);
<cfloop from="1" to="10000" index="i">#string#</cfloop>

The CFSAVECONTENT example was wrapped in CFOUTPUT. The results are now more interesting for the first and subsequent runs (listed in the same order as the tests above):

time: 20179
time: 160
time: 31
time: 90

time: 21143
time: 101
time: 20
time: 20

The string concatenation method really starts to fall down as the data being concatenated changes. The others, however, stay pretty consistent.
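
The update describes the CFOUTPUT-wrapped CFSAVECONTENT variant without showing it in full. A minimal sketch of what it might look like, assuming the loop is kept on a single line so that no extra whitespace is captured:

```cfml
<!--- Randomized content, created once at the top of the page. --->
<cfset string = "this is some randomized #randRange(1,100)# content">

<!--- CFOUTPUT is needed so that #string# is evaluated inside the block. --->
<cfoutput>
<cfsavecontent variable="s"><cfloop from="1" to="10000" index="i">#string#</cfloop></cfsavecontent>
</cfoutput>

<cfoutput>Built #len(s)# characters</cfoutput>
```

Without the CFOUTPUT wrapper, the literal text `#string#` would be captured instead of the variable's value.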

4 Comments

  1. Charlie Arehart said:

    on August 24, 2006 at 10:05 am

    Brian, you may have heard this comment from me before (on the Guru list, for instance) but as I have also elaborated in a blog entry:

    http://bluedragon.blog-city.com/fallacy_of_loop_testing.htm

    I just want to urge caution in concluding too much from the value of a change when tested by doing loop testing. As I say in the blog (and elaborate in the comments in response to some who still weren’t convinced), my bottom line point of contention is that if a tweak doesn’t really lead to a huge benefit in a single request, then it may be that if it’s not REALLY used in a loop in a single request, that the loop test conclusions won’t really correlate to an overall improvement in processing over a load test (or real world processing).

    I know it sounds counter-intuitive, and it goes against conventional wisdom (such loop tests are frequently used). I’m just putting out a call to give them serious thought before concluding whether a given technique is REALLY worth changing code to use it. Sure, if you create new code (and the new technique isn’t too obtuse), then surely a penny saved is a penny earned.

    I’m just trying to avoid people making conclusions that would lead folks to feel that they HAVE to start changing code just to implement the tweak–or worse, lead folks to ALWAYS decry the lesser approach. No offense intended, of course. Just sharing the observation.

  2. Harel Malka said:

    on August 26, 2006 at 2:44 pm

    CFSaveContent is good to collect a large chunk of string in one go to be placed in a variable. But if you’re inside a loop and need to append a great deal of strings to each other, the arrayAppend method is still your best bet in my opinion.

    First, you cannot use cfsavecontent in a cfscript block. And these blocks DO make a difference both in readability and performance (they are faster for some strange reason).
    Second, you must use cfoutput, and cfoutput causes a great deal of action in the underlying java objects created from your cfm file and generates overhead. In fact, there’s a world of difference between what happens when you use cfoutput and writeoutput().

    One important note to pay attention to: When you are using cfsavecontent and cfoutput, your result string will include all whitespace, line breaks, carriage returns and tabs that exist inside the cfsavecontent. Over a large loop that means that your string will be much larger in size than if you used a more cost effective method like arrayAppend. That leads to larger memory usage. Might sound picky, but whitespace and other control characters ARE characters and they do have purpose and take space.

    CFSaveContent is indeed fast, and the time differences you got in your tests (10-20 ms each way) are negligible, as there are many factors that these numbers depend on besides just the code. Both methods are probably just as effective as far as processing times go. It comes down to the context in which you need each solution.

    And last – you can always ignore first run of a modified cfm/cfc file. There is always compilation being done behind the scenes and that has no bearing on the real performance of the script.

    Harel

  3. brian said:

    on August 28, 2006 at 6:27 pm

    Charlie: as we discussed offline, my usage scenario actually is in a loop, so the logic applies despite loop-based testing not really being an ideal test case.

    Harel: You make some good points, a few of which I acknowledge in my post, and I agree with your general principle which is the arrayAppend() method is the fastest “raw” string builder. But the premise I wanted to test was whether or not cfsavecontent was hurting my report generating and it turns out the answer is no. Thanks for replying, your original post was very helpful!

  4. Darryl Lyons said:

    on November 17, 2006 at 2:54 am

    I would urge caution using any of the techniques provided if the end result is indeed going to be a CSV file. I have found using a Java buffered writer (to file) to be a much more scalable solution than using string concatenation (within a loop). You’re essentially writing directly to the file instead of keeping a string in memory and then writing it to a file.
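
    A minimal sketch of the buffered-writer idea, assuming `java.io.FileWriter` and `java.io.BufferedWriter` are the classes meant here, an engine whose cfscript supports `finally`, and a temporary file as a hypothetical output location; the row content is made up for illustration:

    ```cfml
    <cfscript>
    // Hypothetical output location: a unique temp file created by the engine.
    path = getTempFile(getTempDirectory(), "csv");
    fileWriter = createObject("java", "java.io.FileWriter").init(path);
    writer = createObject("java", "java.io.BufferedWriter").init(fileWriter);
    try {
        for (i = 1; i lte 10000; i = i + 1) {
            // Each row goes to the writer's buffer, not into a growing in-memory string.
            writer.write(javaCast("string", "row " & i & ",some value"));
            writer.newLine();
        }
    } finally {
        // Closing flushes the remaining buffered rows to disk.
        writer.close();
    }
    </cfscript>
    ```

    Memory use stays roughly constant regardless of how many rows are written, since only the writer's internal buffer is held in memory at any time.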
