Please note, new blog at http://www.acheron.org/darryl/

Using Java BufferedWriter to create CSV files

Someone recently asked me to expand upon my previous post about generating CSV files, and in particular, to focus on the Java method. In my previous post, I concluded that using the Java BufferedWriter class is a much more scalable and stable solution than using the typical concatenate-and-write-once-done method.

Step by Step

There really isn't much to using Java to write to a file. Essentially, there are three steps, 1) create the Java class instances, 2) get the data and write to the buffer, and 3) close the output stream.

Firstly, we instantiate the FileWriter class (which writes to the file) and then the BufferedWriter class, passing the FileWriter instance as a parameter.

CFM:
  1. <cfscript>
  2. outputFile = getDirectoryFromPath(GetCurrentTemplatePath()) & "dump.csv";
  3. oFileWriter = CreateObject("java","java.io.FileWriter").init(outputFile,JavaCast("boolean","true"));
  4. oBufferedWriter = CreateObject("java","java.io.BufferedWriter").init(oFileWriter);
  5. </cfscript>

Then, we create the column headers, and loop over the data (query in this case) and write it to the buffer. The Java class manages the size of the buffer internally, so when threshold is reached, it will write to the file.

CFM:
  1. <cfset oBufferedWriter.write("LASTNAME,FIRSTNAME" & chr(13) & chr(10))>
  2. <cfloop query="qData">
  3.    <cfset oBufferedWriter.write(chr(34) & lastname & chr(34) & ",")>
  4.    <cfset oBufferedWriter.write(chr(34) & firstname & chr(34) & chr(13) & chr(10))>
  5. </cfloop>

Lastly, we need to close the output stream, otherwise the file will not be released.

CFM:
  1. <cfset oBufferedWriter.close()>

Full code

CFM:
  1. <cfscript>
  2.    outputFile = getDirectoryFromPath(GetCurrentTemplatePath()) & "dump.csv";
  3.    oFileWriter = CreateObject("java","java.io.FileWriter").init(outputFile,JavaCast("boolean","true"));
  4.    oBufferedWriter = CreateObject("java","java.io.BufferedWriter").init(oFileWriter);
  5. </cfscript>
  6.  
  7. <cfquery datasource="cfartgallery" name="qData">
  8. SELECT *
  9. FROM artists
  10. ORDER BY lastname, firstname
  11. </cfquery>
  12.  
  13. <cftimer type="inline" label="Generate CSV">
  14.    <cfset oBufferedWriter.write("LASTNAME,FIRSTNAME" & chr(13) & chr(10))>
  15.  
  16.    <cfloop query="qData">
  17.  
  18.       <cfset oBufferedWriter.write(chr(34) & lastname & chr(34) & ",")>
  19.       <cfset oBufferedWriter.write(chr(34) & firstname & chr(34) & chr(13) & chr(10))>
  20.  
  21.    </cfloop>
  22.    <cfset oBufferedWriter.close()>
  23. </cftimer>

AJAX Diary: Race conditions and using cflock

Within traditional web applications, it is highly unlikely that you will come across a scenario where a race condition will ocurr in the current session, unless you are accessing shared scopes. However, within an AJAX application, which is more like a frame-based Web site (remember them?), they can occur only too often.

Background

In our CRM application, there is a method that returns a "client ID". This method checks to see if a record exists in the database, and if not, creates it, and then returns the ID. If it already exists, then it simply returns the record. This record is in a tracking table that we use to reference records in an external system. All tables in our schema join to the tracking table, not the external database. Suffice to say, this method is called in a lot of places where client information is accessed.

<cffunction name="getClientID" returntype="string">
<cfset var local = StructNew()>
<cfset local.qClient = getClientID(arguments.account)>

<cfif NOT local.qClient.recordcount>
// Some logic to create a new tracking ID -- insert into db, etc.
</cfif>

<cfreturn local.qClient.ID>
</cffunction>

Now, you would probably state that there should be a scoped or named lock around this code. However, the likelihood of two sessions trying to create this record at the same time is very remote. Furthermore, most locking in the past has been where you are making changes to shared scopes such as SESSION, APPLICATION or SERVER.

Problem

Our new AJAX interface to our CRM has shown us that we need to be more careful, and use locking more judiciously. The scenario is this. The user clicks on a client record in a search result. This in turn loads the client information card. Three seperate, almost simultaneous, requests are sent to the server to retrieve different pieces of data. Each of the requests just happens to call the aforementioned method at some point. Even though the requests themselves are not occurring at the same time, a race condition is still created...

All three requests are now essentially calling the method, let's call it getClient(), at the same time. The database lookup to determine whether or not the record exists are being performed within a millisecond of each other. All three lookups return no records, so the conditional block of code is executed to insert a new record! So, the end result is three records in the database where there should have only be one!

Solution

Now, we pretty much had two options -- change the calls on the client so that the record check is performed once, or put a named lock around the code, so that we force requests to queue for execution.

We went for the later. Changing all of the calls to the method would require significant refactoring of the model, so that just wasn't an option.

<cffunction name="getClient" returntype="string">
<cfset var local = StructNew()>
<cflock name="getClient.getClientID" timeout="10">
<cfset local.qClient = getClientID(arguments.account)>

<cfif NOT local.qClient.recordcount>

<cflock name="getClient.getClientID.insert" timeout="10">
<cfset local.qClient = getClientID(arguments.account)>
<cfif NOT local.qClient.recordcount>
// Some logic to create a new tracking ID -- insert into db, etc.
</cfif>
</cflock>

</cfif>
</cflock>

<cfreturn local.qClient.ID>
</cffunction>

In the example above, I've put a named lock around the query to the database, and the check of the recordcount. If the recordcount is zero, then we perform another lock, and then do the check all over again. The reason for this is that in the time between acquiring the lock and performing the first check, a database record could have been inserted. The double lock technique just minimises the chance of this happening.

So, why was this only an issue now? Well, the old "traditional" user interface was a HTML page. The processing of the single request was top-down. By that I mean that if various elements of the user interface happened to call the getClient() method, they would be synchronously executed, thereby avoiding the scenario I have described.

Some other resources

Yahoo! User Interface Library

Yahoo! have released a DHTML UI library for public consumption. Some pretty cool things in there like a calendar control, drag and drop, event management and DOM tools.

Google acquires Measure Map

The blog statistics product I have been using since December 2005 has been acquired by Google. Measure Map, still in alpha, is a very easy to setup and use system, and I have been more than impressed by it over the last couple of months. Jeffery Veen, from the Measure Map team and Adaptive Path, had this to say on the Google Blog.
Bringing Measure Map to Google is an exciting validation of the user experience work I've been doing with my partners at Adaptive Path for years. By opening up the app to more bloggers through Google, we hope to help even more people become passionate about their blogs.

RSS feeds can be funny - Goodbye Tim

Quite a few blogs have been covering the fact that Tim Buntel is leaving Adobe. I was looking at my MXNA ColdFusion Live Bookmark tonight and I swear I saw these two items in the feed.