Forget that whole P=NP? nonsense. The real greatest open problem in computing science is paging. That's right: good old record paging directly in the database as well as at higher levels of abstraction.
A Trie implementation in Javascript
| 16 December 2011A trie (usually pronounced like "tree") is a data structure that has some very interesting properties. Normally, and in the implementation below, you'd use a trie in much the same way that you'd use a hash table: to store a value (any type of object) that can later be looked up by its key (a string). For example:
var t = new Trie(); //create a trie
t.add("cole", 1); //add a key/value pair
var x = t.find("cole"); //x == 1
The usage pattern above should look very familiar if you've ever used a hash table. Before we get to the peculiarities of a trie, let's look at how it works.
Beware the StringBuilder's Default Capacity!
| 20 October 2011The StringBuilder class is a wonderful tool to improve performance when doing repetitive string concatenations. However, there is one small detail that seemingly everyone overlooks when recommending StringBuilder. In fact, I've just been burnt by this: the default capacity of a new StringBuilder instance is 16 bytes!
That is ridiculously small when one is planning to concatenate strings many times over. If you happen to be creating large CSV files for users to download - like I was - then you may encounter an OutOfMemoryException when using the StringBuilder.Append() method. This happens because the string builder periodically increases the size of its buffer to hold more data, and eventually it cannot get a large enough contiguous block of free memory.
However, if you know that you'll need a large buffer, and set the capacity of the StringBuilder using one of the constructor overloads, you can avoid wasting a bunch of memory on repeated buffer re-allocation during the early lifetime of your StringBuilder instance.
With all the people out there who recommend using StringBuilder, why have none of them (myself included, until now) actually read the docs and mentioned this when making their recommendation?
Mimicking Erlang/OTP in C#
| 04 October 2011Concurrent and parallel programming (for discussion purposes, I'm going to stick with "concurrent") are really hard in C# (or any of the other C-flavoured languages). Anyone who disputes this is either:
- lying, or
- blissfully unaware of the alternatives
This is one area where the C crowd really got things wrong decades ago and we've been suffering with all of their horrible threads, locks, and semaphores ever since. Fortunately, the guys who designed Erlang and OTP realized this (other fringe groups figured it out too, but none have really gained much traction) long ago, and chose the much more sensible Actor Model for concurrency in their system. The actor model is brilliant, and on the rare occasions when I actually get to build something in Erlang, it still blows me away just how easy it is to do concurrent and parallel computations. Unfortunately, most of my work is in C#, in which building concurrent software feels, I imagine, very much like a root canal. Except that the pain goes on for months at a time.
Speeding up a query with a WHERE NOT EXISTS clause
| 21 July 2011If you have a query with a NOT EXISTS ( SELECT ... ) predicate in the WHERE clause, consider refactoring it to use a LEFT OUTER JOIN and a predicate to check for a null column value.
Today I was able to reduce the cost of a query by an order of magnitude by refactoring a query like this:
SELECT ... FROM t1 ... WHERE NOT EXISTS ( SELECT t1_fk FROM t2 WHERE t2.t1_fk = t1.pk UNION SELECT t1_fk FROM t3 WHERE t3.t1_fk = t1.pk )
To something like this:
SELECT ... FROM t1 LEFT OUTER JOIN t2 ON t2.t1_fk = t1.pk LEFT OUTER JOIN t3 ON t3.t1_fk = t1.pk WHERE t2.t1_fk IS NULL AND t3.t1_fk IS NULL
If you find yourself thinking "how can I speed up this query if it's got a WHERE NOT EXISTS in it?" consider spending a few minutes to see if the LEFT JOIN approach improves the query plan.