Saturday, October 25, 2008

Why isn't that function just there?... And how to fix that: become a language designer

Have you ever dug for a function you thought should just be there, only to find that it isn't? That leaves you creating your own class that inherits from the one that should have had the function in the first place. If you run across something like this in the middle of a large project, in .Net 2.0, your options are limited:
  • Create a custom class that inherits from the class you want to extend and add your function to it. This forces you to update every declaration of that type.
  • Create a shared function in a common static class. This can make for some ugly code that doesn't look very object oriented.
I was talking to some people a while ago about this very situation and brought up extension methods. They responded with "what's that?", apparently never having heard of them. If they didn't know about them, I figure there are a lot of people out there who might want to be enlightened, hence this post.
Extension methods, introduced with C# 3.0 (which shipped with .Net 3.5), add a third option to the list. They let you attach a function to every existing object of a type. This enables anyone to become a language designer and fix all of the holes Microsoft left just to annoy you. Let's take a look at a couple of examples.

Suppose a legacy system requires a CSV input and the .Net application contains a list of strings. The .Net 2.0 solution might look something like this:
using System.Collections.Generic;
using System.Text;

public static class Common
{
  public static string ToCSV(List<string> list, char delimiter)
  {
    StringBuilder sb = new StringBuilder();
    for (int idx = 0; idx < list.Count; idx++)
    {
      // Write the delimiter between items, not before the first one
      if (idx > 0) { sb.Append(delimiter); }
      sb.Append(list[idx]);
    }
    return sb.ToString();
  }
}
...
// Then run
...
string csv = Common.ToCSV(<nameoflist>, ',');
This is all well and good; the code creates the expected CSV. But programming in this fashion doesn't look very object oriented, which is the goal of the language. If there are many references to this function, wrapping the list class and extending it might be a lot of work and could be overkill. Using an extension method produces something like this:
public static class Extensions
{
 public static string ToCSV(this List<string> list, char delimiter)
 {
     StringBuilder sb = new StringBuilder();
     for (int idx = 0; idx < list.Count; idx++)
     {
         // Write the delimiter between items, not before the first one
         if (idx > 0) { sb.Append(delimiter); }
         sb.Append(list[idx]);
     }
     return sb.ToString();
 }
 public static void FromCSV(this List<string> list,
                             string csv, char delimiter)
 {
     list.AddRange(csv.Split(new char[] { delimiter }));
 }
}

class Program
{
 static void Main(string[] args)
 {
     List<string> strings = new List<string>();
     strings.Add("Bob");
     strings.Add("Jane");
     strings.Add("Jack");
     strings.Add("Mike");
     strings.Add("Mary");
     // Call the Extension Method
     string csv = strings.ToCSV(',');
     Console.WriteLine(csv);

     // Let's verify that the opposite way works
     List<string> newlist = new List<string>();
     // Call the Extension Method
     newlist.FromCSV(csv, ',');
     for (int idx = 0; idx < newlist.Count; idx++)
     {
         Console.WriteLine(newlist[idx]);
     }
 }
}
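It may help to see that an extension method call is only compiler shorthand for an ordinary static call. The following is a minimal sketch, assuming a cut-down copy of ToCSV; the class names CsvSketch and CallFormsDemo and the sample data are made up for illustration. Both call forms produce the same string.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class CsvSketch
{
    // Hypothetical cut-down copy of ToCSV, for illustration only
    public static string ToCSV(this List<string> list, char delimiter)
    {
        StringBuilder sb = new StringBuilder();
        for (int idx = 0; idx < list.Count; idx++)
        {
            // Delimiter goes between items, not before the first one
            if (idx > 0) { sb.Append(delimiter); }
            sb.Append(list[idx]);
        }
        return sb.ToString();
    }
}

public class CallFormsDemo
{
    public static void Main()
    {
        List<string> names = new List<string> { "Bob", "Jane", "Jack" };

        // Extension syntax: reads like an instance method
        string viaExtension = names.ToCSV(',');

        // Plain static syntax: the call the compiler actually emits
        string viaStatic = CsvSketch.ToCSV(names, ',');

        Console.WriteLine(viaExtension);              // Bob,Jane,Jack
        Console.WriteLine(viaExtension == viaStatic); // True
    }
}
```

Because the two forms are interchangeable, existing callers of a static helper keep working if you later add the this modifier to its first parameter.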
In my previous post, Manycores and the Future, I used an extension method on the Stopwatch class to better handle timing test code.

Extension methods are a powerful tool for creating utility libraries and adding common functionality to built-in or otherwise inaccessible types. As the adage goes, with great power comes great responsibility, and this tool is no exception. These methods can make the code harder to maintain for new maintainers. A developer can change the inner workings or the output of the method, resulting in a debugging nightmare. Locating the problem becomes even more difficult if someone assumes that the method is built-in and the error is not explicitly in the method or at the call.

It would be advisable to use extension methods only for generic cases where the method would be used in many different projects.
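One concrete way an extension method can surprise someone who assumes it is built-in: the receiver is just the first argument of a static call, so it is allowed to be null. The sketch below uses a hypothetical IsBlank helper (the names StringSketch, IsBlank and NullReceiverDemo are mine, not from the framework) to contrast that with a real instance method.

```csharp
using System;

public static class StringSketch
{
    // Hypothetical helper, not part of the framework
    public static bool IsBlank(this string text)
    {
        // 'text' is an ordinary parameter, so it may be null here
        return text == null || text.Trim().Length == 0;
    }
}

public class NullReceiverDemo
{
    public static void Main()
    {
        string missing = null;

        // No NullReferenceException: this compiles to StringSketch.IsBlank(missing)
        Console.WriteLine(missing.IsBlank()); // True

        try
        {
            // A real instance method on the same null reference does throw
            Console.WriteLine(missing.Trim());
        }
        catch (NullReferenceException)
        {
            Console.WriteLine("instance call threw");
        }
    }
}
```

The two calls look identical at the call site, which is exactly why a maintainer can be misled about where an error originates.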

Now, let's take a section of the previous post's code and extend it to use a new "Quad" extension method. I'll discuss it further in a moment.

public static class Extensions
{
 /// <summary>
 /// Stopwatch Extension Method:
 /// Starts, executes function over iterations
 /// Returns the time span.
 /// </summary>
 public static TimeSpan Time(this Stopwatch sw,
                             Action func,
                             int times)
 {
     sw.Reset();
     sw.Start();
     for (int idx = 0; idx < times; idx++) { func(); }
     sw.Stop();
     return sw.Elapsed;
 }

 public static double[] Quad(this double[] p)
 {
     double[] x = new double[2];
     double a = p[0], b = p[1], c = p[2];
     // Quadratic formula: the discriminant is b^2 - 4ac
     // (Math.Sqrt returns NaN when it is negative)
     double root = Math.Sqrt(b * b - 4 * a * c);
     x[0] = (-b + root) / (2 * a);
     x[1] = (-b - root) / (2 * a);

     return x;
 }
}

class Program
{
 static void Main(string[] args)
 {
     Stopwatch stopwatch = new Stopwatch();
     int max = 1000000;
     double[][] numbersP = new double[max][];
     double[][] numbers = new double[max][];
  
     // System.Random is not thread-safe, so fill the input sequentially
     Random rand = new System.Random(1042);
     for (int x = 0; x < max; x++)
     {
         double[] item = new double[3];
         item[0] = rand.Next(500);
         item[1] = rand.Next(500);
         item[2] = rand.Next(500);
         numbers[x] = item;
     }

     TimeSpan ts = stopwatch.Time(() =>
     {
         Parallel.For(0, max, x =>
         {
             // Make the call to the extension method
             numbersP[x] = numbers[x].Quad();
         });
     }, 1);
     Console.WriteLine(
     String.Format("Parallel.For RunTime: {0:00}:{1:00}:{2:00}.{3:00}", ts.Hours, ts.Minutes, ts.Seconds, ts.Milliseconds/10));

     Console.Read();
 }
}
Basically, I just moved the Quad function to the static class (as in the first example) and changed Quad(numbers[x]) to numbers[x].Quad(). The bonus here is less typing, code that looks more object oriented, and a "Quad" function that appears in IntelliSense, making the functionality easier to find. Let's assume for a moment that Quad is a very useful function that we would want to use over and over. We must ask ourselves whether we want this function offered to every developer who creates a variable of type double[]. There are many problems with attaching methods to arrays like this. If someone passes in an array with more than 3 elements, the extra elements are silently ignored; the method's use is likely not what the caller thinks it is, and the caller won't know there is a problem until farther down the execution path. Another problem arises when an array with fewer than 3 elements is passed in: the indexing fails with an IndexOutOfRangeException. Both problems only show up at run time, and compile-time errors are always better than run-time errors.

Using Quad in this fashion is definitely not a good reason to use extension methods. In this case Quad does not really extend the type double[], so one of the other options listed above, such as adding the function to a common math class, is definitely better. This distinction will be the hard part for new developers to understand, and it adds to my reasoning for using extension methods sparingly and only after some diligence.
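As a rough sketch of the "common math class" alternative, the version below takes the three coefficients as separate parameters, so passing too few or too many values becomes a compile-time error instead of a run-time surprise. The names MathSketch and SolveQuadratic, and the choice to return an empty array when there are no real roots, are my own assumptions for illustration.

```csharp
using System;

public static class MathSketch
{
    // Hypothetical helper: real roots of a*x^2 + b*x + c = 0
    public static double[] SolveQuadratic(double a, double b, double c)
    {
        if (a == 0)
        {
            throw new ArgumentException("a must be non-zero", "a");
        }

        double discriminant = b * b - 4 * a * c;
        if (discriminant < 0)
        {
            // No real roots; assumption: signal this with an empty array
            return new double[0];
        }

        double root = Math.Sqrt(discriminant);
        return new double[] { (-b + root) / (2 * a), (-b - root) / (2 * a) };
    }
}

public class QuadraticDemo
{
    public static void Main()
    {
        // x^2 - 3x + 2 = 0 factors as (x - 1)(x - 2)
        double[] roots = MathSketch.SolveQuadratic(1, -3, 2);
        Console.WriteLine(roots[0] + ", " + roots[1]); // 2, 1

        // MathSketch.SolveQuadratic(1, -3) would not compile:
        // the missing coefficient is caught before the program runs
    }
}
```

Nothing here shows up in IntelliSense on double[], which is the point: only code that actually wants a quadratic solver has to know it exists.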

This will be an important tool for experienced programmers. It also has the makings of many sleepless nights for me, wondering when I am going to misuse extension methods like in the example above.
