Sunday, October 12, 2008

Manycores and the Future

Back in June, Microsoft released a preview of the .Net 4.0 framework for .Net 3.5 called "Parallel Extensions". A huge advancement in the framework is a realization that, with the huge take up of manycore systems, many programmers are not taking advantage of the available resources.

For many programmers threading is either 1) a complex, unknown world, 2) a scary, once bitten thing, 3) a gift that comes naturally. Concurrency is going to be a main point of enhancement in the new framework. It is the way of structuring an algorithm to make use of otherwise wasted clock-cycles by using threads to accomplish tasks which might take a while with available downtime.

In many cases using threading creates a complex situation. Keeping track of resources and fixing cross threading code is a daunting task. Setting up each thread and setting up resources for the thread can cause some ugly code and draw the programmer to focus more on operating system architecture. The extensions gives developers several different tools to allow them to use many cores without having to deal with threads and focus on the algorithm.

So, now thread handling is up to the compiler, concurrency is a walk in the park, right? Not so fast, now that we can focus more on the algorithm, we are still left understanding what is going on under the covers. The developer needs to be able to tell which parts of the algorithm can be done in parallel. Not all situations can benefit from or use parallelized code. There are several design patterns to consider. Is the work to perform mutually exclusive, i.e. the work at one step is independent of all other work done at the same time? Then it is an excellent candidate for being parallelized. Parallelizing code allows developers to take independent, repeated sections of code and run them concurrently.

The Parallel Extensions gives us a couple new looping constructs: Parallel.For, Parallell.For≶>, Parallel.ForEach≶>.

Today I am going to introduce you to Parallel.For. This new loop, takes in an anonymous delegate or in my case a lambda expression and concurrently runs the expression based on the supplied parameters.

When using the Parallel Extensions you must add a reference to the System.Threading assembly (it should be pointing to the GAC'ed assembly from the Parallel Extensions directory). If you forget to add this, you will quickly find that Parallel. doesn't exist.

The example, the Quadratic Equation:
public static class Extensions
{
   /// <summary>
   /// Stopwatch Extension Method:
   /// Starts, executes function over iterations,
   /// Returns the time span.
   /// </summary>
   /// <param name="sw">this</param>
   /// <param name="func">Delegate or lambda to run</param>
   /// <param name="times">Times to run the function</param>
   /// <returns>TimeSpan object of elapsed time</returns>
   public static TimeSpan Time(this Stopwatch sw,
                               Action func,
                               int times)
   {
       sw.Reset();
       sw.Start();
       for (int idx = 0; idx < times; idx++) { func(); }
       sw.Stop();
       return sw.Elapsed;
   }
}
class Program
{
   static double[] Quad(double[] p)
   {
       double[] x = new double[2];
       double a = p[0], b = p[1], c = p[2];
       x[0] = ((b * -1) + Math.Sqrt(b * b + 4 * a * c))/(2 * a);
       x[1] = ((b * -1) - Math.Sqrt(b * b + 4 * a * c))/(2 * a);

       return x;
   }

   static void Main(string[] args)
   {
       Stopwatch stopwatch = new Stopwatch();
      
       int max = 1000000;
       double[][] numbersP = new double[max][];
       double[][] numbersS = new double[max][];
       double[][] numbers = new double[max][];
      
       Random rand = new System.Random(1042);
       Parallel.For(0, max, x =>
       {
           double[] item = new double[3];
           item[0] = rand.Next(500);
           item[1] = rand.Next(500);
           item[2] = rand.Next(500);
           numbers[x] = item;
       });

       // Synchronous
       TimeSpan ts = stopwatch.Time(() =>
       {
           for (int x = 0; x < max; x++)
           {
               numbersS[x] = Quad(numbers[x]);
           }
       }, 1);
       Console.WriteLine(
           String.Format(
             "Synchronous RunTime: {0:00}:{1:00}:{2:00}.{3:00}",
             ts.Hours, ts.Minutes,
             ts.Seconds, ts.Milliseconds / 10));
   
       // Parallel For
       ts = stopwatch.Time(() =>
       {
           Parallel.For(0, max, x =>
           {
               numbersP[x] = Quad(numbers[x]);
           });
       }, 1);
       Console.WriteLine(
           String.Format(
             "Parallel.For RunTime: {0:00}:{1:00}:{2:00}.{3:00}",
             ts.Hours, ts.Minutes,
             ts.Seconds, ts.Milliseconds / 10));

       Console.Read();
   }
}
So what is the verdict?

Test Results

TypeTest 1
Debug
Test 2
Release
Test 3
Release
Synchronous00:00:01.1100:00:00.9700:00:00.96
Parallel00:00:00.7800:00:00.7500:00:00.74

This is quite a simplistic example involving solving many the quadratic equations for random numbers. The results show that there is a significant improvement in performance without much effort. Further more, we could use this idea to improve a whole host of scenarios like image rendering, validation, and making web requests.

Happy coding!

No comments: