Author Archive

Robotics Studio – Concurrency and Coordination Runtime (CCR) – Part 1 of 3

July 26th, 2010

Disclaimer: Since this is a relatively new concept to digest, I'm going to go slow and might repeat myself!

Well, if you were a computer science student like me, I'm sure you graduated with your own dream of building the world's greatest(!) robot. I tried my hand at building a few basic robots, slept with fuzzy logic books as pillows, travelled from library to library for more books on robotics, buttered up Dad for extra pocket money to buy robotics toolkits, quarrelled with a friend at midnight when everything broke down, and miraculously rebuilt everything at about 4 AM, just before that day's project demo :) Well, this was about a decade back… The robotics world was a little primitive then, but things have evolved now!

One such evolution is Microsoft Robotics Studio. More info at http://www.microsoft.com/robotics

Robotics Studio helps you create and control industrial robots, or pretty much any kind of robot. Surprisingly, it looks like I still have the same quest that I had 10 years back, and I am in the process of building a new robot using it. I will write a little more on this shortly.

But before that, in this multi-part article, I would like to write about getting the basics right and about some new thoughts I had as I started learning this framework.

The objective of this article is to put new ideas and concepts in your mind, and to make you think about a fresh approach to building High Performance Computing and distributed software solutions using technology from robotics.

Wait a minute… Robotics && (HPC + Distributed Computing)? Do they really go together? Don't they sound like opposite poles? Well, the truth is that the robotics world has evolved silently in parallel with the commercial software industry, and it has produced numerous technical inventions that are not yet being used in mainstream commercial software.

The objective of this article is focused on something interesting, which is already creating ripples around the industry through its 2 new technological inventions - Concurrency and Coordination Runtime (CCR) and Decentralized Software Services (DSS).

Well, to begin with, CCR was initially built for robots, because robots are expected to respond to *millions* of concurrent inputs such as touch, sound, color detection, visibility, motion sensors et al., and CCR was built to cater to these needs. But once the framework was built, its inventors realized its real potential: CCR also addresses the needs of mainstream software, especially the high-performance computing demanded by modern web software, distributed systems and service-oriented applications, which must manage asynchronous operations, deal with concurrency, exploit parallel hardware and deal with partial failure. CCR is complemented by Decentralized Software Services (DSS), a lightweight .NET-based runtime environment that sits on top of the Concurrency and Coordination Runtime.

Decentralized Software Services (DSS) provides a lightweight, state-oriented service model that combines the notion of representational state transfer with a system-level approach for building high-performance, scalable applications.

Well, for those of you who do not like definitions, here is the quick synopsis:

CCR = A programming model for multi-threading + inter-task synchronization, i.e. think about someone magically removing all that complex thread-management code from your programs, yet they still run perfectly multi-threaded! Think about multi-threading without mutexes, semaphores, deadlocks… does this ring a bell somewhere? Read on…

DSS = A framework that allows you to run your services anywhere on the network.

Alright, if by now you are asking HOW?, the answer is that we will be doing threading without programming directly against the Windows threading model: a brand new threading infrastructure that enables multiple tasks to execute concurrently on a single computer.

And… how about executing these multiple tasks on completely separate computers and managing them centrally?

In short, with CCR + DSS we can aim for near-linear scaling in high-volume data and high-transaction software in industries like medicine, financial trading, scientific modeling and messaging systems, or even increased scaling and concurrency on enterprise search servers. This is possible because CCR manages asynchronous operations, exploits parallel hardware, deals with concurrency and handles partial failure.

Figure 1: CCR Architecture

Now… let's get a bit technical…

In Figure 1 I've tried to depict the architecture of CCR as I understand it. The Dispatcher, shown at the end, is the really critical piece that provides the scalability and maximizes concurrency in the application; the rest is the object model that supports the Dispatcher.

To begin with, a Port is a FIFO queue of messages, and a message can be of any valid CLR type; a PortSet groups several Ports, one per message type. A receiver is attached to a port and is guarded by an "Arbiter". The Arbiter is more like a gatekeeper, which filters the incoming messages and lets the appropriate ones execute.

The dispatcher queues are the real load-balancers and schedulers of tasks; they also manage the data, execution and resources of these tasks.

The boss of everything here is the Dispatcher. The CLR thread pool is a single, process-wide execution resource. CCR, by contrast, allows *multiple*, completely isolated pools of OS threads and completely *abstracts* the complications of thread management away from the programmer, i.e. you don't have to worry about deadlocks, mutexes et al. anymore.

Also, CCR allows a single dispatcher to be shared between hundreds of software components from heterogeneous networks, which lets it *load balance* thousands of tasks using only a handful of threads.
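
To make the flow above a little more concrete, here is a toy model written only with standard .NET types. It is not the CCR API: `ToyPort`, `ToyDispatcher` and `ToyDemo` are hypothetical names made up for illustration, the filter delegate only loosely plays the gatekeeper role described for the Arbiter, and the sketch assumes .NET 4 for `BlockingCollection`.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;

// Toy stand-in for a dispatcher: a fixed pool of threads draining one task queue.
public sealed class ToyDispatcher : IDisposable
{
    private readonly BlockingCollection<Action> _queue = new BlockingCollection<Action>();
    private readonly List<Thread> _threads = new List<Thread>();

    public ToyDispatcher(int threadCount)
    {
        for (int i = 0; i < threadCount; i++)
        {
            var worker = new Thread(() =>
            {
                // Blocks until a task arrives; ends once the queue is completed and empty.
                foreach (Action task in _queue.GetConsumingEnumerable()) task();
            });
            worker.IsBackground = true;
            worker.Start();
            _threads.Add(worker);
        }
    }

    public void Enqueue(Action task) { _queue.Add(task); }

    // Stops accepting work, lets the workers drain the queue, then waits for them.
    public void Dispose()
    {
        _queue.CompleteAdding();
        foreach (Thread worker in _threads) worker.Join();
        _queue.Dispose();
    }
}

// Toy stand-in for a port with a guarded receiver: messages that pass the
// filter are turned into tasks and handed to the dispatcher.
public sealed class ToyPort<T>
{
    private readonly ToyDispatcher _dispatcher;
    private readonly Func<T, bool> _filter;
    private readonly Action<T> _handler;

    public ToyPort(ToyDispatcher dispatcher, Func<T, bool> filter, Action<T> handler)
    {
        _dispatcher = dispatcher;
        _filter = filter;
        _handler = handler;
    }

    public void Post(T message)
    {
        if (_filter(message)) _dispatcher.Enqueue(() => _handler(message));
    }
}

public static class ToyDemo
{
    // Posts 1..count and sums only the even messages on two worker threads.
    public static int SumEvenMessages(int count)
    {
        int sum = 0;
        using (var dispatcher = new ToyDispatcher(2))
        {
            var port = new ToyPort<int>(dispatcher, m => m % 2 == 0,
                                        m => Interlocked.Add(ref sum, m));
            for (int i = 1; i <= count; i++) port.Post(i);
        } // Dispose drains the queue and joins the workers
        return sum;
    }
}
```

The calling code only posts messages; it never creates a thread or takes a lock itself, which is the part of the programming model the paragraphs above are describing.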

Let’s explore DSS in-depth and a real time code example in Part 2 and Part 3

Logu Krishnan Robotics

Parallel Programming in C# 4.0 using Visual Studio 2010

June 7th, 2009

 

I downloaded Visual Studio 2010 Beta 1 yesterday, and as I was glancing through it, it struck me that this version is packed with features, unlike its two predecessors, VS2005 and VS2008.
The Framework has been enhanced further and the Visual Studio IDE itself has been overhauled a bit. Well, I'm not going to give you a list of ALL the features; that has been blogged around the world already. Better Google it or Bing it with "VS2010+Features".

However, a few notable features that caught my eye are "Parallel Programming", "F# - Functional Programming", "Velocity - Distributed Caching", "Azure Tools" and, most important of all, the evolving Team System.

But I first wanted to get my hands dirty with Parallel Computing, because if you are a computer science student, well, you would be more excited about this than others.
Remember the big pillow-sized books that we used to read to make this work? Well, things have changed and the world has shrunk already. Though I cannot explain all the nitty-gritty of parallel programming, I will try to explain it in LAYMAN'S terms.

Well, during the Stone Age [!], most of the computers in the world had only ONE processor, except those big beastly servers that were always locked up in high-security rooms (well, usually *nix or Solaris servers); these servers used to run most corporations. They had multiple processors, and it took huge effort to write software for them and to manage them.

Welcome to the modern world: nearly every desktop and laptop sold these days has at least two processor cores.

Now, that poses a BIG question: hardware has evolved, but has our software evolved to execute on multiple processors? The answer is NO, at least not in the mainstream programming world. Let's say, for example, what would happen:

  1. If we execute a simple FOR Loop
  2. That would call a service (that takes a longer time)
  3. … and execute sequentially for N Times

On a single processor this is acceptable, and we might use threads to increase the efficiency.

Is this still acceptable on multiple processors? The answer is no. Fine, but how do we get efficiency without the hurdles of running and managing too many threads? Shouldn't there be an easier way out?

Alrighty, without much ado, let me show you how easy(!) this is, along with a little insight into what happens behind the scenes. Let's churn out some quick code based on the same questions. Let us say we have a really long process (well, it could be counting the stars in the universe :), huh) and let us say you want to do this N times.

In our quest to count all the stars in the universe, let's first create a data structure for the star and add it to the universe, and let us use the good ol' mother of all loops, the "FOR" loop, and see how inefficient this loop has become these days!!
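
A minimal sketch of what such a sequential loop could look like. The `Star` type, the `CreateUniverse` and `CountStar` helpers and the 10 ms sleep that stands in for the slow service call are all hypothetical, chosen only to make the idea runnable.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public static class SequentialCount
{
    // Hypothetical data structure for a star.
    public sealed class Star
    {
        public int Id;
        public bool Counted;
    }

    // Adds n stars to a fresh "universe".
    public static List<Star> CreateUniverse(int n)
    {
        var universe = new List<Star>();
        for (int i = 0; i < n; i++) universe.Add(new Star { Id = i });
        return universe;
    }

    // Stand-in for the long-running call: here it only sleeps for 10 ms.
    static void CountStar(Star star)
    {
        Thread.Sleep(10);
        star.Counted = true;
    }

    // Plain FOR loop: one star after the other, on the calling thread.
    public static int CountAll(List<Star> universe)
    {
        int counted = 0;
        for (int i = 0; i < universe.Count; i++)
        {
            CountStar(universe[i]);
            counted++;
        }
        return counted;
    }
}
```

With N stars the loop takes roughly N times the duration of one call, however many cores the machine has.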

 

"The sequential execution took almost 30 seconds on my dual-core computer."

And here is the Parallel Computing version of the same method. Yes, the for loop has been replaced with Parallel.For, a new entry in the System.Threading.Tasks namespace.
How much simpler can this get?
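
As a sketch of the same idea with `Parallel.For`, under the same assumptions as before: `Star`, `CreateUniverse` and `CountStar` are hypothetical names, and the sleep only imitates a slow call.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelCount
{
    // Hypothetical data structure for a star.
    public sealed class Star
    {
        public int Id;
        public bool Counted;
    }

    public static List<Star> CreateUniverse(int n)
    {
        var universe = new List<Star>();
        for (int i = 0; i < n; i++) universe.Add(new Star { Id = i });
        return universe;
    }

    // Stand-in for the long-running call: here it only sleeps for 10 ms.
    static void CountStar(Star star)
    {
        Thread.Sleep(10);
        star.Counted = true;
    }

    // Same work as a FOR loop, but iterations may run on several threads at once.
    public static int CountAll(List<Star> universe)
    {
        int counted = 0;
        Parallel.For(0, universe.Count, i =>
        {
            CountStar(universe[i]);              // each iteration touches only its own star
            Interlocked.Increment(ref counted);  // the shared counter needs an atomic update
        });
        return counted;                          // Parallel.For returns after all iterations finish
    }
}
```

The loop body is unchanged apart from the counter: because several iterations can run at the same moment, a plain `counted++` would no longer be safe.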

VOILA! The parallel execution took just 3 seconds on my dual-core computer.

Well, that's a significant performance improvement without hardware scale-out or scale-up; all we are doing is using the existing hardware resources efficiently. So much for a FOR loop :), huh. 30 seconds of execution have become 3 seconds instantly. (A tenfold gain on two cores suggests that the work was mostly waiting, as a slow service call would be, rather than pure computation.) Look closely at the screenshot: the stars are not counted sequentially; instead, the work is handed out to the available CPUs in parallel.

Because the loop is run in parallel, each iteration is scheduled and run individually on whatever core is available. This means that the list is not necessarily processed in order, which can drastically impact your code. You should design your code so that each iteration of the loop is completely independent from the others. Any single iteration should not rely on another in order to complete correctly.
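
When the iterations have to contribute to one shared result, one way to keep them independent is to give every worker its own subtotal and merge the subtotals at the end. The sketch below uses the thread-local overload of `Parallel.For`; the `IndependentIterations` class and its `Sum` method are hypothetical names for illustration.

```csharp
using System.Threading;
using System.Threading.Tasks;

public static class IndependentIterations
{
    public static long Sum(int[] values)
    {
        long total = 0;
        Parallel.For<long>(0, values.Length,
            () => 0L,                                         // each worker starts its own subtotal
            (i, loopState, subtotal) => subtotal + values[i], // no shared state touched here
            subtotal => Interlocked.Add(ref total, subtotal)  // merge once per worker
        );
        return total;
    }
}
```

No iteration reads anything that another iteration writes, so the order in which they run does not change the result.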

Let us catch up on more insights soon, in the next part of this series…

Logu Krishnan C#

Developing Managed Event Sinks/Hooks for Exchange Server Store using C#

March 28th, 2009

One of my previous projects required me to create a managed event sink for the Microsoft Exchange Server store. As it was my first attempt at the topic, it took a while to grasp and crack event sinks. Surprisingly, googling did not help either, so when I finally cracked it, I thought I should share it with the world for the common good :-)

Hence, this article…

So, what does an Event Sink really mean?

An event sink is a piece of code that gets triggered on predetermined events. A more classic piece of jargon for the same idea is "hooks", i.e. we hook into an event, and when the event occurs our custom code executes first, and control is later passed back to the original event if required. Similarly, we can hook into anyone's mailbox on the Exchange server and execute our hook even before the Exchange server events are fired. This lets us build a whole series of LOB applications.

Exchange Store Events

Some of the events that can be hooked on Exchange Server are:

1. Synchronous Events - Events that get triggered before an item [mail, appointments, documents, tasks etc.] is committed to the Exchange store. These events pause the Exchange store thread until the event sink finishes executing. No other process can access the item while the event sink is executing, as the event sink has exclusive control over the item. The following events are classified as synchronous events.

a. OnSyncSave - Fires when the item is saved to Exchange, but before the changes are committed.
b. OnSyncDelete - Fires when the item is deleted from Exchange, but before the delete operation is committed.

2. Asynchronous Events - Events that get fired after an item is committed to the Exchange store. These events do not block the Exchange store thread. The following are the asynchronous events.

a. OnSave - Fires after the item is saved to Exchange and the changes are committed.
b. OnDelete - Fires after the item is deleted from Exchange and the changes are committed.

3. System Events - Events that get fired by system-wide actions on the Exchange server. The following are the system events.

a. OnMDBStartUp - Fires when the Exchange database is started.
b. OnMDBShutdown - Fires when the Exchange database is shut down.
c. OnTimer - Executes a piece of code at predefined intervals. This is a very useful event, which runs irrespective of specific item events.

Synchronous and asynchronous events are tied to a specific item or folder in the Exchange store.

All these events are exposed as interfaces in the Exchange CDOEX library [cdoex.dll]. Fig 1.1 shows the object browser window of the CDOEX library.

So What? What Can I Build?

Some of the applications that can be developed using Event Sink are,

  1. Notification Subsystems
  2. Global Timer applications
  3. Workflow based applications
  4. Automatic Categorization subsystems
  5. Store maintenance for administrators

Let’s Code Now…

Fire up Visual Studio .NET, choose a new C# Class Library project and name it, hmm… let's call it "MyEventSink".

In Solution Explorer, right-click the project name and choose Properties. On the Project Properties page, choose Configuration Properties, then Build, and set Register for COM Interop to True.

Now, Copy the below files to the MyEventSink bin directory

  • exoledb.dll from exchange server bin directory (\program files\exchsrvr\bin)
  • cdoex.dll - \program files\common files\Microsoft Shared\CDO
  • msado15.dll - \Program Files\Common Files\System\ADO

Open the VS.NET Command Prompt, navigate to the MyEventSink bin folder, and create strong name keys for the above libraries. Key in the following commands:

> sn -k exoledb.key
> sn -k cdoex.key
> sn -k msado.key

We need interop assemblies for the above libraries, and we shall use the tlbimp tool to create them. Key in the following commands to create the 3 interop assemblies.

tlbimp exoledb.dll /keyfile:exoledb.key /out:interop.exoledb.dll /namespace:CDO
tlbimp cdoex.dll /keyfile:cdoex.key /out:interop.cdoex.dll /namespace:CDO
tlbimp msado15.dll /keyfile:msado.key /out:interop.adodb.dll /namespace:ADODB

Copy these interop DLL files to the debug folder. Switch back to VS.NET and add references to the interop DLL files created above. Then modify the following attributes in AssemblyInfo.cs.

Under the General Information section, modify

[assembly: AssemblyTitle("MyEventSink")]

[assembly: AssemblyDescription("My Event Sink - Logu")]

In the version information section, create a new GUID and add

[assembly: Guid("44E6847A-0012-42af-A317-1E1A9F0C853D")]

[Tip: You can create a new GUID by clicking Tools->Create GUID]

In the signing information section, modify

[assembly: AssemblyDelaySign(false)]

[assembly: AssemblyKeyFile("MyEventSink.key")]

[assembly: AssemblyKeyName("MyEventSink")]

Now, choose Project Properties and set the "Wrapper assembly key file" to MyEventSink.key and the "Wrapper assembly Key Name" to "My Event Sink".

Start the VS.NET Command Prompt, change directory to your project directory, and create a key by keying in the following:

> sn -k MyEventSink.key

Switch back to the VS.NET IDE, rename class1.cs to something like "ExchEventSink.cs", and double-click the .cs file to open it.

Add,

Modify the class definition so that it resembles the code below,

If you look at the above code, you will notice that we are implementing the IExStoreAsyncEvents interface, which declares the asynchronous event methods, namely OnSave and OnDelete. We shall implement these now; add the following to your code [check the attached zip file for more information]
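
As a rough sketch only, a sink class of this shape could look like the following. To keep the snippet compilable without Exchange, the two interfaces are local stand-ins: their member signatures are an assumption about what the interop assembly exposes, so check the real ones in the Object Browser before copying anything. The `LogPath` field and the `Log` helper are hypothetical as well.

```csharp
using System;
using System.IO;

// Stand-ins so the sketch compiles by itself. In the real project these types
// come from the interop assembly, and their signatures may differ.
public interface IExStoreEventInfo { }

public interface IExStoreAsyncEvents
{
    void OnSave(IExStoreEventInfo pEventInfo, string bstrURLItem, int lFlags);
    void OnDelete(IExStoreEventInfo pEventInfo, string bstrURLItem, int lFlags);
}

public class ExchEventSink : IExStoreAsyncEvents
{
    // Hypothetical log location, used only for this illustration.
    public static string LogPath = Path.Combine(Path.GetTempPath(), "MyEventSink.log");

    // Called after an item has been saved and committed.
    public void OnSave(IExStoreEventInfo pEventInfo, string bstrURLItem, int lFlags)
    {
        Log("OnSave", bstrURLItem, lFlags);
    }

    // Called after an item has been deleted and the delete is committed.
    public void OnDelete(IExStoreEventInfo pEventInfo, string bstrURLItem, int lFlags)
    {
        Log("OnDelete", bstrURLItem, lFlags);
    }

    // Appends one line per event: timestamp, event name, item URL and flags.
    private static void Log(string eventName, string itemUrl, int flags)
    {
        string line = string.Format("{0:u} {1} {2} flags={3}{4}",
            DateTime.UtcNow, eventName, itemUrl, flags, Environment.NewLine);
        File.AppendAllText(LogPath, line);
    }
}
```

Since asynchronous events do not block the store thread, a slow handler here would delay only the sink itself, which makes this the gentler place to start experimenting.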

In the above code, we process an Exchange item in the OnSave method and write to a LOG file. This is a simple code example; modify it to suit your requirements.

Compile the class, and you have your event sink component ready. Now, open Component Services; under COM+ Applications, create a new empty application and name it "MyEventSink". Then expand Components under MyEventSink and click "Import components that are already registered".

Then choose "MyEventSink.ExchEventSink" from the populated list.

Now the event sink component is registered on the server.

We are done with the development part. Now you can bind the component to any folder of the Exchange store. There are multiple ways to do this; I prefer the following:

RegEvent.vbs - I've attached the VBS file in the download zip. This script creates the event registration for the specified folder. The following command binds the event sink to my inbox folder,

Exchange Explorer - this is a tool you get with the Exchange SDK.

Alternatively, you can build your own event registration [that's a separate article by itself :-) ]

At last, we are done… We have created our own managed Exchange store event sink. You can implement the synchronous events and the system events in the same way as we implemented the asynchronous events.

Logu Krishnan C#, Exchange Server, Microsoft

WMI and .NET Performance Hiccups : Win32 – the Savior

March 26th, 2009

… or read the title as: how could you speed up your software by 90%? There is always a way to tune performance… This blog post is about one such instance, where I dumped WMI (Windows Management Instrumentation) and turned back to Win32 just for the performance gains.

Some time back I was involved in a project with lots of hardware interfaces: interacting with huge SCSI devices, parallel ports, digital imaging et al. Though I was a fan of Win32, I was thrilled with how WMI works wonders in reducing development time, primarily because of its mature APIs. I kept thinking about how difficult it would be for beginner-to-intermediate programmers to do system-level programming in C or C++ to interact with hardware, network devices, communication devices et al., and how error-prone such code is… WMI is definitely the better choice in this case.

And many times WMI did save us development time. But there were moments during product testing when we ran into serious performance problems. When we probed for the reasons, the causes were really surprising. Sometimes it was the development team's oversight; sometimes poor WMI was the culprit.

Here are 2 instances where the fix gave my product a considerable performance boost.

Handling System.Drawing.Image Performance using C#

The product I was working on had to load and edit JPEG images of digital-camera quality, which means image sizes of 3-5 MB or more. My customer had always complained about image loading speed. Everything was fine when loading a 1-2 MB image, but things changed drastically whenever an image of 3 MB or more was loaded: it took around 1-2 seconds. 2 seconds is fine for normal applications, but not for people who work with thousands of images per day; the Windows Form would almost hang before the image showed up. After a considerable amount of research I found that the real culprit causing the bottleneck was the line "System.Drawing.Image.FromFile()".

I happened to hit a KB article which confirmed this issue and offered a hotfix[!!], which updates
1. System.Windows.Forms.dll
2. System.Design.dll
3. System.Drawing.dll

In fact, there was an interesting new overload: System.Drawing.Image.FromStream(Stream stream, bool useICM, bool validateImageData). This validateImageData flag was the real reason the loading was slow: the content of the image file was validated before loading. So as the size of the image increased, the loading time increased sharply.

So I had to look out for an alternative. The obvious choice was Win32, and here is the method equivalent to Image.FromFile():

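// Requires: using System; using System.Drawing; using System.IO;
//           using System.Reflection; using System.Runtime.InteropServices;

// Assumed declaration of the GDI+ flat API entry point; a non-zero status means failure.
[DllImport("gdiplus.dll", CharSet = CharSet.Unicode)]
private static extern int GdipLoadImageFromFile(string filename, out IntPtr image);

// Assumed declaration: Bitmap carries the non-public static FromGDIplus factory used below.
private static readonly Type imageType = typeof(Bitmap);

public static Image Win32ImageFromFile(string filename)
{
    filename = Path.GetFullPath(filename);
    IntPtr loadingImage = IntPtr.Zero;

    if (GdipLoadImageFromFile(filename, out loadingImage) != 0)
    {
        throw new Exception("Oops! GDI Exception.");
    }

    // Wrap the native GDI+ handle in a managed Bitmap via reflection.
    return (Bitmap)imageType.InvokeMember("FromGDIplus",
        BindingFlags.NonPublic | BindingFlags.Static | BindingFlags.InvokeMethod,
        null, null, new object[] { loadingImage });
}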
 

And now, when I used this new method… voila! The images started loading at least 90% faster and took less than 10 milliseconds to load! Wow! That was really great and amazing.

Handling Performance issues in Win32_LogicalDisk using C#

Here is another instance where I dumped WMI and used Win32 instead.
Here is the simple WMI code that lists the removable drives of the computer.

#region WMI Code to retrieve Drives
// Requires: using System.Collections.Specialized; using System.Management;

ManagementClass driveClass = new ManagementClass("Win32_LogicalDisk");
ManagementObjectCollection drives = driveClass.GetInstances();
StringCollection driveCollection = new StringCollection();

foreach (ManagementObject drv in drives)
{
    // Check whether the drive is a removable storage device
    if ((drv["Description"].ToString() == "Removable Disk") && (drv["DriveType"].ToString() == "2"))
    {
        driveCollection.Add(drv["Caption"].ToString());
    }
}
#endregion

This code would take a minimum of 4-5 seconds to enumerate my disk drives. Another problem is that the floppy drive is physically checked every time [but… why!!], which slows the execution down further. None of our clients would accept this for a feature that is used frequently. There was no way to solve this issue except to look for a Win32 method, and here is the alternative, using Win32:

#region WIN32 Code to retrieve Drives
// Requires: using System.Collections.Specialized;

[System.Runtime.InteropServices.DllImport("kernel32.dll", SetLastError = true)]
static extern uint GetDriveType(string lpRootPathName);

static StringCollection GetRemovableDrives()
{
    StringCollection driveCollection = new StringCollection();

    /* Retrieves all the mounted drives on the computer. */
    string[] drives = System.Environment.GetLogicalDrives();

    foreach (string drive in drives)
    {
        /* Call Win32 GetDriveType to determine the drive type, based on the drive letter */
        uint driveType = GetDriveType(drive);

        // 2 = DRIVE_REMOVABLE, 5 = DRIVE_CDROM
        if (driveType == 2 || driveType == 5)
        {
            driveCollection.Add(drive);
        }
    }
    return driveCollection;
}
#endregion

This code executed in less than 100 milliseconds!!! That was an incredible performance boost.
Do you think using Win32 as an alternative is insane? Have you faced such real-world problems? Would you still use WMI? Talk back!

Logu Krishnan C#, Performance

Back to Basics: Performance Killer Code – Unaligned Memory in 32-Bit for C# Struct

March 5th, 2009

Recently I was analyzing the performance of a .NET application that had lots of structs defined in it, and I happened to hit a strange reality: the unaligned memory problem! I was running a profiler and found that the memory allocated for a few structs was larger than it should have been (based on my own math). When I probed further, there was an interesting discovery. Read on…

Let's get back to basics…
Alright, here is a little head-spinner… What is the difference between the following structures?


struct BadStructure
{
    char c1;
    int i;
    char c2;
}

struct GoodStructure
{
    int i;
    char c1;
    char c2;
}

Nothing much, except the jumbled type declarations… Huh?

Fine, Now let’s look at the size of these structures,

The size of BadStructure in:
.NET Framework 3.5 : Managed sizeof = 12 bytes, Marshal.SizeOf = 12 bytes
The size of GoodStructure in:
.NET Framework 3.5 : Managed sizeof = 8 bytes, Marshal.SizeOf = 8 bytes
[Note: Size of int=4, char=2]
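
A small sketch for checking these numbers on your own machine. The `StructSizes` class name is hypothetical, the fields are made public only to avoid compiler warnings, and `CharSet.Unicode` is my addition so that `Marshal.SizeOf` treats a char as 2 bytes, the same as the managed layout.

```csharp
using System;
using System.Runtime.InteropServices;

public static class StructSizes
{
    [StructLayout(LayoutKind.Sequential, CharSet = CharSet.Unicode)]
    public struct BadStructure
    {
        public char c1;   // offset 0, then 2 pad bytes
        public int i;     // offset 4
        public char c2;   // offset 8, then 2 pad bytes at the end
    }

    [StructLayout(LayoutKind.Sequential, CharSet = CharSet.Unicode)]
    public struct GoodStructure
    {
        public int i;     // offset 0
        public char c1;   // offset 4
        public char c2;   // offset 6, no padding needed
    }

    public static int BadSize()  { return Marshal.SizeOf(typeof(BadStructure)); }
    public static int GoodSize() { return Marshal.SizeOf(typeof(GoodStructure)); }

    // Offset of the int inside BadStructure, showing where the padding went.
    public static int BadIntOffset()
    {
        return Marshal.OffsetOf(typeof(BadStructure), "i").ToInt32();
    }
}
```

Printing `StructSizes.BadSize()` and `StructSizes.GoodSize()` should give 12 and 8, matching the figures above.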
The reason behind these differences is "BYTE ALIGNMENT". As with the default packing in unmanaged C++, integers are laid out on four-byte boundaries, so while the first character uses two bytes (a char in managed code is a Unicode character, thus occupying two bytes), the integer moves up to the next 4-byte boundary, and the second character uses the subsequent 2 bytes, followed by 2 bytes of trailing padding so that the total size is a multiple of four. The resulting structure is 12 bytes when measured with Marshal.SizeOf.

32-bit microprocessors typically organize memory as shown below.

            Byte0   Byte1   Byte2   Byte3
0x1000
0x1004      A0      A1      A2      A3
0x1008
0x100C              B0      B1      B2
0x1010      B3

Many processor architectures cannot read multi-byte data from misaligned addresses at all, and those that can are inefficient at reading data that starts at an address not divisible by four.

Memory is accessed by performing 32-bit bus cycles. 32-bit bus cycles can, however, only be performed at addresses that are divisible by 4. So, for efficiency, compilers add so-called pad bytes. The reasons for not permitting misaligned long word reads and writes are not difficult to see. For example, an aligned long word A would be written as A0, A1, A2 and A3.

Thus the microprocessor can read the complete long word in a single bus cycle. If the same microprocessor now attempts to access a long word at address 0x100D, it will have to read bytes B0, B1, B2 and B3. Notice that this read cannot be performed in a single 32-bit bus cycle. The microprocessor will have to issue two different reads, at addresses 0x100C and 0x1010, to read the complete long word. Thus it takes twice the time to read a misaligned long word.

The following byte padding rules will generally work with most 32-bit processors.

a. Single-byte numbers can be aligned at any address
b. Two-byte numbers should be aligned to a two-byte boundary
c. Four-byte numbers should be aligned to a four-byte boundary

This is the cause of the difference.
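
The three rules can be turned into a few lines of arithmetic. This is a simplified model, assuming fields of 1, 2 or 4 bytes only and a total size rounded up to the largest alignment used; `LayoutMath` and its methods are hypothetical names, not anything the runtime exposes.

```csharp
using System;

public static class LayoutMath
{
    // Rounds offset up to the next multiple of alignment.
    static int AlignUp(int offset, int alignment)
    {
        return (offset + alignment - 1) / alignment * alignment;
    }

    // Computes the padded size of a struct from its field sizes, in declaration order.
    public static int SizeOf(params int[] fieldSizes)
    {
        int offset = 0;
        int maxAlign = 1;
        foreach (int size in fieldSizes)
        {
            int align = Math.Min(size, 4);           // rules a to c: align to the field's own size
            offset = AlignUp(offset, align) + size;  // insert pad bytes, then place the field
            maxAlign = Math.Max(maxAlign, align);
        }
        return AlignUp(offset, maxAlign);            // trailing pad bytes
    }
}
```

For the two structures above, `LayoutMath.SizeOf(2, 4, 2)` works out to 12 (char, int, char) and `LayoutMath.SizeOf(4, 2, 2)` to 8 (int, char, char).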

Fine…. How do we fix this ?

The .NET compilers apply a StructLayoutAttribute to structures, specifying a Sequential layout. This means that the fields are laid out in the type according to their order in the source file.

Here is the IL for BadStructure.

.class nested private sequential ansi sealed beforefieldinit BadStructure extends [mscorlib]System.ValueType
{
.field private char c1
.field private int32 i
.field private char c2
}

In the .NET Framework 3.5, the JIT does enforce a Sequential layout (if specified) for the managed layout of value types. We can use the StructLayoutAttribute class from the System.Runtime.InteropServices namespace to control the physical layout of the data fields. So one fix is to specify [StructLayout(LayoutKind.Sequential, Pack = 1)] for the struct; bear in mind that packing removes the pad bytes, so the int is no longer aligned on a four-byte boundary, whereas simply reordering the fields as in GoodStructure saves the same space and keeps every field aligned.

Watch out for structures when you create them next time, and think about playing around with 'm' structures of 'n' size…. m x n = !!! You can definitely save a few kilobytes, or, if you are using structs heavily for data transformation, you might even save a few megabytes. Alright, time to refactor your code now :)
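
A minimal sketch of the Pack = 1 variant, so you can see what it does to the size and to the position of the int. The `PackedLayout` class and the `PackedStructure` name are hypothetical, the fields are public only to avoid compiler warnings, and `CharSet.Unicode` is my addition so that a char is measured as 2 bytes.

```csharp
using System;
using System.Runtime.InteropServices;

public static class PackedLayout
{
    // Same field order as BadStructure, but with padding switched off.
    [StructLayout(LayoutKind.Sequential, Pack = 1, CharSet = CharSet.Unicode)]
    public struct PackedStructure
    {
        public char c1;   // offset 0
        public int i;     // offset 2: no longer on a four-byte boundary
        public char c2;   // offset 6
    }

    public static int Size()
    {
        return Marshal.SizeOf(typeof(PackedStructure));
    }

    public static int IntOffset()
    {
        return Marshal.OffsetOf(typeof(PackedStructure), "i").ToInt32();
    }
}
```

The size comes down to 8 bytes, the same as the reordered struct, but the int now starts at offset 2, which is exactly the kind of misaligned access described earlier.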

Happy Coding!

Logu Krishnan C#, Performance

ADITI Blogs

January 12th, 2008

ADITI Blogs is a new initiative from ADITI Technologies, the software product company powering world-class products. Now the technical fellows behind the scenes are planning to share their thoughts with the rest of the world. Well, we are not here to bother the world with me-too blogging or to throw ironic comments at everyone else; rather, we plan to talk about the nitty-gritty technical things that inspire us and could possibly inspire the rest of the world, for the common good.

The objective is simple: we are trying to add more common sense to technology and build mature software. Along the way we would like to share what we have learnt and hear back from the rest of the world.

“…Thus… ADITI Blogs is born…”

Logu Krishnan Life at Aditi