Archive for March, 2009

Developing Managed Event Sinks/Hooks for Exchange Server Store using C#

March 28th, 2009

One of my previous projects required me to create a managed event sink for the Microsoft Exchange Server store. As it was my first attempt at the topic, it took a while to grasp event sinks and get one working, and surprisingly, googling did not help either. When I finally cracked it, I thought I should share it with the world for the common good :-)

Hence, this Article…

So, what does an Event Sink Really mean?

An event sink is a piece of code that is triggered by predetermined events. A more classic term for the same idea is “hook”: we hook into an event, and when the event occurs our custom code executes first; control is then passed back to the original event if required. Similarly, we can hook into the mailbox of anyone on the Exchange server and run our hook even before the Exchange server's own events are fired. This allows us to build a whole range of LOB (line-of-business) applications.

Exchange Store Events

Some of the events that can be hooked on an Exchange server are:

1. Synchronous Events – Events that are triggered before an item [mail, appointments, documents, tasks etc.] is committed to the Exchange store. These events pause the Exchange store thread until the event sink finishes executing. No other process can access the item while the event sink is executing, because the event sink has exclusive control over the item. The following events are classified as synchronous events.

a. OnSyncSave – Fires when an item is saved to Exchange, but before the changes are committed.
b. OnSyncDelete – Fires when an item is deleted from Exchange, but before the delete operation is committed.

2. Asynchronous Events – Events that are fired after an item is committed to the Exchange store. These events do not block the Exchange store thread. The following are the asynchronous events.

a. OnSave – Fires after an item is saved to Exchange and the changes are committed.
b. OnDelete – Fires after an item is deleted from Exchange and the changes are committed.

3. System Events – Events that are fired by system-wide actions on the Exchange server. The following are the system events.

a. OnMDBStartUp – Fires when the Exchange database is started.
b. OnMDBShutdown – Fires when the Exchange database is shut down.
c. OnTimer – Executes a piece of code at predefined intervals. This is a very useful event, because it runs on a schedule regardless of any item-level activity.

Synchronous and asynchronous events are tied to a specific item or folder in the Exchange store.

All these events are exposed in the Exchange CDOEX library [cdoex.dll] as interfaces. Fig 1.1 shows the object browser window of the CDOEX library.

So What? What Can I Build?

Some of the applications that can be developed using event sinks are:

  1. Notification Subsystems
  2. Global Timer applications
  3. Workflow based applications
  4. Automatic Categorization subsystems
  5. Store maintenance for administrators

Let’s Code Now…

Fire up Visual Studio .NET, create a new C# Class Library project and name it, hmm… let’s call it “MyEventSink”.

In Solution Explorer, right-click the project name and choose Properties. On the Project Properties page, choose Configuration Properties, then Build, and set Register for COM Interop to True.

Now copy the following files to the MyEventSink bin directory:

  • exoledb.dll from exchange server bin directory (\program files\exchsrvr\bin)
  • cdoex.dll - \program files\common files\Microsoft Shared\CDO
  • msado15.dll - \Program Files\Common Files\System\ADO

Open the VS.NET Command Prompt, navigate to the MyEventSink bin folder, and create strong name keys for the above libraries. Key in the following commands:

> sn -k exoledb.key
> sn -k cdoex.key
> sn -k msado.key

We need to create interop assemblies for the above libraries, and for that we shall use the tlbimp tool. Key in the following commands to create the three interop assemblies.

tlbimp exoledb.dll /keyfile:exoledb.key /out:interop.exoledb.dll /namespace:CDO
tlbimp cdoex.dll /keyfile:cdoex.key /out:interop.cdoex.dll /namespace:CDO
tlbimp msado15.dll /keyfile:msado.key /out:interop.adodb.dll /namespace:ADODB

Copy these interop DLL files to the debug folder. Switch back to VS.NET and add references to the interop DLL files created above. Then modify the following attributes in AssemblyInfo.cs.

Under the General Information section, modify

[assembly: AssemblyTitle("MyEventSink")]

[assembly: AssemblyDescription("My Event Sink - Logu")]

In the version information section, create a new GUID and add

[assembly: Guid("44E6847A-0012-42af-A317-1E1A9F0C853D")]

[Tip: You can create a new GUID by clicking Tools->Create GUID]

In the signing information section, modify

[assembly: AssemblyDelaySign(false)]

[assembly: AssemblyKeyFile("MyEventSink.key")]

[assembly: AssemblyKeyName("MyEventSink")]

Now, Choose Project Properties and set the “Wrapper assembly key file” to MyEventSink.key and “Wrapper assembly Key Name” to “My Event Sink”

Start the VS.NET Command Prompt, change directory to your project directory, and create a key by keying in the following:

> sn -k MyEventSink.key

Switch back to the VS.NET IDE and rename class1.cs to something like “ExchEventSink.cs”, then double-click the .cs file to open it.

Add,

Modify the class definition to resemble the following,
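As a rough illustration only (this is not the listing from the download zip), a minimal sketch of an asynchronous sink class could look like the code below. The two interfaces are hypothetical stand-ins declared here so the sketch compiles on its own; in the real project they come from the tlbimp-generated interop.exoledb.dll. The method shape (event info, item URL, flags) is an assumption about how that wrapper exposes the events, and LogPath is a made-up setting. A sink hosted in COM+ would typically also derive from ServicedComponent, which is left out here.

```csharp
using System;
using System.IO;

// Hypothetical stand-ins for the interop types. In the real project these come
// from the tlbimp-generated interop.exoledb.dll and are not declared by hand.
public interface IExStoreEventInfo { }

public interface IExStoreAsyncEvents
{
    void OnSave(IExStoreEventInfo pEventInfo, string bstrURLItem, int lFlags);
    void OnDelete(IExStoreEventInfo pEventInfo, string bstrURLItem, int lFlags);
}

// Sketch of an asynchronous event sink that appends one line per event to a log file.
public class ExchEventSink : IExStoreAsyncEvents
{
    // Assumed log location; change it to suit your server.
    public string LogPath = Path.Combine(Path.GetTempPath(), "MyEventSink.log");

    public void OnSave(IExStoreEventInfo pEventInfo, string bstrURLItem, int lFlags)
    {
        WriteLog("OnSave", bstrURLItem, lFlags);
    }

    public void OnDelete(IExStoreEventInfo pEventInfo, string bstrURLItem, int lFlags)
    {
        WriteLog("OnDelete", bstrURLItem, lFlags);
    }

    // Appends a timestamped line naming the event, the item URL and the raw flags.
    private void WriteLog(string eventName, string itemUrl, int flags)
    {
        string line = DateTime.Now.ToString("s") + " " + eventName + " " + itemUrl + " flags=" + flags;
        File.AppendAllText(LogPath, line + Environment.NewLine);
    }
}
```

Because asynchronous events fire after the commit, a slow or failing log write in a sink like this would not hold up the store thread the way a synchronous sink would.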

If you look at the class definition, you will notice that we implement the IExStoreAsyncEvents interface, which declares the asynchronous event methods, namely OnSave and OnDelete. We shall implement them now; add the following to your code [check the attached zip file for more information]

In that code, we process an Exchange item in the OnSave method and write to a log file. This is a simple code example; modify it to suit your requirements.

Compile the class, and you have your event sink component ready. Now open Component Services, create a new empty application under COM+ Applications and name it “MyEventSink”. Then expand Components under MyEventSink and click “Import components that are already registered”.

Choose “MyEventSink.ExchEventSink” from the populated list.

Now the event sink component is registered on the server.

We are done with the development part. Now you can bind the component to any folder of the Exchange store. There are multiple ways to do this; I prefer the following:

RegEvent.vbs - I’ve attached this VBS file in the download zip. The script creates the event registration for the specified folder. The following command binds the event sink to my inbox folder,

Exchange Explorer – this is a tool you get with the Exchange SDK.

Alternatively, you can build your own event registration [that’s a separate article by itself :-) ]

At last, we are done… We have created our own managed Exchange store event sink. You can implement the synchronous events and the system events in the same way as we implemented the asynchronous events.

Logu Krishnan C#, Exchange Server, Microsoft

Future of ETL - Metadata driven

March 26th, 2009

Some time back, I was called in to design ETL for a mid-sized enterprise. I had to deal with various issues: multiple data sources, not-so-clean data, various data validations, data cleansing needs, a fixed time window within which the ETL job had to finish, changing requirements, and the list goes on. On top of that, I had very tight timelines. Drawing on my previous BI projects, in which I had built ETL with different tools (BO, SSIS), I came up with a list of goals that I thought should be part of any standard ETL implementation and had to be met here too. Here are some of the main goals:

  • Performance
  • Flexible enough to handle changing requirements
  • Least maintenance
  • Automation
  • Modularizing processing
  • Data quality
  • Recoverability of ETL job
  • Auditing
  • Logging
  • Notification
  • Ability to reprocess rejected records, files, or any other data source
  • Remote administration of ETL job

While going through this list, I realized that all these features were there in my earlier ETL implementations too. This led me to questions like “How am I going to improve on my earlier designs?” and “How can we take ETL to a new level?”. Before I set out to find answers, I asked myself a simple question: “Why?”. Why should I improve my ETL design? What is the need for this in the first place? I found the answer early: reduced development time together with increased quality results in reduced cost for customers. This can be a good selling point for the sales team too, which I will cover in my next blog.

Coming back to the original question of how to improve the ETL design, I started by comparing the different ETL tools that I have worked with in the past: how they work, how they handle metadata, and how they eventually run the job. I found that these tools are good enough to meet standard ETL requirements, but they do not make good use of metadata. Even where they use metadata in a few places, there is no intuitive interface available to it. Thus I got my first lead, and I decided to make use of metadata as much as possible. Hence, the quest for “Metadata driven ETL” starts….

Before I tell you what metadata driven ETL is, let me tell you what it is not. It does not generate the ETL program on the fly, as you might expect. Instead, metadata driven ETL uses an existing program that executes various tasks by dynamically reading parameters from a metadata datastore. This datastore can be an XML file, a separate database repository, or a few tables in the DW database.
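To make the idea of "an existing program reading its parameters from metadata" concrete, here is a minimal sketch for the XML flavour of the datastore. The element and attribute names (etl, task, name, order) and the EtlMetadata class are hypothetical, invented purely for illustration; a real datastore would carry many more parameters per task.

```csharp
using System;
using System.Collections.Generic;
using System.Xml.Linq;

public static class EtlMetadata
{
    // Reads <task name="..." order="..."/> elements under the root element and
    // returns the task names sorted by their "order" attribute.
    public static List<string> ReadTaskOrder(string xml)
    {
        XDocument doc = XDocument.Parse(xml);
        var tasks = new List<KeyValuePair<int, string>>();
        foreach (XElement task in doc.Root.Elements("task"))
        {
            int order = int.Parse((string)task.Attribute("order"));
            tasks.Add(new KeyValuePair<int, string>(order, (string)task.Attribute("name")));
        }

        // Sort by the numeric order so the file itself can list tasks in any sequence.
        tasks.Sort((a, b) => a.Key.CompareTo(b.Key));

        var names = new List<string>();
        foreach (KeyValuePair<int, string> t in tasks)
        {
            names.Add(t.Value);
        }
        return names;
    }
}
```

The fixed program then simply loops over the returned names and runs the matching step, so reordering or adding a task becomes a metadata change rather than a code change.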

WHAT IS METADATA: Any data that is used to execute a transform or operation is called metadata. For example, from a transformation perspective, in a slowly changing dimension (SCD) type II transformation, the target columns that are treated as business keys are nothing but metadata. Similarly, from an operations perspective, the list of database tables that must be cleared before extraction starts is metadata. Whether we use an ETL tool’s standard transforms or hand-code the ETL, we use metadata in one way or another.

USING METADATA:
How shall we use this metadata to our advantage? The best approach to start with metadata driven ETL is to:

  1. Identify the operations in ETL that can be automated. Certain operations carried out during ETL may involve multiple tables; usually they are put into a script or SQL and executed as one operation. Let’s look at an operation where we clear the staging tables before extraction starts. Usually staging tables are designed in one of the following ways:
    1. Create all staging tables in a separate database
    2. If the staging area is the same as the data mart, create a specific schema such as “STAGING” or prefix the table names with something like “STG_”
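With a naming convention such as the “STG_” prefix, the clearing operation can be driven by a list of table names instead of a hand-maintained script. The sketch below assumes the table names have already been read from the database catalog (for example from INFORMATION_SCHEMA.TABLES) and only shows the step that turns that list into statements; StagingCleaner and its method name are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class StagingCleaner
{
    // Builds one TRUNCATE statement for every table whose name starts with the
    // given staging prefix; all other tables are left alone.
    public static List<string> BuildTruncateStatements(IEnumerable<string> tableNames, string prefix)
    {
        var statements = new List<string>();
        foreach (string name in tableNames)
        {
            if (name.StartsWith(prefix, StringComparison.OrdinalIgnoreCase))
            {
                statements.Add("TRUNCATE TABLE " + name + ";");
            }
        }
        return statements;
    }
}
```

A new staging table is then picked up automatically the next time the job runs, as long as it follows the naming convention.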

Whichever way we go, the list of staging tables can be read from the database system tables (metadata). Using this metadata, the operation can be automated. Along similar lines, we can create specific metadata for other ETL operations. This makes ETL development easier and faster, with a lower probability of error. Here are other sample operations that can be made metadata driven:

  • Moving rejected records to rejected area
  • Reprocessing rejected records
  • Data cleansing: For all incoming data, duplicate records need to be deleted based on the unique keys identified. There isn’t a built-in transform for this in ETL tools, but it can be made metadata driven: the ETL reads the unique keys for each incoming business entity, then dynamically creates and executes the de-duplication statement. The developer just needs to specify the unique keys in the metadata datastore, which again brings down the development effort.
  • Notification: Upon completion of an ETL job, the ETL can read the notification metadata (from address, recipient addresses, mail server, subject, body) and send the mail, so this process can be made metadata driven too. We all do this already; what is different here is how we implement it.
  • Auditing and logging
  2. Identify the transformations that can be made metadata driven: Many transformations can be made metadata driven; we need to identify them and make them so. For example, let’s look at the slowly changing dimension type II, which is heavily used in ETL programs. In this transform, incoming records are compared on pre-defined key columns with the existing records in the data mart. If a matching record exists in the data mart, it is updated (for type II, the current version is closed and a new version row is added); otherwise the record is inserted. Normally this transform has to be coded separately for every SCD II. The transformation needs the following metadata:
    • Source and target table names.
    • Business key column names in the source table.
    • Key columns in the target table.

With tools like BO or SSIS, this transform is more or less hardcoded. To make it metadata driven, read the above-mentioned metadata and dynamically create a MERGE SQL statement (available in DB2 and SQL Server 2008) in a stored procedure. This stored procedure would be called from the ETL at the appropriate places, for all SCD-II transforms. This metadata driven SCD-II transform can also give better performance. Consider the case where the staging area is within the data warehouse database, or in a separate database on the same server. If an ETL tool’s transform is used, the data is processed in batches of some pre-defined size, and the ETL engine applies the transform and fires the appropriate SQL for every incoming row in the batch, which is definitely slower. The metadata driven transform, being a single SQL operation, processes all the data in bulk, and we get increased performance.
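As an illustration of building the MERGE statement from metadata, here is a minimal sketch written in C# rather than inside a stored procedure. It assumes SQL Server style MERGE syntax, assumes the non-key data columns have the same names in source and target, and produces only the plain update-or-insert form; a full type II load would additionally close the matched row and insert the new version. MergeSqlBuilder and its parameter names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class MergeSqlBuilder
{
    // Accept only plain identifiers, because the names are concatenated into SQL text.
    private static readonly Regex Identifier = new Regex(@"^[A-Za-z_][A-Za-z0-9_]*$");

    private static string Check(string name)
    {
        if (name == null || !Identifier.IsMatch(name))
            throw new ArgumentException("Invalid identifier: " + name);
        return name;
    }

    // Builds a MERGE statement from the metadata: table names, matching key
    // columns (source and target, pairwise) and the data columns to copy.
    public static string Build(string sourceTable, string targetTable,
        IList<string> sourceKeys, IList<string> targetKeys, IList<string> dataColumns)
    {
        if (sourceKeys.Count == 0 || sourceKeys.Count != targetKeys.Count)
            throw new ArgumentException("Key column lists must be non-empty and of equal length.");
        if (dataColumns.Count == 0)
            throw new ArgumentException("At least one data column is required.");

        var on = new List<string>();
        var insertColumns = new List<string>();
        var insertValues = new List<string>();
        for (int i = 0; i < sourceKeys.Count; i++)
        {
            on.Add("t." + Check(targetKeys[i]) + " = s." + Check(sourceKeys[i]));
            insertColumns.Add(targetKeys[i]);
            insertValues.Add("s." + sourceKeys[i]);
        }

        var set = new List<string>();
        foreach (string column in dataColumns)
        {
            set.Add("t." + Check(column) + " = s." + column);
            insertColumns.Add(column);
            insertValues.Add("s." + column);
        }

        return "MERGE INTO " + Check(targetTable) + " AS t USING " + Check(sourceTable) + " AS s"
            + " ON " + string.Join(" AND ", on.ToArray())
            + " WHEN MATCHED THEN UPDATE SET " + string.Join(", ", set.ToArray())
            + " WHEN NOT MATCHED THEN INSERT (" + string.Join(", ", insertColumns.ToArray()) + ")"
            + " VALUES (" + string.Join(", ", insertValues.ToArray()) + ");";
    }
}
```

The identifier check is there because dynamically built SQL is only as safe as the metadata feeding it; treating the metadata tables as trusted input without any validation would be a risky shortcut.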

IMPLEMENTING METADATA: When we use any ETL tool for transformations or for other purposes, we are in fact using metadata. The difference lies in how the metadata is implemented and used. How and where is this metadata stored? How do we access it at run time? Is there a single interface available to access it? When we use an ETL tool for a transformation, the metadata is stored in a proprietary format. We cannot simply go and change it directly; we need to use the ETL tool’s designer to access it. Thus, after the ETL program is deployed, if there is a change in some transform, we need to make the change through the designer and re-deploy the ETL program. If instead we design the ETL so that it uses our own metadata in our own way, we end up with a very good ETL architecture. This architecture would eventually evolve into a framework that can be reused across multiple ETL implementations, which brings down the estimates and thus the cost significantly. This would give us an added advantage over our competitors.

With this approach, you would get the following advantages from metadata driven ETL:

  • Increased performance
  • Reduced development time
  • Fewer errors
  • Easier and lower maintenance, which translates into cost savings for customers in the long term (added value)
  • Once the basic shell of metadata driven ETL is created, it reduces the learning curve for the team members.

I started with this mindset, and I was successful enough. Now I am in the process of refining the ETL architecture.

AT LAST: Creating metadata driven ETL in no way suggests that we should move away from ETL tools. ETL tools have their own strengths. We would simply be architecting the ETL in a different way to make it metadata driven. Metadata driven ETL also pays rich dividends during the maintenance phase, as it takes less effort and allows for quick deployment.

LOOKING FORWARD TO: Some of the above-mentioned metadata is already captured in the ETL tools available today, but they still need to evolve further. Until ETL tools support 100% metadata driven development, we as architects should design our ETL in a way that fills the gap. I do hope commercial ETL tool vendors are working on this and that we will soon have the next generation of ETL tools.

Jagdish Malani Architecture & Design, BI - Business Intelligence

IT: Competency Building

March 26th, 2009

IT organizations were doing well until some time back. The subprime crisis led to the fall of many banks in the US, and before we knew it, we were in a recession. Nobody knows when this recession will end, but we all hope it ends soon. Until it does, IT organizations face the typical problems of recession times, such as rising bench strength, increased costs, and lower revenues. To make matters worse, there is uncertainty about when the economy will turn around. Hoping that the recession ends soon, organizations should use this slowdown to their advantage and prepare themselves for the good times. The first thing organizations are trying to do is increase employee utilization by training employees effectively on newer, in-demand technologies. They are reluctant to hire at any level, so they are focusing on building competency internally. In the light of cut-throat competition, competency building has to be aligned with the organization’s strategy, derived from sales planning and operations management.

Let’s see how organizations should go about competency building . . . .

  1. IDENTIFY THE GOALS: A competency building exercise needs some retrospection before organizations take the first step, because the exercise would have been done during good times as well. Organizations must therefore examine their earlier efforts meticulously and find out the success rate and the impact on various projects; following the same approach would otherwise yield the same results. If an organization has been struggling with competency building since the good times, that is an indicator that something wasn’t done right earlier. During a slowdown it is critically important to take the right step, or organizations risk ending up in the wrong direction altogether. Going forward, when the economy picks up, organizations that are strategic will have the edge over their competitors. Organizations must approach competency building as follows:

    1. Identify the areas in which the competency needs to be built: People who drive sales strategy (typically senior management) and people who are responsible for operations need to come together and align their goals. The mistake a few organizations make is that they run competency building in a silo. That way, they are never able to build the competency required for driving sales growth.
    2. Define the extent of competency building: They must work out the approximate target numbers at each level of the pyramid. These numbers must be tied to the sales targets, both in the short term and in the long term.
    3. Expectation Management: The next most important factor in successful competency building is having the right perception. Training employees in a new technology does not make them experts in one go; instead, they get a quick and timely head start in the new area. Every organization, while hiring, looks for a specified number of years of experience in a particular stream; this is true across all verticals and at all levels. If it weren’t, you would have seen advertisements like “We need 5 smart people with or without prior experience for all levels”. Having said that, this does not mean competency cannot be built. It can be built if organizations define a good strategy and ensure that the strategy meets the overall organizational goals. For example, management must provide for specific follow-up trainings and live project experience (even an internal project would suffice) and see the exercise through.
       
  2. ACHIEVE THE GOALS: Once organizations have figured out their goals clearly as stated above, they must ACT to achieve them. Organizations can run different specific programs to this end. One way is to run training programs in the areas in which they need to build up competency. Another is to use employees available on the bench to develop internal tools that improve the organization’s internal efficiency. This should be done by remaining focused and ensuring that these programs are aligned with the overall goals. Organizations must work to ensure the following:

    1. Focused trainings: The training programs should be very focused. Organizations must do their due diligence with a gap analysis. They should consider the business lines they are in and also use this opportunity to expand their horizons into different areas. This gap analysis can be done by:
      1. Analyzing earlier projects: Organizations must spend time analyzing earlier projects from different perspectives. They must find out what went wrong and which specific areas they must improve upon.
      2. Analyzing earlier sales deals: Organizations must also analyze the pre-sales deals that did not materialize. Some of the reasons could have been no fitting resources, no prior experience, poor estimates, or high cost. Focused training would help organizations fill these gaps.
    2. Role based trainings: Next, organizations must evaluate their employees and get buy-in from the employees being trained. Organizations must also ensure that these trainings are role-based. This means that if a lead-level employee is trained on a new technology, the organization must figure out whether the same employee would be able to play a similar lead role in that new technology. Playing a specific role in any technology needs some prior experience, and this is especially important for senior roles. This holds true not only during recession times but also during good times.
    3. Development of internal tools: Developing internal tools is good for any organization in various ways, such as:
      1. Building competency
      2. Managing operations effectively
      3. Evaluating a technology
      4. Helping in the sales pitch

    But before organizations jump into developing internal tools, they must take care of a few things to be effective:

    1. Selection of the right tools to be developed: Organizations must ensure that they invest in developing the right set of internal tools, the ones that are really needed. When approached, every manager or head of department would have a long list of internal tools they would like developed for them. After all, they are not paying for it. If not managed properly, this leads to a bigger mess later (multiple applications using different technologies with no consolidated data). Organizations must take a holistic approach in deciding which tools are to be developed, which technology is to be used, and their priorities. Organizations must not use a technology just for the sake of competency building. This would help them coordinate training programs effectively and with greater success.
    2. Execution model: After identifying the tool to be developed and the team that is going to work on it, organizations have to make sure the project is executed in the right model, as if it were a live project for a customer. Organizations must not take any shortcuts here or swap roles and responsibilities within the development team executing the project. For instance, project managers must not be collecting requirements if they aren’t supposed to play that role in live projects. The right set of people should be given the right set of roles.
  3. PERIODIC EVALUATION: Competency building efforts should not go on for a long time without a mechanism to measure their success, which must be measured periodically. This is good for both organizations and employees. Organizations would be able to deploy new people effectively. Employees who are not convinced within themselves would not be giving their 100%.

IT organizations with an effective strategy, strong leadership and vision would be able to build the right competency :-) . . .

Jagdish Malani Project Management

WMI and .NET Performance Hiccups : Win32 – the Savior

March 26th, 2009

… or read the title as “How could you speed up your software by 90%”. There is always a way out there for tuning performance… This blog is about one such instance, where I dumped WMI (Windows Management Instrumentation) and turned back to Win32 just for the performance gains.

Some time back I was involved in a project with lots of hardware interfaces, such as interacting with huge SCSI devices, parallel ports, digital imaging et al. Though I was a fan of Win32, I was thrilled at how WMI drastically reduces development time, primarily because of its mature APIs. I was thinking about how difficult it would be for beginner-to-intermediate programmers to do system-level programming in C or C++ to interact with hardware, network devices, communication devices et al… and how error-prone that code is… WMI is definitely the better choice in this case.

WMI has saved us development time on many occasions. But there were moments during product testing when we ran into lots of performance hits. When we probed for the reasons, it was really surprising to see what caused the performance hiccups. Sometimes it was the development team’s oversight; sometimes poor WMI was the culprit.

Here are two instances where changes gave my product a considerable performance boost.

Handling System.Drawing.Image Performance using C#

The product I was working on previously had to load and edit JPEG images of digital-camera quality, which means the image size would be greater than 3-5 MB. My customer had always been complaining about the image loading speed. Everything was fine while loading a 1-2 MB image, but things changed drastically whenever an image of 3 MB or more was loaded: it took around 1-2 seconds to load an image. Two seconds is fine for normal applications, but not for people who work with thousands of images per day; the Windows Form would almost hang before the image loaded. After a considerable amount of research, I found that the real culprit causing the bottleneck was the call to “System.Drawing.Image.FromFile()”.

I happened to hit a KB article which confirmed this issue and offered a hotfix[!!], which updates
1. System.Windows.Forms.dll
2. System.Design.dll
3. System.Drawing.dll

In fact, there was an interesting new overload in System.Drawing: System.Drawing.Image.FromStream(Stream stream, bool useEmbeddedColorManagement, bool validateImageData). The validateImageData flag was the real cause of the slowdown: when it is set, the content of the image file is validated before the image is loaded. So as the size of the image increased, the loading time increased sharply.
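Where that overload is available, skipping the validation can be sketched as below. This is only an illustration of the FromStream overload, not the approach this post finally took, and FastImageLoader is a hypothetical name. The file is read into memory first because GDI+ expects the stream to stay open for the lifetime of the image.

```csharp
using System.Drawing;
using System.IO;

public static class FastImageLoader
{
    // Loads an image without the up-front validation pass.
    public static Image Load(string path)
    {
        // Read the whole file so the backing stream can outlive this method
        // without keeping the file itself locked.
        byte[] bytes = File.ReadAllBytes(path);
        MemoryStream stream = new MemoryStream(bytes);

        // useEmbeddedColorManagement: false, validateImageData: false
        return Image.FromStream(stream, false, false);
    }
}
```

Skipping validation trades safety for speed, so a corrupt file may only fail later when the image is drawn or edited; it is worth measuring the gain on your own images before relying on it.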

So I had to look for an alternative. The obvious choice was Win32, and here is the method equivalent to Image.FromFile():

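// Requires: using System; using System.Drawing; using System.IO;
//           using System.Reflection; using System.Runtime.InteropServices;
// GDI+ flat API import (declaration assumed; it is not shown in the original listing).
[DllImport("gdiplus.dll", CharSet = CharSet.Unicode)]
private static extern int GdipLoadImageFromFile(string filename, out IntPtr image);
// Assumed definition: Bitmap carries the non-public static FromGDIplus factory in the .NET Framework.
private static readonly Type imageType = typeof(Bitmap);

public static Image Win32ImageFromFile(string filename)
{
    filename = Path.GetFullPath(filename);
    IntPtr loadingImage = IntPtr.Zero;

    // A non-zero GDI+ status code means the load failed.
    if (GdipLoadImageFromFile(filename, out loadingImage) != 0)
    {
        throw new Exception("Oops! GDI Exception.");
    }
    // Wrap the native GDI+ handle in a managed Bitmap through the internal factory method.
    return (Bitmap)imageType.InvokeMember("FromGDIplus",
        BindingFlags.NonPublic | BindingFlags.Static | BindingFlags.InvokeMethod,
        null, null, new object[] { loadingImage });
}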
 

And now, when I used this new method… voila! The images started loading at least 90% faster and took less than 10 milliseconds to load! Wow! That was really great and amazing.

Handling Performance issues in Win32_LogicalDisk using C#

Here is another instance where I dumped WMI and used Win32 instead.
Here is the simple WMI code that lists the removable drives of the computer.

#region WMI code to retrieve drives
// Requires a reference to System.Management.dll and:
// using System.Collections.Specialized; using System.Management;

StringCollection driveCollection = new StringCollection();
ManagementClass driveClass = new ManagementClass("Win32_LogicalDisk");
try
{
    foreach (ManagementObject drv in driveClass.GetInstances())
    {
        // Keep only drives that belong to a removable storage device (DriveType 2).
        if ((drv["Description"].ToString() == "Removable Disk") && (drv["DriveType"].ToString() == "2"))
        {
            driveCollection.Add(drv["Caption"].ToString());
        }
    }
}
finally
{
    // Release the underlying WMI resources whether or not the enumeration succeeded.
    driveClass.Dispose();
}
#endregion

This code would take a minimum of 4-5 seconds to enumerate my disk drives. Another problem is that the floppy drive is physically checked every time [but… why!!], which further slows down execution. None of our clients would accept this in a feature that is used frequently. There was no way to solve this issue except to look for a Win32 method, and here is the alternative…

Using Win32

#region Win32 code to retrieve drives
// Requires: using System.Collections.Specialized; using System.Runtime.InteropServices;

// Declared at class level: P/Invoke signature for GetDriveType in kernel32.dll.
[DllImport("kernel32.dll", SetLastError = true)]
static extern uint GetDriveType(string lpRootPathName);

static StringCollection GetRemovableDrives()
{
    StringCollection driveCollection = new StringCollection();

    /* Retrieves all the mounted drives on the computer. */
    string[] drives = System.Environment.GetLogicalDrives();

    foreach (string drive in drives)
    {
        /* Call Win32 GetDriveType to determine the drive type, based on the drive letter. */
        uint driveType = GetDriveType(drive);

        // 2 = DRIVE_REMOVABLE, 5 = DRIVE_CDROM
        if (driveType == 2 || driveType == 5)
        {
            driveCollection.Add(drive);
        }
    }
    return driveCollection;
}
#endregion

This code executed in less than 100 milliseconds!!! That was an incredible performance boost.
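For comparison, the same filter can be sketched without P/Invoke by using the System.IO.DriveInfo class that ships with .NET 2.0 and later. This is an alternative added for illustration, not something measured in this project, and RemovableDrives is a hypothetical name; how it performs against the P/Invoke version would need to be timed on the target machines.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class RemovableDrives
{
    // Returns the root names of removable and CD-ROM drives, mirroring the
    // 2 (removable) and 5 (CD-ROM) checks of the GetDriveType version.
    public static List<string> List()
    {
        var result = new List<string>();
        foreach (DriveInfo drive in DriveInfo.GetDrives())
        {
            if (drive.DriveType == DriveType.Removable || drive.DriveType == DriveType.CDRom)
            {
                result.Add(drive.Name);
            }
        }
        return result;
    }
}
```

The named enum values make the intent of the filter easier to read than the bare numbers 2 and 5.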
Do you think using Win32 as an alternative is insane? Have you faced such real-world problems? Would you still use WMI? Talk back!

Logu Krishnan C#, Performance

Can Project Manager be a Role Model?

March 22nd, 2009

Let me start this blog by asking a few questions:

  • Do you think the Project Manager is the role model for all the team members?
    • How often does this happen?
  • Is the PM really respected by his/her team members?

Before I begin with the opportunities a PM (Project Manager) has to become a role model, let’s quickly understand what this role truly demands. This is important because there is a myth among project team members that PMs are mere coordinators who are usually there to police them and take action. There might be instances where managers leave such an impression.

So, who is a Project Manager?
The Project Manager is the person who takes ultimate responsibility for, and guarantees, that the desired result is achieved on time and within budget. A PM has overall responsibility for the successful planning, execution, monitoring, control and closure of a project.

In this blog, I want to highlight how easily a PM can get confused between the process and the goal. In such a scenario, PMs usually become inclined towards quantifying things that do not add any value. Not sure of what else to do, they tend to occupy their time with less important activities such as metrics, spreadsheets or reports. This makes team members believe that the PM is not adding any value, and they can very easily carry this opinion of PMs into all of their future projects. By definition, a project involves ‘progressive elaboration’, and a PM getting stuck with such non-value-added activities widens the gap between the project and the PM. The PM, for his/her part, is under the false belief that if the project team just pursues the processes to perfection and follows the checklists, it is bound to succeed in the project.

A good Project Manager does not get carried away with such a stringent web of procedures; instead, he/she is flexible enough to tweak the processes according to the project’s needs. A PM should always keep an eye on the business goal, which is achieved by a group of people (the team members) accomplishing a set of tasks. To win the confidence and respect of the team members, the PM has to educate them on the various roles in the project team, including the PM’s own role. As the project progresses, the PM has to measure each role against its set objectives, and not only provide feedback but also show the direction to conquer the gray areas. It is important that managers spend enough time with each individual in the team and help them cultivate a desire for achievement. Getting the best possible performance from each team member is the responsibility of the manager.

In a nutshell, a PM must get involved in the project by understanding not only the business goal but also the functionality behind the requirements and the high-level design architecture, and, most importantly, by expediting quality development through the timely removal of obstacles and by providing direction that leads to the right solutions at the right time.

Good PMs or leaders can thus earn a special kind of respect from their team members (developers, testers, architects…). A PM should be able to enable thinking, strategy and leadership that positively impact the team. I am sure all of you would now agree that a PM can easily be a role model, though this also depends on his/her personality.

Please feel free to post your comments.

Shrinath Inamdar Project Management

Mobile Everywhere

March 7th, 2009

Mobile Application Architecture Series – Design Considerations

Mobile applications bring various inherent challenges which need to be considered while designing on any platform (Microsoft / Symbian / OS X etc). The following section covers design guidelines for mobile application development. Technology stack considerations will be next in this series of blogs.

A mobile application will normally be a multi-layered application comprising user experience, business and data layers. When developing a mobile application, you may choose to develop a Thin-Client or Rich-Client. If you are building a rich client, the business and data layers will be located on the device itself. In Thin-Client, the business and data layers will be located on the server.

Design Considerations:

Follow these design guidelines to ensure that the application meets your requirements and performs efficiently in the mobile world.

1)      Thin-Client or Rich-Client or RIA: If your application requires local processing and must work in an occasionally connected scenario, consider designing a Rich-Client (smart client). Keep in mind that a Rich-Client application will be more complex to install and maintain. If the application can depend on server processing and will always be fully connected, consider designing a Thin-Client. If your application has a rich user interface, needs only limited access to local resources, and must be portable to other platforms, then an RIA would be a good choice (e.g. ETrade).

2)      Device Types to Support: While choosing which device types to support, consider screen size, screen resolution (DPI), CPU performance characteristics, available memory and storage space (inbuilt & external) and the availability of development tools. On top of these, we need to consider the user requirements and organizational constraints / guidelines. Additional components such as GPS and an integrated camera may influence your choice of application type and device type.

3)      Connectivity: The real-time connectivity requirement between the mobile device and the gateway will strongly influence the decision of whether to go for Thin-Client, Rich-Client or RIA. If an application has to work over an intermittent network connection, our design approach should properly handle caching, state management and data access mechanisms.

4)      UI design considerations: Mobile devices require a simple UI in order to work within the constraints imposed by the device hardware, such as memory, battery life, different screen sizes and orientations, and network bandwidth.

5)      Layered Architecture: Depending on the application type, multiple layers may be located on the device. Use the concept of Layers to maximize the separation of concerns, and to improve reuse and maintainability for your mobile application. However, aim to achieve the smallest footprint on the device by simplifying your design compared to a desktop or web application.

6)      Security: Designing an effective authentication and authorization strategy is important for the security and reliability of your application. Mobile devices are usually designed to be single-user devices and normally lack basic user profile and security tracking beyond just a single password. Mobile applications can also be especially challenging due to connectivity interruptions.

7)      Caching: Use caching to improve the performance and responsiveness of your application, and to support operation when there is no network connection. Use caching to optimize reference data lookups, and to avoid network round trips. When deciding what data to cache, consider the limited resources of the device; you will have less memory available than a PC.

8)       Communication: Device communication includes wireless communication (over the air) and wired communication with a host PC, as well as more specialized communication such as Bluetooth or Infrared Data Association (IrDA).  When communicating over the air, consider data security to protect sensitive data from theft or tampering.

9)      Performance Considerations:

a.       Design configurable options to allow the maximum use of device capabilities. Allow users to turn off features they do not require in order to save power.

b.      To optimize for device resource constraints, consider using lazy initialization.

c.       Optimize the application to use the minimum amount of memory. When memory is low, the system may release cached intermediate language (IL) code to reduce its own memory footprint, return to interpreted mode, and thus slow overall execution.

d.      Consider using programming shortcuts instead of strictly following programming best practices that can inflate the code size and memory consumption. This decision should be a thoughtful one, as it may contradict the design principles of OOP and hurt maintainability.

e.      Consider power consumption when using the device CPU, wireless communication, or screen while on battery power. We should balance power consumption with performance.
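The caching guideline (7) and the lazy initialization suggestion (9b) above can be sketched together in C#. This is only a minimal illustration under stated assumptions: the class name ReferenceDataCache, the fetchFromServer delegate and the oldest-first eviction policy are hypothetical choices made for this sketch, not part of any mobile framework, and the class is not thread-safe.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical reference-data cache for a mobile client (illustration only).
public class ReferenceDataCache
{
    private readonly int capacity;
    private readonly Func<string, string> fetchFromServer; // stands in for a network call
    private Dictionary<string, string> entries;            // allocated on first use
    private Queue<string> insertionOrder;                   // tracks the oldest entry

    public ReferenceDataCache(int capacity, Func<string, string> fetchFromServer)
    {
        if (capacity < 1) throw new ArgumentOutOfRangeException("capacity");
        this.capacity = capacity;
        this.fetchFromServer = fetchFromServer;
    }

    // False until the first lookup: no memory is spent on an unused cache.
    public bool IsInitialized { get { return entries != null; } }

    public string Get(string key)
    {
        if (entries == null)
        {
            // Lazy initialization: create the storage only when it is needed.
            entries = new Dictionary<string, string>();
            insertionOrder = new Queue<string>();
        }

        string value;
        if (entries.TryGetValue(key, out value))
            return value; // served locally, no network round trip

        value = fetchFromServer(key);
        if (entries.Count >= capacity)
            entries.Remove(insertionOrder.Dequeue()); // evict the oldest entry
        entries[key] = value;
        insertionOrder.Enqueue(key);
        return value;
    }
}

public static class CacheDemo
{
    public static void Main()
    {
        int roundTrips = 0;
        var cache = new ReferenceDataCache(2, key => { roundTrips++; return "value-of-" + key; });
        cache.Get("country:IN");
        cache.Get("country:IN"); // second call is answered from the cache
        Console.WriteLine("Network round trips: " + roundTrips);
    }
}
```

The small fixed capacity reflects the point made in guideline 7: a device has far less memory than a PC, so the cache should be bounded rather than allowed to grow freely.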

In my view, the considerations listed above are only a subset of the complete list, but I have tried to address the key areas. One basic design approach for most mobile applications is to avoid BDUF (Big Design Up Front): let the design evolve over time and avoid making a large design effort prematurely.

Sreenivasa Rao Pilla Architecture & Design, Mobile Architectures

ETL Design Pattern : E-LT-L

March 6th, 2009

I was ramping up on C# to create a BI Framework when I hit upon the term “Design Patterns”. I decided to go through only a few patterns, as I did not have the patience to complete the book and assumed I would not need the rest. By the end of the day, I was able to implement a couple of patterns in my BI Framework application. During this time, I wondered whether there are any pre-defined design patterns for Data Warehouses, ETL, Cubes, and Universes. My quest begins… In this blog, I will focus on ETL alone.

I started looking back at my projects to analyze whether I had used any design patterns, or whether I had hit upon recurring problems that could have been solved with alternate designs. What I found was quite interesting. I had been using the best practices defined for each component all along. Then why were we slogging all the time? What were the problems we were facing every day? Now was the right time! I decided to dive deep into each component to find out exactly what I could have done differently during design to make my life easier. And that is where I came up with an ETL design pattern: E-LT-L.

Before I explain what E-LT-L means, let’s look at ETL first. ETL stands for Extraction, Transformation and Loading. In ETL we extract data from multiple data sources and transform the incoming data into a format compatible with the data warehouse structure, followed by loading into data marts. Usually companies use commercial tools like SSIS, BO Data Integrator Designer, etc. The developers then make full use of these tools and end up using most of the functionality they provide. To an extent this looks right too, as the companies have made an investment for this purpose.

Finally, when companies evaluate their ROI, the results are surprising. Despite following industry standards, using commercial ETL tools and training their development teams, the results do not look good. A new set of problems has come up: performance problems, steep learning curves, fixing parts that are not broken, buying additional hardware, etc. From the ETL perspective, even though most companies have a designated job server, they do not get good performance. After a while in production, when their data marts grow or the volume of incoming data increases, the performance of the ETL jobs takes a hit. To resolve these issues, many things are done: increasing the configuration of the job server, changing the database structure, using bulk loading options (a favorite choice for techies), splitting the jobs, pushing work such as summarization over to weekends, etc. This makes the system more complex than it should be, which has a direct impact on overall IT cost.

So exactly what went wrong?  

This is where a new pattern E-LT-L comes into existence. Most of the recurring problems can be resolved using this design pattern.  

E-LT-L stands for Extraction, Loading and Transformation, and Loading. It suggests that, once the data is extracted, instead of applying transformations (T) in the staging area, we load this data into the data mart and then apply the transformations there (LT). Since the incoming (raw) data is already in the data mart, it makes more sense to use database objects (stored procedures) for transformations instead of the row-based transformations available in ETL tools. Using database-based transformations would resolve most of the problems.

Effectively, this pattern calls for BULK loading and for transformations at the correct place, without moving huge amounts of data around.

This may sound strange, and many people would want to debate it. To prove the point, I will present a few scenarios and let you decide what is good for your implementation.

Scenario 1: Let’s assume you have 1 million incoming records that each need a lookup against, say, customers. No matter what tool you use and how much you configure it, it has to run some SQL against the customer master, which is typically huge in many DW installations, to get the customer code. This becomes a bottleneck, as the SQL is executed many times. To add to the woes, the “customer code” column value has to travel from the database server to the job server, where it is stored in a placeholder (variable) in the incoming row. If you instead code the lookup transformation in a stored procedure, a single SQL statement can update all the rows by simply joining the staging table with the customer master table. It is very hard for a row-by-row lookup to beat the performance of this one statement.
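The difference between the two approaches in Scenario 1 can be illustrated with a small in-memory analogy in C#. This is not ETL tool code or a stored procedure: the lists stand in for the staging and customer master tables, the type and member names (StagingRow, Customer, LookupSketch) are hypothetical, and the returned counter simply stands in for the number of SQL statements sent to the database.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class StagingRow { public string CustomerName; public string CustomerCode; }
public class Customer { public string Name; public string Code; }

public static class LookupSketch
{
    // Row-by-row: one lookup against the customer master per incoming row.
    public static int RowByRow(List<StagingRow> staging, List<Customer> master)
    {
        int statements = 0;
        foreach (var row in staging)
        {
            statements++; // stands in for one SQL round trip to the database
            var match = master.FirstOrDefault(c => c.Name == row.CustomerName);
            row.CustomerCode = match == null ? null : match.Code;
        }
        return statements;
    }

    // Set-based: a single join of staging with the master, comparable to one
    // UPDATE statement that joins the two tables inside the database.
    public static int SetBased(List<StagingRow> staging, List<Customer> master)
    {
        var joined = from row in staging
                     join c in master on row.CustomerName equals c.Name
                     select new { Row = row, c.Code };
        foreach (var pair in joined)
            pair.Row.CustomerCode = pair.Code;
        return 1; // one statement, however many rows there are
    }
}

public static class LookupDemo
{
    public static void Main()
    {
        var master = new List<Customer>
        {
            new Customer { Name = "Acme", Code = "C001" },
            new Customer { Name = "Globex", Code = "C002" }
        };
        var staging = new List<StagingRow>
        {
            new StagingRow { CustomerName = "Acme" },
            new StagingRow { CustomerName = "Globex" },
            new StagingRow { CustomerName = "Acme" }
        };
        Console.WriteLine("Row-by-row statements: " + LookupSketch.RowByRow(staging, master));
        Console.WriteLine("Set-based statements:  " + LookupSketch.SetBased(staging, master));
    }
}
```

In a real warehouse the set-based version would live in the database, so the customer codes would never have to travel to the job server at all.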

Scenario 2: We have an incoming product master for a large retail chain. We need to implement a slowly changing dimension of type 2 here (insert new records and, when an existing record has changed, keep the old version as history and add a new current version). Any ETL tool would implement this transformation row by row, and it can get painfully slow. This transform can easily be developed in a stored procedure using the MERGE SQL statement. More experienced developers can make it metadata driven.

To summarize, the transformations done using stored procedures help as follows:

  • Data does not move between database servers and job servers, so we get a good boost in performance.
  • The data is processed in bulk.
  • The database engine (Oracle, SQL Server) is designed to do this job more efficiently than any third-party ETL tool can.
  • It is always easier to fix/debug a stored procedure than a transform in an ETL tool.
  • Developers face a smaller learning curve, since they need to ramp up on less of the ETL tool.
  • Deployment becomes simpler; deploying ETL packages is surely a problem in ETL tools and takes a lot of effort.

I am sure many of you are now wondering why we should use a commercial ETL tool at all. These tools have their own significance. We would need a commercial ETL tool to achieve the other goals of ETL, such as:

  • Defining a workflow where we decide upon what tasks shall execute and in what order. What to do if some task fails.
  • Executing tasks in parallel. 
  • Even though a stored procedure is created to transform some data, it needs to be called from an appropriate place in ETL.
  • Use the logging/notifications capabilities of ETL tool, which usually are very efficient and simple. 
  • Use the scheduling and other features readily available in commercial ETL tools 

What the E-LT-L pattern addresses is the design of the ETL architecture. With a good ETL tool and a good ETL design, we can write efficient and manageable jobs.

If you have any questions or comments, you can reach me at jagdishm@aditi.com.

Jagdish Malani Architecture & Design, BI - Business Intelligence, Performance

Effective Project Management - Setting Right Expectation at Right Time

March 6th, 2009

Drawing on my experience as a Project Manager, in this article I am trying to collate the typical project management best practices I have learnt and actually practice. This may not be a proven or standard process, but it is what I ended up doing to ensure success in one of my recent projects.

With Agile software development, one tends to associate the keyword ‘ad hoc’. But the fact is that whatever the type of project (fixed, waterfall, spiral), agility exists and one needs to adapt and act smartly.

Recently, I completed a fixed-price project of 3 months’ duration that had its high-level requirements already stated. But believe me, this was as agile as any other project, with requirements changing until the end. This project had all the phases of project management: Initiation, Planning, Execution, Monitoring & Control, and Closure. These are typically called Process Groups.

How can one assure that every project they execute is successful? What are these key project management skills that ensure its success?

Let’s discuss one of the key concepts that makes a project succeed or fail. Surprised that it is just one area? Well, as per my experience and knowledge of project management, this is the key:

“Setting the right expectation with all the stakeholders at each phase of the project and at the right time”

What does this mean? Whether it is the various knowledge areas or the process groups, it is all about setting the right expectation at the right time with the respective stakeholder.

Following are the key areas, in my order of priority, where the right expectation needs to be set:

Expectation 1: Project Goals – This typically covers all the processes, their inputs and outputs. So the key point is: with what quality do you want to achieve this, and can you define a metric around each objective/goal that needs to be achieved?

Expectation 2: Team members’ objective setting – These project metrics or goals can be achieved if one ties them to the objectives set for each team member (developers, testers…) in the project.

Once these are set, the next step is to track, monitor and control them. This can be done by measuring the metrics at frequent intervals, depending upon the milestones and the length of the project. Based on this information, one needs to take the necessary corrective and preventive actions. The end result is that you re-prioritize and reset or rework the objectives. Some other key areas where this concept must be applied are:

Expectation 3: Setting Customer expectation – Whether it is timely status reporting or getting the right information from the Client, it again boils down to setting expectations with the Client. This should be backed by supporting data, reviews and feedback to ensure all stakeholders have a buy-in.

Expectation 4: Checking project progress at regular intervals (at milestones) and re-setting expectations – This is the key piece to ensure that what was initially defined in scope is the same as what is achieved. Getting this validated at frequent intervals prevents last-minute surprises and project failures.

I hope the message of this article is very clear: irrespective of the type of project, given this agile nature, “define simple measurable steps, apply, measure and control”. You will find various terms for this same principle, one of them being PDCA, also known as the Deming cycle.

PLAN: Design or revise business process objectives to improve results
DO: Implement the plan and measure its performance
CHECK: Assess the measurements and report the results to decision makers
ACT: Decide upon the changes required to improve the overall process and implement the corrective and preventive action

I hope this article has given some basic insight into ‘effective project management’ and into how to apply simple basic rules to ensure a project’s success.

Please post back with your thoughts and your own experience. We can debate on them.

Written by,
Shrinath Inamdar

Shrinath Inamdar Project Management

Back to Basics: Performance Killer Code – Unaligned Memory in 32-Bit for C# Struct

March 5th, 2009

Recently I was analyzing the performance of a .NET application which had lots of structs defined in it, and happened to hit a strange reality: the unaligned memory problem! I was running a profiler, and found that the memory allocated for a few structs was much larger than it should normally be (based on my own math). When I probed further, there was an interesting discovery. Read on…

Let’s get back to basics…
Alright here is a little head spinner… What is the difference between the following structures?


struct BadStructure
{
    char c1;
    int i;
    char c2;
}

struct GoodStructure
{
    int i;
    char c1;
    char c2;
}

Nothing much, except the jumbled type declarations… Huh?

Fine, now let’s look at the size of these structures.

The size of BadStructure in:
.NET Framework 3.5: Managed sizeof = 12 bytes, Marshal.SizeOf = 12 bytes
The size of GoodStructure in:
.NET Framework 3.5: Managed sizeof = 8 bytes, Marshal.SizeOf = 8 bytes
[Note: in managed code, size of int = 4 bytes, char = 2 bytes]
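The Marshal.SizeOf figures above can be checked with a small console program. This is a minimal sketch: the fields are made public only to keep the compiler from warning about unused fields, and SizeDemo and MarshaledSize are names chosen for this illustration. Marshal.SizeOf reports the size of the marshaled (unmanaged) layout; the managed sizeof operator would need an unsafe context for user-defined structs and is left out here.

```csharp
using System;
using System.Runtime.InteropServices;

// Same field order as BadStructure in the article: char, int, char.
public struct BadStructure
{
    public char c1;
    public int i;
    public char c2;
}

// Same field order as GoodStructure: the int first, then the two chars together.
public struct GoodStructure
{
    public int i;
    public char c1;
    public char c2;
}

public static class SizeDemo
{
    // Size in bytes of the marshaled layout of T.
    public static int MarshaledSize<T>() where T : struct
    {
        return Marshal.SizeOf(typeof(T));
    }

    public static void Main()
    {
        Console.WriteLine("BadStructure:  " + MarshaledSize<BadStructure>() + " bytes");
        Console.WriteLine("GoodStructure: " + MarshaledSize<GoodStructure>() + " bytes");
    }
}
```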
The reason behind these differences is “BYTE ALIGNMENT”. As with the default packing in unmanaged C++, integers are laid out on four-byte boundaries, so while the first character uses two bytes (a char in managed code is a Unicode character, thus occupying two bytes), the integer moves up to the next 4-byte boundary, and the second character uses the subsequent 2 bytes. The resulting structure is 12 bytes when measured with Marshal.SizeOf. 32-bit microprocessors typically organize memory as shown below.
Address    Byte0    Byte1    Byte2    Byte3
0x1000
0x1004     A0       A1       A2       A3
0x1008
0x100C              B0       B1       B2
0x1010     B3

Many processor architectures cannot read multi-byte data from misaligned addresses, and those that can are less efficient when the data starts at an address that is not divisible by four.

Memory is accessed by performing 32-bit bus cycles. These bus cycles can, however, only be performed at addresses that are divisible by 4. So, for efficiency, compilers add so-called pad bytes. The reasons for not permitting misaligned long word reads and writes are not difficult to see. For example, an aligned long word A at address 0x1004 would be stored as bytes A0, A1, A2 and A3.

Thus the microprocessor can read the complete long word in a single bus cycle. If the same microprocessor now attempts to access a long word at address 0x100D, it will have to read bytes B0, B1, B2 and B3. Notice that this read cannot be performed in a single 32-bit bus cycle. The microprocessor will have to issue two different reads, at addresses 0x100C and 0x1010, to read the complete long word. Thus it takes twice the time to read a misaligned long word.

The following byte padding rules will generally work with most 32-bit processors.

a. Single-byte numbers can be aligned at any address
b. Two-byte numbers should be aligned to a two-byte boundary
c. Four-byte numbers should be aligned to a four-byte boundary

This is the cause of the difference.
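To make the arithmetic concrete, the padding rules can be applied by hand or with a few lines of C#. The sketch below is a simplified model, not the CLR's actual layout algorithm: it assumes every field is aligned to its own size (1, 2 or 4 bytes) and that the total is rounded up to the largest field alignment. LayoutSketch, AlignUp and PaddedSize are names invented for this illustration.

```csharp
using System;

public static class LayoutSketch
{
    // Rounds offset up to the next multiple of alignment.
    public static int AlignUp(int offset, int alignment)
    {
        return (offset + alignment - 1) / alignment * alignment;
    }

    // Padded size of a struct whose fields have the given sizes, in declaration order.
    public static int PaddedSize(params int[] fieldSizes)
    {
        int offset = 0;
        int maxAlignment = 1;
        foreach (int size in fieldSizes)
        {
            offset = AlignUp(offset, size); // pad bytes inserted before the field
            offset += size;                 // the field itself
            maxAlignment = Math.Max(maxAlignment, size);
        }
        return AlignUp(offset, maxAlignment); // trailing pad bytes
    }

    public static void Main()
    {
        // char (2), int (4), char (2): 2 + 2 pad + 4 + 2 + 2 pad
        Console.WriteLine("BadStructure order:  " + PaddedSize(2, 4, 2));
        // int (4), char (2), char (2): no padding needed
        Console.WriteLine("GoodStructure order: " + PaddedSize(4, 2, 2));
    }
}
```

With managed field sizes (char = 2, int = 4), the model gives 12 bytes for the BadStructure order and 8 bytes for the GoodStructure order, matching the sizes reported earlier.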

Fine…. How do we fix this ?

The .NET compilers apply a StructLayoutAttribute to structures, specifying a Sequential layout. This means that the fields are laid out in the type according to their order in the source file.

Here is the IL for BadStructure.

.class nested private sequential ansi sealed beforefieldinit BadStructure extends [mscorlib]System.ValueType
{
.field public char c1
.field public int32 i
.field public char c2
}
In the .NET Framework 3.5, the JIT does enforce a Sequential layout (if specified) for the managed layout of value types. We can use the StructLayoutAttribute class from the System.Runtime.InteropServices namespace to control the physical layout of the data fields. So one fix is to specify [StructLayout(LayoutKind.Sequential, Pack = 1)] for the struct, which removes the pad bytes. Note that packing this tightly can leave the int on a misaligned address, which brings back the slow reads described above; simply reordering the fields, as in GoodStructure, also avoids the padding while keeping every field aligned.

Watch out for structures when you create them next time, and think about playing around with ‘m’ structures of ‘n’ size…. m x n = !!! You can definitely save a few kilobytes of load or, if you are using structs heavily for data transformation, you might even save a few megabytes. Alright, time to refactor your code now :)
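To see what the Pack setting does to the layout, the attribute can be applied to the char, int, char field order and inspected with Marshal.SizeOf and Marshal.OffsetOf. This is a minimal sketch: PackedStructure and PackDemo are names invented here, and CharSet = CharSet.Unicode is an addition made for this example so that a marshaled char occupies two bytes, in line with the managed char size used throughout the article.

```csharp
using System;
using System.Runtime.InteropServices;

// Pack = 1 removes all pad bytes between fields.
// CharSet.Unicode is an assumption of this sketch (2-byte marshaled char).
[StructLayout(LayoutKind.Sequential, Pack = 1, CharSet = CharSet.Unicode)]
public struct PackedStructure
{
    public char c1;
    public int i;
    public char c2;
}

public static class PackDemo
{
    // Total size in bytes of the marshaled layout.
    public static int Size()
    {
        return Marshal.SizeOf(typeof(PackedStructure));
    }

    // Byte offset of the int field inside the marshaled layout.
    public static int OffsetOfInt()
    {
        return Marshal.OffsetOf(typeof(PackedStructure), "i").ToInt32();
    }

    public static void Main()
    {
        Console.WriteLine("Size:        " + Size() + " bytes");
        Console.WriteLine("Offset of i: " + OffsetOfInt());
    }
}
```

The struct shrinks because the padding is gone, but the int now starts at an offset that is not divisible by four, which is exactly the misaligned case from the memory diagram. Reordering the fields is the option that saves the space without that cost.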

Happy Coding!

Logu Krishnan C#, Performance