Vivek: A Developer's Experience

The Constraint Of Constraints !!!

I was happy programming with (generic) constraints, which overcome some of the limitations of C++ templates. However, constraints need to be less restrictive. What do I mean? Let us say I want a helper method that checks a condition and throws an exception if the check fails. That would be...

void TestCondition(bool condition, string msgText)
{
    if (!condition)
    {
        throw new Exception(msgText ?? "Unknown Error");
    }
}

I like it; it is cute. However, the caller will not always be happy with throwing System.Exception. Since the above method can be used in a wide variety of situations, it would make sense for the caller to specify the type of exception to throw. It would be classic if the above were made generic, taking a type T and constraining it to be a derivative of System.Exception. Sounds great, and so…

void TestCondition2<T>(bool condition, string msgText) where T : Exception
{
    if (!condition)
    {
        throw new T(msgText ?? "Unknown Error");
    }
}

The above code is still a dream; it results in a compilation error. Although generics let us constrain T to be a derivative of System.Exception, they do not let us specify which constructor of T we would like to use. In other words, there is a constraint (new()) mandating that T expose a default constructor, but none for any other constructor. So we cannot offer the above facility with elegance (or performance?). There is the ugly way. How?

void TestCondition3(bool condition, Exception ex)
{
    if (!condition)
    {
        throw ex;
    }
}

Seems good? Partly, no. If TestCondition2 were possible, the exception object would be created only if the condition failed. TestCondition3 meets our purpose, but the caller has to create the exception object irrespective of the condition, and it gets thrown only if the condition fails.
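For comparison, C++ templates impose no such restriction: a template may call any constructor of its type parameter, and the compiler checks the call only when the template is instantiated. Below is a minimal sketch of that idea, assuming exception types that can be constructed from a std::string (the standard std::runtime_error and std::invalid_argument qualify). The Throws helper is hypothetical and exists only to demonstrate the behaviour. As with the wished-for TestCondition2, the exception object is created only when the check fails.

```cpp
#include <stdexcept>
#include <string>

// E is any type constructible from std::string. Nothing is constructed
// unless the condition is false.
template <typename E>
void TestCondition(bool condition, const std::string& msgText = "Unknown Error")
{
    if (!condition)
    {
        throw E(msgText);
    }
}

// Hypothetical helper for demonstration: reports whether TestCondition<E> threw an E.
template <typename E>
bool Throws(bool condition, const std::string& msgText)
{
    try
    {
        TestCondition<E>(condition, msgText);
    }
    catch (const E&)
    {
        return true;
    }
    return false;
}
```

The price of this freedom is that a type without a matching constructor is reported only at the point of instantiation, which is exactly the kind of late error that constraints were meant to avoid.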

So when are (generic) constraints going to be relaxed? C# 5.0?

System.Object – Reference or Value Type?

I was explaining .NET types to the new guy on our team; new to the team and to .NET. I was explaining reference and value types, allocation etc. And he asked, "Is System.Object a reference type or a value type?". Nice question, right? Let me add its corollary - is System.ValueType a reference type or a value type?

CLR is the platform for loading types and executing code. All types in .NET are derivatives of System.Object. When we say System.Array is a reference type or System.Drawing.Point is a value type, it is an indication for the CLR to choose the allocation (heap or stack) and transport strategy for the Type. What I mean by transport strategy here is what gets copied on transport or assignment – the Type’s value(s) or its address.

For instance, what happens when we write SomeType obj1 = obj2 or return SomeType():-

  • If SomeType is a reference type: It is the reference (the address of the object on the heap) that is copied to obj1. No new object is created. obj1 and obj2 point to the same object (on the heap). To create a new object (on the heap) based on obj2, I must Clone().
  • If SomeType is a value type: It is the entire value of obj2 that is copied to obj1, which is an entirely different object. So a new object (obj1) is created (on the stack) and a member-wise copy is made from obj2 to obj1, where the same rules apply again to each member.
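The two transport strategies have a close analogue in C++, where the choice is made at the point of use rather than by the type. The following is only a rough sketch with a hypothetical Point type: copying the object itself behaves like assigning a .NET value type, while copying a std::shared_ptr to it behaves like assigning a .NET reference.

```cpp
#include <memory>

// Hypothetical type used only for this illustration.
struct Point
{
    int x;
    int y;
};

// Value transport: the assignment copies the members, so b is independent of a.
inline int CopyByValue()
{
    Point a = {1, 2};
    Point b = a;      // member-wise copy into a new object
    b.x = 10;
    return a.x;       // a is unaffected by the change made to b
}

// Reference transport: only the pointer is copied; both names share one heap object.
inline int CopyByReference()
{
    std::shared_ptr<Point> a = std::make_shared<Point>();
    a->x = 1;
    std::shared_ptr<Point> b = a;   // no new Point is created
    b->x = 10;
    return a->x;                    // the change is visible through a
}
```

CopyByValue returns 1 and CopyByReference returns 10, which is the difference between the two bullet points above.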

I consider the transport strategy more important than where the object is allocated. The CLR could have placed all objects entirely on the heap and still differentiated the Types based on the transport strategy. The CLR chose to differentiate the types using both. It allocates a reference type on the heap and deals with such objects via references. A value type held in a local variable is allocated on the stack (one that is a field of a reference type lives inside that object on the heap), and it is always transported as its entire content (what is considered its value).

So we as Type writers, indicate the allocation and transport strategy to the CLR by deriving from predefined Types. A Type derived from System.ValueType is (obviously) a value type (Stack\Value). A Type that is NOT derived from System.ValueType is a reference type (Heap\Reference). This is the model chosen by the CLR for dealing with any Type under the System.Object and System.ValueType in the object hierarchy. Nevertheless, all Types are derived from System.Object.

Now let us answer the real question. The Type's type is an indication for the CLR to determine how to treat the object. So that is for a Type I write. But if I create a (direct) instance of System.Object, how is it treated – allocation and transport strategies? For that matter, where would an instance of System.ValueType be allocated?

The quick answer is System.Object is a reference type, primarily because the CLR is a system driven by garbage collection. Besides, CLR does not require a special mechanism for reclaiming the objects allocated on the stack; a mere stack unwind is enough.

// o is allocated on the heap
System.Object o = new System.Object();

// Compilation Error! System.ValueType is an abstract class.
System.ValueType v = new System.ValueType();

I hope you see the end of the chicken and egg story. Making System.ValueType abstract while allowing direct instances of System.Object to be created is a lot like the theory of God being the source of all creation. He (System.Object) is the beginning and end of everything.

Answering Ian - C++ Temporaries !!!

First, let us welcome Ian Halliday. Next let us answer his question.

Foo() returns a List object. But the List::Iterator iter(Foo()) declaration uses a reference to a temporary object, which is destroyed as soon as the declaration completes, before the first iteration even starts. Typical C++ gotcha!

I have been bitten hard by this kind of stuff in my project a few times. Sanjeev and I have spent long hours trying to devise a mechanism with which we can find out if a given C++ object is a temporary (actually we needed to prevent creating temporary objects of our class). Unfortunately our little brains could not find a solution.

So how do we fix Ian's problem? Unfortunate and simple - create a temporary only when you know you need it.


List lObject = Foo();
for (List::Iterator iter(lObject); !iter.AtEnd(); iter.MoveNext())
{
    DoStuff(iter.Current());
}


And hey, make sure that the List class is smart enough to make deep copies properly; else the above code will blow up again. Now let us wait to hear from Ian.
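The lifetime issue can be made visible without touching the dangling iterator. The sketch below uses a hypothetical List and Iterator (Ian's real classes are not shown here) and counts live List objects. It also shows one way, available from C++11 onwards, to reject temporaries at compile time: delete the constructor overload that takes an rvalue reference.

```cpp
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for the List in Ian's question; 'alive' counts live objects.
struct List
{
    static int alive;
    std::vector<int> items;

    List() { ++alive; }
    List(const List& other) : items(other.items) { ++alive; }
    ~List() { --alive; }
};
int List::alive = 0;

List Foo()
{
    List l;
    l.items.push_back(1);
    l.items.push_back(2);
    return l;
}

// Iterator that only remembers the address of the list it walks.
class Iterator
{
public:
    explicit Iterator(const List& list) : _list(&list), _pos(0) {}
    bool AtEnd() const { return _pos >= _list->items.size(); }
    void MoveNext() { ++_pos; }
    int Current() const { return _list->items[_pos]; }
private:
    const List* _list;
    std::size_t _pos;
};

// C++11 and later: deleting the rvalue overload rejects temporaries at compile time.
class SafeIterator
{
public:
    explicit SafeIterator(const List& list) : _list(&list) {}
    SafeIterator(List&&) = delete;
private:
    const List* _list;
};
static_assert(!std::is_constructible<SafeIterator, List>::value, "temporaries are rejected");
static_assert(std::is_constructible<SafeIterator, List&>::value, "named lists are accepted");

// Returns how many List objects are alive right after 'Iterator iter(Foo());'.
int AliveAfterBindingTemporary()
{
    Iterator iter(Foo());   // the temporary dies at the end of this statement
    (void)iter;             // iter now dangles and must not be used
    return List::alive;
}

// Sums the items through an iterator bound to a named List that outlives the loop.
int SumWithNamedList()
{
    List lObject = Foo();
    int sum = 0;
    for (Iterator iter(lObject); !iter.AtEnd(); iter.MoveNext())
    {
        sum += iter.Current();
    }
    return sum;
}
```

AliveAfterBindingTemporary reports that no List is left alive for the iterator to walk, while SumWithNamedList works because lObject lives until the end of the function.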

Mixing Design Patterns - Decorator Vs Template Method

Sanjeev and I have posted an article at CodeProject: Template Method Vs Decorator. Not many programmers realize the effect of mixing design patterns. This article covers one such instance involving the Decorator and Template Method patterns. Hope you enjoy the article. Read it. Rate it.

Article Excerpt:

This article is a result of the serendipity experienced when implementing a small framework. The framework had a bunch of classes, which implemented some of the widely used design patterns. One of the classes implemented the Template Method pattern. The situation then was to make that class extensible without modifying it. The Decorator pattern seemed the right fit to extend (decorate) the class.

Besides providing a level of sophistication in code and the fun in its use, design patterns do not reveal until asked for the problems in mixing them. This article explains the problems when mixing Decorator and Template Method patterns. The article also discusses the possible solutions to circumvent the problem in mixing the above design patterns. Know why these patterns don't gel well? Please read on.

A hot and interesting discussion is going on about the article. Don't miss it. And be sure to share your comments and thoughts.


Curious Case Of Anonymous Delegates !!!

Senthil has left us thrilled with his new post, and also inspired me to write about the topic. Although anonymous delegates have become mundane stuff amongst programmers, there are still subtle things left unexplored. Alright, let us try to answer Senthil's question before he unravels the mystery in his next post.

A delegate is identified by its target. The target is the method to be executed on a delegate invocation and its associated instance or type. If the target is an instance method, the delegate preserves the target method pointer and the object instance. If the target is a static method, the delegate preserves the target method pointer and the type to which it belongs. So when a code like the one below to register a method to an event (or multicast delegate) is executed, a delegate object (EventHandler here) with the target information embedded is created and added to the invocation list of the event (or multicast delegate, KeyPressed here).

class SomeForm
{
	private Control control = new Control();

	public void OnFormLoad(object sender, EventArgs args)
	{
		control.KeyPressed += new EventHandler(OnKeyPressed);
	}
	
	// Rest of the code omitted to be succinct
};

Likewise, when unregistering the method handler, a new (EventHandler) delegate object is created with the same target information as above. As said earlier, a delegate is identified by its target. In other words, the Equals override on the delegate uses the target information for comparing two delegate objects. Hence when the method handler is unregistered (control.KeyPressed -= new EventHandler(OnKeyPressed);), the invocation list is searched for a delegate instance with the specified target information (Method: OnKeyPressed, Instance: SomeForm instance).

In the case of anonymous delegates, the compiler transforms the inline method code into a

  • static method, if the inline method code does not use any of the class's instance members or local variables or if it uses only the static members of the class.
  • instance method, if the inline method code uses at least one class member, any or no static members, and no local variables.
  • class with a method that represents the inline method code, if the inline method code uses local variables no matter whether it uses the class members or not.

Those might not be the exhaustive set of rules, but they sure are enough for our discussion. Given the following questionable code,

public EventHandler<Control.ControlEventArgs> IfEnabledThenDo(EventHandler<Control.ControlEventArgs> actualAction)
{
	return (sender, args) => { if (args.Control.Enabled) { actualAction(sender, args); } };
}

public void Initialize()
{
	control.KeyPressed += IfEnabledThenDo(control_KeyPressed);
}

public void Destroy()
{
	control.KeyPressed -= IfEnabledThenDo(control_KeyPressed);
}

we realize, without doubt, that the anonymous delegate (returned by IfEnabledThenDo) would be transformed into a method of a compiler-generated anonymous class, since it captures the parameter actualAction. Each time IfEnabledThenDo is called, for registering or unregistering the method handler, a new instance of that anonymous class is created and the (EventHandler<Control.ControlEventArgs>) delegate is returned. And here lies the subtlety. Although both delegates from IfEnabledThenDo target the same method inside the anonymous class, the instances preserved as part of the target information are different during registration and unregistration. In other words, the target methods of the delegates returned by IfEnabledThenDo belong to different instances of the anonymous class. Hence the pretty code to unregister the (key pressed) method handler would not actually unregister anything, since there would be no delegate in the invocation list of the (KeyPressed) event with the same target instance as the one used in the unregistration line of code. Very subtle!

Usually, handwritten code keeps the registration and unregistration of the method handlers in the same class, so both refer to the same instance. Not so when you like watching the compiler magic.

Let us wait and see what Senthil says.

(Re)declaring event Members !!!

Sanjeev wondered why an event declared in an interface should be declared in the implementation class too. For instance, if we have an interface ISample as follows:-

public delegate void DataChangedDelegate(int dataID, object newData);
public interface ISample
{
    event DataChangedDelegate dcDelegate;
    void SomeMethod(); 
    bool SomeMethod(int x);
}

and a class that implements ISample (implicitly):-

public class SampleImpl : ISample
{
    public SampleImpl()
    {
    }

    #region ISample implicit implementation

    // Why is this required to be done?
    public event DataChangedDelegate dcDelegate;

    public void SomeMethod()
    {
    }

    public bool SomeMethod(int x)
    {
        return x >= 0 ? true : false;
    }

    #endregion
}

While we define the implementation of the methods from ISample in SampleImpl, we don't do anything (special) to implement the event. So why should SampleImpl simply declare the event once again? Of course, it is to keep the compiler from spitting errors.

It is time to ask: what is an event? The event keyword, when applied to a delegate declaration, ensures that only the class\struct in which the event is declared is able to raise the event, irrespective of its access level. In other words, only the owning class\struct is able to call the subscribers. That prevents anybody outside the declaring class from misusing the delegate - raising false events - especially when the delegate is declared public.

Good, but there is still a way to tamper with the delegate. Anybody could add or, especially, remove subscribers from the delegate (when it is exposed as public). The event keyword does more than prevent unauthorised use of the delegate to raise events. It is compiler shorthand for a pair of methods by which handler targets can be added to and removed from the delegate. We can say it is analogous to properties, which provide get and set methods. An event is not like an ordinary field declaration - int, float or delegate for that matter - and interfaces cannot bear field declarations.

In essence, an event declaration in the interface opens up during compilation as follows:-

// add and remove are like get\set in a property.
event DataChangedDelegate dcDelegate { add; remove; } 

The add and remove have to be implemented in the implementing class. Declaring the event in that class instructs the compiler to generate default implementations of the add and remove methods, something like what automatic properties do. If one wants to take control of adding and removing handlers with some custom logic, one can do so by explicitly implementing the add\remove methods.

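private DataChangedDelegate dcSubscribers;

// dcSubscribers starts out null and is replaced on every add\remove,
// so lock on a dedicated object rather than on the delegate itself.
private readonly object dcLock = new object();

public event DataChangedDelegate dcDelegate
{
    add
    {
        if (value == null)
        {
            throw new ArgumentNullException("value", "Specified null event handler");
        }

        lock (dcLock)
        {
            dcSubscribers += value;
        }
    }
    remove
    {
        if (value == null)
        {
            throw new ArgumentNullException("value", "Specified null event handler");
        }

        lock (dcLock)
        {
            dcSubscribers -= value;
        }
    }
}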

And SomeOtherMethod in the derived class could fire the event.

public bool SomeOtherMethod(int x)
{
    DataChangedDelegate tempDC = dcSubscribers;
    if (tempDC  != null)
    {
        tempDC(x, DateTime.Now);
    }

    return x > 0 ? true : false;
}

So that's why it is necessary to declare an event member when implementing an interface.

finally and Return Values !!!

Let us read some code:-

int SomeMethod()
{
    int num = 1;

    try
    {
        num = 5;
        return num;
    }
    finally
    {
        num += 5;
    }
}

What is the return value of SomeMethod? Some anonymous guy asked that question in the CodeProject forum, and it has been answered. I am writing about it here because it is interesting and subtle. One should not be surprised when people misinterpret finally. So let us take a guess: 10 (num = 5, then incremented by 5 in the finally block).

It is not the right answer; SomeMethod returns 5. Agreed, finally is executed in all cases of returning from SomeMethod, normally or abnormally, but the return value is calculated when the return statement is reached. The subtlety lies not in the way finally is executed but in the way the return value is calculated. So the return value (5) is decided when the return is encountered in the try block. The finally block is just called for cleanup, and the num modified there is local to SomeMethod. To make the return value 10, it is no use hastily making SomeMethod return from the finally block, because returning from finally is not allowed. (We will talk later about why returning from a catch block is a bad practice and why we cannot return from a finally block.)

Had such modifications been done on a reference type, they would have been visible outside of SomeMethod, although the return value may be different. For instance,

class Num
{
    public int _num = 0;
};

int SomeMethod()
{
    Num num = new Num();

    try
    {
        num._num = 5;
        return num._num;
    }
    finally
    {
        num._num += 5;
    }
}

So in the above case, the return value is still 5, but Num._num would have been incremented to 10 by the time SomeMethod returns. Decompiling shows that our code is transformed by the compiler as follows, where CS$1$0000 is our return value.

private static int SomeMethod(Num num)
{
    int CS$1$0000;
    try
    {
        num._num = 5;
        CS$1$0000 = num._num;
    }
    finally
    {
        num._num += 5;
    }
    return CS$1$0000;
}

Now that we are clear about finally, we should write the code the way the compiler transforms it, because returning from try and catch blocks is not a good practice.
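C++ has no finally, but the same subtlety shows up with its closest analogue, a destructor-based scope guard. The sketch below is an analogy rather than a translation of the C# code; AddFiveOnExit is a hypothetical helper whose destructor plays the role of the finally block. The return value is copied out before the destructors of the locals run, so the increment does not affect it.

```cpp
// Adds 5 to the referenced int when the guard goes out of scope,
// playing the role of the finally block.
struct AddFiveOnExit
{
    int& target;
    explicit AddFiveOnExit(int& t) : target(t) {}
    ~AddFiveOnExit() { target += 5; }
};

// Counterpart of the first SomeMethod: the modified variable is a local.
int SomeMethod()
{
    int num = 1;
    AddFiveOnExit guard(num);
    num = 5;
    return num;   // the result is copied here, before guard's destructor runs
}

// Hypothetical counterpart of the Num class.
struct Num
{
    int value;
};

// Counterpart of the second SomeMethod: the modified object outlives the call.
int SomeMethod(Num& num)
{
    AddFiveOnExit guard(num.value);
    num.value = 5;
    return num.value;
}
```

Both overloads return 5. After the second one returns, the caller's Num holds 10, mirroring the reference type case above.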

BSTR Traffic !!!

My friend, who has not been programming for long, has been using the .NET string and fancies it. Lately she had to write COM-based test applications, and BSTRs turned out to be a nightmare. Then she called me up saying that she had found a questionnaire online that reflected her queries about BSTRs, and that answering those would make her a BSTR expert. This post answers them. I believe it will be of use to all.

What happens with a BSTR when it is passed to a COM function call:-

Case 1: HRESULT _stdcall MyFunction([in] BSTR bstr);

§ After using bstr, must the client free the BSTR (by calling SysFreeString() )?

Ans: Yes. From the prototype of the above call, we can see that the client is allocating the BSTR (SysAllocString) and passing it to the server. The server gets a copy of the string and takes care of releasing the copy. The client is responsible for freeing the string (SysFreeString) that it allocated.

Case 2: HRESULT _stdcall MyFunction([in, out] BSTR* pbstr);

Before answering this, let me clear the ambiguity in the term ‘pbstr value’. Does it mean the address that the BSTR (wchar_t*) points to, or the string data at that address? Since the BSTR in this case is an [in, out] parameter, the server can change the string data or allocate a new string (at a new address) altogether, and the change is reflected on the client side. In our context, let us assume ‘pbstr value’ refers to the address, and ‘pbstr’ refers to the string data.

§ After using pbstr, can the server change the pbstr value (as an out parameter)?

Ans: Yes. Since this is an [in, out] parameter, the string from the client side is transported to the server side, and the server can change the string value (by calling SysReallocString). I believe that inherently changes the pbstr value.

§ Should the client free the BSTR by SysFreeString() to recreate one with another value ?

Ans: Technically yes. Although pbstr can be indexed within the limits of the BSTR length and individual characters can be changed, I cannot imagine why somebody would want to do something like that. A better alternative to [in, out] would be to take an [in] BSTR and give back the modified string (which could be the same) as an [out]. The direct answer to the question is that it is not strictly necessary to call SysFreeString (and then SysAllocString with the new string value); it can be done in one shot using SysReallocString.

§ What are the steps to initialize the pbstr to another value?

Ans: The question is a little ambiguous. One would initialize something to an initial value, and later assign other values. The former is accomplished in the case of pbstr with SysAllocString, and the latter with SysReallocString.

Case 3: HRESULT _stdcall MyFunction([out, retval] BSTR* pbstr);

§ How do I pass an empty BSTR string?

Ans: SysAllocString(L""); I haven’t tried SysAllocStringLen with 0 length. Not sure if it would work.

§ When the function receives a pointer to a BSTR, is it ok if the server leaves the pointer untouched?

Ans: The API must be documented about the values that the server would return – NULL or empty string. But if there was also [out] BOOL parameter that tells the client if the call succeeded and the client may access the pbstr only if the BOOL parameter is TRUE, then the server can leave pbstr untouched. Generally speaking it is not a disciplined way. In practice, the server checks if pbstr is NULL before proceeding with the call.

§ What is the value pointed by the pointer?

Ans: Again this is very ambiguous, so let me answer something else. A BSTR is nothing but a pointer to a wchar_t array prefixed with a four-byte length. So a BSTR is a wchar_t*, and it points to the first character of the string rather than to the length.
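To make that layout concrete, here is a portable illustration that mimics it by hand. This is not how a BSTR should ever be created in real code (use SysAllocString, SysStringLen and SysFreeString); FakeBstr, FakeAllocString, FakeStringLen and FakeFreeString are names made up for this sketch, and char16_t stands in for the 16-bit wchar_t of Windows. The prefix stores the length in bytes, excluding the terminating null.

```cpp
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Illustration only: mimics the memory layout of a BSTR in portable C++.
typedef char16_t* FakeBstr;

// Allocates [4-byte byte count][characters][terminating null] and returns
// a pointer to the first character, not to the start of the allocation.
FakeBstr FakeAllocString(const char16_t* text)
{
    std::uint32_t count = 0;
    while (text[count] != 0)
    {
        ++count;
    }

    const std::uint32_t byteLength =
        static_cast<std::uint32_t>(count * sizeof(char16_t));
    unsigned char* block = static_cast<unsigned char*>(
        std::malloc(sizeof(std::uint32_t) + byteLength + sizeof(char16_t)));
    if (block == nullptr)
    {
        return nullptr;
    }

    // Write the length prefix, then the characters including the terminator.
    std::memcpy(block, &byteLength, sizeof(byteLength));
    std::memcpy(block + sizeof(std::uint32_t), text, byteLength + sizeof(char16_t));
    return reinterpret_cast<FakeBstr>(block + sizeof(std::uint32_t));
}

// Reads the character count from the prefix that sits just before the pointer.
std::uint32_t FakeStringLen(FakeBstr bstr)
{
    if (bstr == nullptr)
    {
        return 0;
    }
    std::uint32_t byteLength = 0;
    std::memcpy(&byteLength,
                reinterpret_cast<unsigned char*>(bstr) - sizeof(std::uint32_t),
                sizeof(byteLength));
    return static_cast<std::uint32_t>(byteLength / sizeof(char16_t));
}

// Frees the whole allocation, which starts four bytes before the pointer.
void FakeFreeString(FakeBstr bstr)
{
    if (bstr != nullptr)
    {
        std::free(reinterpret_cast<unsigned char*>(bstr) - sizeof(std::uint32_t));
    }
}
```

This is also why a BSTR must not be released with free or delete: the pointer the client holds is not the start of the allocation.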

If you are using C++ for developing ATL-COM applications, I am sure you would not be dealing directly with the discrete APIs mentioned above. You should be, or get, smart by using CComBSTR, which is a smart wrapper over the BSTR APIs. CComBSTR takes care of releasing the string when it goes out of scope. But that is sometimes hazardous when you pass the string around to the server or return it from the server, in which case you must be working on a copy. Anyway, gaining that sort of knowledge and experience is inevitable in COM/C++. So when you are ready, you will see how CComBSTR or CComPtr saves you from getting hurt.

Problems & Solutions - Ours, Loki and boost !!!

Our article 'Type Safe Logger For C++' we recently posted at CodeProject received a decent response. And halex2005 has commented:-

Its seems very similar to boost::format and Printf() from Loki library by Andrey Alexandrescu with supplement. But its have drawbacks - using std::string anywhere is inefficient. Why not to have plain char pointers in format string parameter instead? Have you seen specified libraries before development?

Honestly, we were aware neither of a library named Loki nor of a similar facility offered by the boost library. I am not going to justify more than that. But we did take a rough glance after halex2005 commented. The boost library's technique is all % and it does not truly align with C++ syntax ideologies. Loki's library obviously seems to do more stuff than ours. It is complex and works in a complex way. Besides, the base premises seem to be different, although it offers the same functionality.

On the same lines, we recently implemented a technique to mimic strongly typed enumerations in C++(98). Although it is not possible to implement a very clean solution, we were shocked to see that somebody had implemented almost the same way we did. We came across it online when we were searching to know if we can specify the underlying type for the C++98 enums.

I will repeat what I quoted for halex2005's comment - It seems all problems, in a very idealistic point of view, seem to have a similar solution. The similarity is flavored and limited by the mind that searches to solve the problem. The idea and the solution matures by time.

Oh, by the way, C# provides strongly typed enums with the facility to specify the underlying type. C# provides everything! C++ waited for decades and has finally planned to offer the strongly typed enums in C++0x.

P.S: See previous post.

Type Safe Logger

Sanjeev and I have published an article - Type Safe Logger For C++ - at CodeProject. Every bit of work is tiresome or a little ugly in C++. So is logging - writing application diagnostics to the console, a file etc. The printf style of outputting diagnostics is primitive and not type safe. std::cout is type safe but does not have a format specification. Besides that, printf and std::cout only know how to write to the console. So we need a logging mechanism that provides a format specification, is type safe and is transparent to the log destination. So we came up with this new Logger to make C++ programmers happy.

Following is a short introduction excerpt of the article:-

Every application logs a whole bunch of diagnostic messages, primarily for (production) debugging, to the console or the standard error device or to files. There are so many other destinations where the logs can be written to. Irrespective of the destination that each application must be able to configure, the diagnostic log message and the way to generate the message is of our interest now. So we are in need of a Logger class that can behave transparent to the logging destination. That should not be a problem, it would be fun to design that.......Read more.

As always your comments are most valuable.

And oh, Happy Logging!

Simple Array Class For C++

This is a simple array like class for C++, which can be used as a safe wrapper for accessing a block of memory pointed by a bare pointer.

#pragma once

#include <cstddef>    // size_t, NULL
#include <cstring>    // memcpy
#include <stdexcept>  // std::invalid_argument, std::out_of_range

template<typename T> class Array
{
private: T* _tPtr;
private: size_t _tSize;
private: bool _isOwner;

public: Array(int size, bool isOwner = true) : _isOwner(isOwner)
        {
           _tSize = size;
           _tPtr = new T[size]; 
        }

public: Array(T* tPtr, size_t count, bool isOwner = true) : _isOwner(isOwner)
        {
           if (NULL == tPtr)
           {
              // std::exception has no portable constructor taking a message.
              throw std::invalid_argument("Specified T* pointer is NULL.");
           }

           this->_tSize = count;
           this->_tPtr = tPtr;
        }

public: template<typename TStlContainerType> Array(const TStlContainerType& stlContainer, bool isOwner) : _isOwner(isOwner)
        {
           _tSize = stlContainer.size();
           _tPtr = new T[_tSize];

           int index = 0;
           // typename is required because const_iterator is a dependent type.
           for (typename TStlContainerType::const_iterator iter = stlContainer.begin(); iter != stlContainer.end(); ++iter, ++index)
           {
              _tPtr[index] = *iter;
           }
        }

public: T operator[](size_t index)
        {
           // Valid indexes are 0 .. _tSize - 1.
           if  (index >= _tSize)
           {
              throw std::out_of_range("Index out of bounds");
           }

           return _tPtr[index];
        }

public: operator T* const() const
        {
           return _tPtr;
        }

public: T* const Get() const
        {
           return _tPtr;
        }

public: operator const T* const()
        {
           return _tPtr;
        }

public: int Count() const
        {
           return static_cast<int>(this->_tSize);
        }

public: bool IsOwner() const
        {
           return this->_isOwner;
        }

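public: void CopyTo(T* tPtr, size_t copySize)
        {
           // As written, copySize is a byte count; memcpy is only safe for trivially copyable T.
           memcpy(tPtr, this->_tPtr, copySize);
        }

public: ~Array()
        {
           if (IsOwner())
           {
              // The buffer is an array (new T[]), so it needs delete[].
              delete[] _tPtr;
           }
        }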
};

Disclaimer: It is not aimed to replace the STL containers.

CComPtr Misconception !!!

This is about a killer bug identified by our chief software engineer in our software. What was devised for ease of use and write smart code ended up in this killer defect due to improper perception. Ok, let us go!

CComPtr is a template class in ATL designed to wrap the discrete functionality of COM object management - AddRef and Release. Technically it is a smart pointer for a COM object.

void SomeMethod()
{
	CComPtr siPtr;
	HRESULT hr = siPtr.CoCreateInstance(CLSID_SomeComponent);
	siPtr->MethodOne(20, L"Hello");
}

Without CComPtr, the code wouldn't be as elegant as above. The code would be littered with AddRef and Release calls. Besides, writing code to Release after use under every circumstance is either hard or ugly. CComPtr automatically takes care of releasing in its destructor, just like std::auto_ptr. As C++ programmers, we must be able to appreciate the inevitability of the destructor and its immense use in writing smart code. However, there is a difference between pointers to normal C++ objects and pointers to COM objects; between CComPtr and std::auto_ptr. When you assign one auto_ptr to another, the source is no longer the owner of the object it pointed to. The ownership is transferred to the destination. Whereas when a CComPtr is assigned to another, the reference count of the target COM object increases by one, and the two CComPtrs point to the same COM object. Changes made via one CComPtr object can be seen when the object is accessed via the other CComPtr. Release must be called on each CComPtr instance (to completely release the COM object). All fine, let us see some code.

void SomeOtherMethod()
{
	CComPtr aPtr;
	InitAndPopulateObject(aPtr);

	int itemCount = 0;
	HRESULT hr = aPtr->GetCount(&itemCount);
	_ASSERTE(SUCCEEDED(hr));

	for (int i = 0; i < itemCount; ++i)
	{
		char szBuffer[128] = { 0 };
		sprintf_s(szBuffer, sizeof(szBuffer), "Key%d", i);
		CComBSTR bstrKey(szBuffer);

		int iValue = 0;
		hr = aPtr->GetItem(bstrKey, &iValue);
		_ASSERTE(SUCCEEDED(hr));

		std::cout << bstrKey << " - " << iValue;
	}
}

void InitAndPopulateObject(CComPtr bPtr)
{
	HRESULT hr = bPtr.CoCreateInstance(CLSID_Hashtable);
	
	_ASSERTE(SUCCEEDED(hr));

	for (int i = 0; i < 100; ++i)
	{
		char szBuffer[128] = { 0 };
		sprintf_s(szBuffer, sizeof(szBuffer), "Key%d", i);
		bPtr->Add(szBuffer, i);
	}	
}

CComPtr saved a whole lot of code, as explained above. But my application was always crashing in SomeOtherMethod on the line where the GetCount method is called on the COM object initialized one line above. So I am passing a CComPtr to InitAndPopulateObject, which is supposed to create my COM object and fill it with some information I expect. Since I am passing a CComPtr, a return value is not needed. Looks fine, but the application crashed.

People are often misled by many things in programming, mostly because they stick to the primary way something is used. CComPtr, in most cases, is used for creating a COM object, which is then passed around across various sections of the code where AddRef and Release are done under the covers until the COM object dies a pleasant death. People tend to forget that the member in CComPtr (named poorly as p) is the one that is actually pointing to the COM object. So aPtr.p, whose value is 0x0000 (NULL), is passed by value and copied to bPtr.p. When the COM object is created using bPtr, it is bPtr.p which is assigned the COM object's address, say 0x23456789, whereas aPtr.p remains NULL even after InitAndPopulateObject returns. Hence the application was crashing because of a null pointer access.

The problem might be obvious in the above few lines of clear code. It sure was very tough to locate and reason about in our huge code base.
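The effect can be reproduced without ATL. The sketch below uses a hypothetical ComPtrLike template and a hypothetical reference-counted Hashtable as stand-ins for CComPtr and the COM object; it is not ATL code. Passing the smart pointer by value initializes only the copy, while passing it by reference updates the caller's pointer, which is presumably what the original function needed to do.

```cpp
#include <cstddef>

// Hypothetical reference-counted object standing in for a COM object.
struct Hashtable
{
    int refCount;
    int itemCount;
    Hashtable() : refCount(0), itemCount(0) {}
    void AddRef() { ++refCount; }
    void Release() { if (--refCount == 0) delete this; }
};

// Minimal stand-in for CComPtr: copying AddRefs the same object,
// but each smart pointer carries its own raw pointer member p.
template <typename T>
class ComPtrLike
{
public:
    T* p;

    ComPtrLike() : p(NULL) {}
    ComPtrLike(const ComPtrLike& other) : p(other.p) { if (p) p->AddRef(); }
    ~ComPtrLike() { if (p) p->Release(); }

    // Plays the role of CoCreateInstance: stores a new object in this->p.
    void Create()
    {
        if (p) p->Release();
        p = new T();
        p->AddRef();
    }

    T* operator->() const { return p; }

private:
    ComPtrLike& operator=(const ComPtrLike&); // assignment is not needed for the sketch
};

// Buggy: bPtr is a copy, so the new object is stored in the copy's p only
// and is released again when bPtr is destroyed on return.
void InitByValue(ComPtrLike<Hashtable> bPtr)
{
    bPtr.Create();
    bPtr->itemCount = 100;
}

// Fixed: the caller's smart pointer itself is modified.
void InitByReference(ComPtrLike<Hashtable>& bPtr)
{
    bPtr.Create();
    bPtr->itemCount = 100;
}
```

After InitByValue the caller's p is still NULL, which corresponds to the crash described above; after InitByReference it points to the populated object.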

OrderedThreadPool - Task Execution In Queued Order !!!

I would not want to write chunks of code to spawn threads and perform many of my background tasks such as firing events, UI updates etc. Instead I would use the System.Threading.ThreadPool class, which serves this purpose. And a programmer who knows how to use this class for such cases would also be aware that the tasks queued to the thread pool are NOT dispatched in the order they are queued. They get dispatched for execution in a haphazard fashion.

In some situations, it is required that the tasks queued to the thread pool are dispatched (and executed) in the order they were queued. For instance, in my (and most?) applications, a series of events is fired to notify the clients of what is happening inside the (server) application. Although the events may be fired from any thread (asynchronously), I would want, or rather the client would expect, the events to be received in a certain order, which aligns with the sequence of steps carried out inside the server application for the requested service. So sequential execution of the queued tasks is not an unreasonable thing to wish for.

Enough talking.......eat code.

using System;
using System.Collections.Generic;
using System.Diagnostics;

namespace System.Threading
{
   struct ThreadPoolTaskInfo
   {
      public readonly WaitCallback CallbackDelegate;
      public readonly object State;

      public ThreadPoolTaskInfo(WaitCallback wc, object state)
      {
         Debug.Assert(wc != null);
         CallbackDelegate = wc;
         State = state;
      }
   }

   class OrderedThreadPool
   {
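      private readonly Queue<ThreadPoolTaskInfo> workItemQ = new Queue<ThreadPoolTaskInfo>();

      // True while a LoopWork call is scheduled or running; guarded by the workItemQ lock.
      // Checking this flag (rather than Count == 1) keeps a second LoopWork from starting
      // while the previous task is still executing.
      private bool isDraining;

      public void QueueUserWorkItem(WaitCallback wcbDelegate, object state)
      {
         lock (workItemQ)
         {
            workItemQ.Enqueue(new ThreadPoolTaskInfo(wcbDelegate, state));

            if (!isDraining)
            {
               isDraining = true;
               ThreadPool.QueueUserWorkItem(LoopWork);
            }
         }
      }

      private void LoopWork(object notUsed)
      {
         WaitCallback wcb = null;
         object state = null;

         lock (workItemQ)
         {
            if (workItemQ.Count == 0)
            {
               // Nothing left; the next QueueUserWorkItem call restarts the loop.
               isDraining = false;
               return;
            }

            ThreadPoolTaskInfo tptInfo = workItemQ.Dequeue();
            state = tptInfo.State;
            wcb = tptInfo.CallbackDelegate;
            Debug.Assert(wcb != null);
         }

         try
         {
            wcb(state);
         }
         finally
         {
            // Pick up the next task only after the current one has finished.
            ThreadPool.QueueUserWorkItem(LoopWork, notUsed);
         }
      }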
   }
}

The above class wraps the System.Threading.ThreadPool and offers the facility of execution of tasks in the order they are queued. Hope that is useful!

Settling Casting Restrictions !!!

Remember the Casting Restrictions we discussed a while back? Let us settle that now. So we have some code like this:

int i = 100;
object obj = i;
long l = (long)obj;

And we get an invalid cast exception while casting 'obj' to long. It is obvious that we are not changing the value held by obj, but just reading it. Then why restrict such casting? Let us disassemble and see what we got.

    .locals init (
        [0] int32 i,
        [1] object obj,
        [2] int64 l)
    L_0000: nop
    L_0001: ldc.i4.s 100
    L_0003: stloc.0
    L_0004: ldloc.0
    L_0005: box int32
    L_000a: stloc.1
    L_000b: ldloc.1
    L_000c: unbox.any int64
    L_0011: stloc.2
    L_0012: ret

Oh, there we see something interesting - unbox. So the C# compiler uses the unbox instruction to retrieve the value from obj while casting; it does not use Convert.ToInt64 or similar mechanism. That is why the exception was thrown.

From MSDN:

Unboxing is an explicit conversion from the type object to a value type or from an interface type to a value type that implements the interface. An unboxing operation consists of:

      • Checking the object instance to make sure it is a boxed value of the given value type
      • Copying the value from the instance into the value-type variable

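So we are blown at step 1 of the unbox operation: the boxed value is an int, not a long. Let us play with what we have for now (unbox to the exact type first and then convert, as in (long)(int)obj, or call Convert.ToInt64(obj)), and stop nagging about why unbox was meant to be like that.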

The WD Anti-Propaganda Campaign !!!

Thanks to the internet. If nobody else bothers or understands what loss of data means, you can shout it aloud here. I lost 500GB of data - every moment of my personal and professional life captured in bits and bytes.

It is a Western Digital Premium Edition external hard disk (USB/Firewire). I bought it despite my friend warning of bad sectors and hardware issues that WD is known to have. As with any story, one fine morning, I was copying some songs and pictures from my pen drive to the hard disk. All of a sudden, the hard disk and my laptop hung. I restarted the system thinking I would make a fresh start. But to my dismay, all my drives on the hard disk had vanished like dust. I tried connecting and reconnecting a few times; the drives showed up once or twice like a sick man's last few breaths.

The other similar incident was the hard disk crash at my office last month. It was also a WD 160GB hard disk (IDE). And it took with it more than 5 years of email storage, project documents, code and other official data. Until it happened, my friend's warning seemed to be a rumour, but it is proven now. I think it is fair on my part to make polite anti-propaganda against WD.

The drive not only took with it gobs and gobs of data, it took the proof of my existence.

Casting Restrictions ???

We all know that the runtime can detect the actual type of a System.Object instance. The primitive data types provided by the runtime are compatible with one another for casting (assuming that we do not truncate the values). So if I have an int, it can be cast to long or ulong. All that is fine. Watch this:-

interface IAppDataTypeBase
{
	// Other methods
	object GetValue();
}

Since IAppDataTypeBase represents the mother of all types of data in my application, I have made GetValue return the value as object (I could have used generics, that is for another day!).

IAppDataTypeBase longType = GetLongInstanceFromSomeWhere();
int i = (int)longType.GetValue();

So are we discussing any problems here? Yes, we are. The problem is that the value returned by GetValue - System.Object - despite being inherently long cannot be cast to an int. It would result in a 'Specified cast is invalid' exception.

If an object is one of the primitive types, it can only be cast to its actual type. In the above case, the object returned by GetValue can only be cast to long, and nothing else. The user defined data types do not have this restriction if the base type and target type are related.

class X { };
class DX : X { };
class Y { };

If GetValue returns an instance of DX, it can be cast to X or any of its base interfaces (if any). The same holds for a boxed struct and the interfaces it implements.

So why do we have this casting restriction for the primitive types? Was this unintentional or is there an advanced CLR internals webpage somewhere talking about this? Probably fixed in C#4.0? Until I learn why, the question is open.

Understanding (ref)erences !!!

Let us take a look at the following piece of code:-

public static void Operate(IList<int> iList2)
{
    iList2 = new List<int>();
    iList2.Add(1);
    iList2.Add(2);
    iList2.Add(3);
}

public static void Main()
{
    IList<int> iList = new List<int>();
    iList.Add(10);

    Operate(iList);

    Console.WriteLine(iList[0].ToString());
}

Think about what the above program would print to the console. That is what we are going to talk about in this post - simple but subtle.

I saw this code in the CodeProject discussions. The author was confused about why the program was printing 10 instead of 1. I am writing about this since the 'gotcha' was not highlighted in the discussion.

So we passed the reference 'iList' to the function, which is supposed to make it point to the 'List' that it creates, and so the program must print 1. Well, a C++ programmer who knows how to program in C# would have said 'Gotcha' already. A reference (in C#), equivalent to a pointer in C++, is an entity that stores the address of an object on the heap and accesses the object using this address. So when we pass a reference (by value) to a function, we are passing this address value. It is captured in another pointer-sized variable local to that function. iList and iList2 are two different references that start out pointing to the same object, so assigning a newly created object to iList2 inside the function re-points only iList2, and iList still refers to the original list. So if you want the caller to see the new List created inside the function, pass the reference by reference - use the ref keyword.

Now the fun part !!! Let us try writing the same stuff in C++:-

// This function will not alter the source pointer
// (the two functions get different names, otherwise a call with a plain IList* is ambiguous)
void OperateByValue(IList* pList)
{
	pList = new List();	// only the local copy is re-pointed (and the new List leaks)
	pList->Add(1);
	pList->Add(2);
	pList->Add(3);
}

// This function will affect the source; similar to using ref in C#
// 1) const IList*& pList - Can make pList point elsewhere but cannot modify the existing object
// 2) IList* const &pList - pList cannot point to anywhere else but can modify the existing object
void OperateByRef(IList*& pList)
{
	pList = new List();
	pList->Add(1);
	pList->Add(2);
	pList->Add(3);
}
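If you want to run the comparison yourself, here is a small self-contained sketch. It assumes std::vector<int> as a stand-in for the IList/List pair above, and the names IntList, ReseatCopy, ReseatCaller and the two FirstAfter helpers are made up for this illustration.

```cpp
#include <cassert>
#include <vector>

typedef std::vector<int> IntList;   // stand-in for the IList/List pair

// Pointer passed by value: only the local copy is re-pointed.
void ReseatCopy(IntList* pList)
{
    pList = new IntList();
    pList->push_back(1);
    delete pList;   // the caller can never reach this list, so free it here
}

// Pointer passed by reference: the caller's own pointer is re-pointed.
void ReseatCaller(IntList*& pList)
{
    assert(pList != nullptr);
    delete pList;   // release the list the caller owned before
    pList = new IntList();
    pList->push_back(1);
}

// First element the caller sees after the by-value call.
int FirstAfterCopy()
{
    IntList* list = new IntList();
    list->push_back(10);
    ReseatCopy(list);
    const int first = (*list)[0];
    delete list;
    return first;
}

// First element the caller sees after the by-reference call.
int FirstAfterReseat()
{
    IntList* list = new IntList();
    list->push_back(10);
    ReseatCaller(list);
    const int first = (*list)[0];
    delete list;
    return first;
}
```

FirstAfterCopy() gives 10, just like the C# program without ref, while FirstAfterReseat() gives 1, which is what ref would buy you in C#.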

Hope that was fun !!!

(Generic) Type Inference & Intellisense !!!

Consider this:-

class CmdBase
{
	public char ReturnValueFromBase()
	{
		return default(char); // placeholder
	}
}

class ReadCmd : CmdBase
{
	// No time for ctor coding

	public bool ReturnValueFromRead()
	{
		return default(bool); // placeholder
	}
}

class WriteCmd : CmdBase
{
	// No time for ctor coding

	public int ReturnValueFromWrite()
	{
		return default(int); // placeholder
	}
}

// And imagine exactly 212 other such commands, just kidding; the above two would do fine for now.

Now let us say we want to write a method which could execute any given command (a derivative of CmdBase) and, as you can see, must be able to retrieve the command-specific return value (via the ReturnValueFromXXX method). So...

T Execute<T>(T cmdObj) where T : CmdBase
{
	// Execute the command by any way you like
	return cmdObj;
}

You can call Execute in one of the following ways:-

Execute<ReadCmd>(new ReadCmd(/*parameters*/)).ReturnValueFromRead();

or

Execute(new ReadCmd(/*parameters*/)).ReturnValueFromRead();

Is there any difference between the above two ways? Ideally, none. Weird that the intellisense does not infer the exact type for the second way; instead it assumes the type passed is CmdBase. But the code compiles and executes fine (as written and as expected). So is that a bug in the intellisense or is that intentional?

Scrollable Image Viewer !!!

Every time I find it hard with these Microsoft guys. They are smart, I agree. But frequently they miss out vital features in the products they deliver. Something like the picture box control without the facility to view the image with scrolling. I had this situation where I needed a (picture box like) control which must be able to display images in one of the two modes - Blown-up or Full Image.

So what are these two modes? That is a story a bit big for this post. I have posted an article on Code Project that talks about our Scrollable Image Viewer control. Go have a look at it; you may well find it useful in your applications too. In that case, you are free to use and distribute it (CPOL). Oh......bugs are your burden.

If there are any bugs or if you have any suggestions about other improvements in code, let me know.

Extension Methods - A Polished C++ Feature !!!

Extension methods are an excellent feature in C# 3.0. They are a mechanism by which new methods can be exposed from an existing type (interface or class) without directly adding the method to the type. Why do we need extension methods anyway? Ok, that is the big story of lambda and LINQ. But from a conceptual standpoint, extension methods establish a mechanism to extend the public interface of a type. The compiler is smart enough to make the method look like a part of the public interface of the type (underneath, it is still a call to a static method). Yeah, that is what it does, and the intellisense is very cool in making us believe that. It is cleaner and easier (for library developers and even for us programmers) to add extra functionality (methods) not provided in the type. That is the intent. And we know that it was exercised extravagantly in LINQ: IEnumerable<T> was extended with a whole lot of methods to aid the LINQ design. Remember the Where, Select etc. methods on IEnumerable<T>? An example code snippet is worth a thousand words:-

static class StringExtensions
{
	/// <summary>
	/// The 'this' modifier signifies that this is an extension method.
	/// It may appear only on the first parameter of a static method declared in a non-generic, non-nested static class.
	/// Such a method shows up as part of the public interface of the type following the 'this' modifier.
	/// </summary>
	public static int ToInteger(this string s)
	{
	    return Convert.ToInt32(s);
	}
	
	public static string Left(this string s, int position)
	{
	    return s.Substring(0,position);
	}
	
	public static string Right(this string s, int position)
	{
	    return s.Substring(s.Length - position);
	}
}

You might be aware of all this hot news. But our topic of the day is neither Extension Methods nor LINQ. It is something that dates back to C++. And you will see at the end of this post that extension methods are a polished version of a C++ principle. Ok, let us try to read some code:-

class SomeClass;   // forward declaration, so that Add can mention the class

int Add(SomeClass& sc, int x)
{
   // Let us get to here a little later.
}

class SomeClass
{
private: int m_nNum;
public: void SomeMethod(int n);
public: int Num() const
	{
		return this->m_nNum;
	}
};

The code is simple - we have a class called SomeClass and a global function. The global function takes a SomeClass instance by reference and an integer by value. The intent of the function is to add x to m_nNum. Whether to save the result to m_nNum or just return it is a topic we will deal with in a little while. But do we understand that the Add function and SomeClass are closely related?

There are two principles to know in C++ to understand the relation.

Interface Principle

For a class X, all functions, including free functions, that both (a) "mention" X, and (b) are "supplied with" X are logically part of X, because they form part of the interface of X.

* Supplied with X means that the function is provided (distributed with) in the same header file as X.

So now, Add mentions SomeClass and (to keep the discussion short, assume that it) is supplied with SomeClass. If so, then Add is a part of the public interface of SomeClass. That should convince you.
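To make the principle a bit more concrete, here is a hedged sketch of how such a pair might look when shipped together in one header. The constructor, the const reference parameter and the decision to simply return the sum (rather than store it in m_nNum) are my assumptions for illustration; the post leaves that choice open.

```cpp
#include <cassert>

// Imagine everything below lives in one header, SomeClass.h (hypothetical).
class SomeClass
{
public:
    explicit SomeClass(int n) : m_nNum(n) {}   // assumed constructor
    int Num() const { return m_nNum; }
private:
    int m_nNum;
};

// A free function that (a) mentions SomeClass and (b) is supplied with it,
// so by the Interface Principle it is logically part of SomeClass.
// Assumption: it only reads the object and returns the sum.
int Add(const SomeClass& sc, int x)
{
    return sc.Num() + x;
}

// Small self-check for the sketch.
void CheckAdd()
{
    assert(Add(SomeClass(40), 2) == 42);
}
```

Notice that Add needs nothing beyond the public Num() accessor; it extends what a caller can do with SomeClass without touching the class itself, which is exactly the flavour of an extension method.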

Koenig Lookup

When an unqualified name is used as the postfix-expression in a function call, other namespaces not considered during the usual unqualified lookup may be searched, and namespace-scope friend function declarations not otherwise visible may be found. These modifications to the search depend on the types of the arguments.

Those are lines from the C++ standard and must be tough to understand like the verses in the Bible. So let us talk our language to understand that:

namespace CPP
{
     class SomeClass { };
     void Foo(SomeClass);
}

CPP::SomeClass sc;

int main()
{
     Foo(sc);
}

Will you still be surprised that the Foo(sc) call resolves to CPP::Foo? Don't be. That is what the cryptic lines above talk about. Ok, one more example:

namespace CPP
{
	class SomeClass { };
}

void Foo(CPP::SomeClass);

int main()
{
	CPP::SomeClass sc;
	Foo(sc); // Got it ?
}
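Both lookups can be checked with a little program. This is only a sketch: the Other namespace, Bar and the Call/Check helpers are hypothetical names added for the test, and the functions return a string just so we can see which one got picked.

```cpp
#include <cassert>
#include <string>

namespace CPP
{
    class SomeClass { };

    // Declared next to its argument type, inside namespace CPP.
    std::string Foo(const SomeClass&) { return "CPP::Foo"; }
}

// Declared at global scope, outside namespace CPP.
std::string Bar(const CPP::SomeClass&) { return "::Bar"; }

namespace Other   // hypothetical caller with no using-directive for CPP
{
    std::string CallFoo()
    {
        CPP::SomeClass sc;
        return Foo(sc);   // ordinary lookup finds nothing; Koenig lookup searches CPP
    }

    std::string CallBar()
    {
        CPP::SomeClass sc;
        return Bar(sc);   // ordinary lookup walks out to the global scope
    }
}

// Small self-check for the sketch.
void CheckLookups()
{
    assert(Other::CallFoo() == "CPP::Foo");
    assert(Other::CallBar() == "::Bar");
}
```

The call to Foo compiles even though Other never says "using namespace CPP"; the argument type alone drags the CPP namespace into the search.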

So now down to my point: extension methods are a polished version of the Interface Principle (or Koenig Lookup) in C++. The facility has been in C++ for a long time, but I am not sure it has been exercised well (and wisely). Had the intellisense been intelligent enough, C++ would have claimed it as a mighty feature. Since the C++ IDE has sucked in the same way for a long time now, C++ got the wrong image - a hard programming language.

Hey C#, No hard feelings. It is just a perspective. Either way, My compiler compiles your compiler.