Tuesday, July 17, 2012

Four interesting questions - Part III

This is the third post in this series. This question deals with C/C++ concurrency. The code looked somewhat similar to this and the question was whether this Singleton class is thread safe? Explain why not and how to fix or if yes why?

static Singleton * Singleton::instance();

Singleton * Singleton::instance() {
if (pInstance == 0) {
Lock lock;
pInstance = new Singleton;
}
return pInstance;
}

Put it in a another way when the static instance() member function is called in two threads concurrently will the Singleton property of the class hold. (ie. only a single instance will be created). The problem in here is when the instance() member function is called in two threads concurrently, it can be stopped in arbitrary locations (Actually this gets converted into machine code and it get stopped).

If you look closely you will find out that the following can occur. If two threads are there, first one tests to see whether pInstance equals NULL, afterwards which is immediately moved into runnable state (gives up CPU), then the second thread is scheduled and finds out that pInstance is NULL goes on to initiates it. When the first thread is resumed it will also create a new Singleton object (it resumes where it was stopped) and violates the singleton property. So far so good.
The code above is wrong. But why would someone write awkward code as above in first place with a lock inside a condition. Its not obvious what he/she was trying to do unless you know the double checked locking design pattern. The idea hear is if you write something like,

static Singleton * Singleton::instance();

Singleton * Singleton::instance() {
Lock lock;
if (pInstance == 0) {
pInstance = new Singleton;
}
return pInstance;
}

every time you access the singleton object (if declared private in the class) the member function should acquire the lock. (Writing it without locks provides no thread safety). So the intention of the first code segment is to reduce locking (Need to lock only when creating). I think the desired answer for this question was:

static Singleton * Singleton::instance();

Singleton * Singleton::instance() {
if (pInstance == 0) {
Lock lock;
if (pInstance == 0) {
pInstance = new Singleton;
}
}
return pInstance;
}

Where after locking it checks whether the object is still NULL, if so just return the original object.

Interesting, later I came to know that there is no correct way to write a answer to this problem in standard C++. Above solution may break in a standard complaint C/C++ implementation. It seems this is because C/C++ does not define a concurrency memory model or threading in standards but in most platforms the above code will not fail provided pInstance is declared volatile. One of the excellent articles describing a way this will break is by Scott Meyers and Andrei Alexandrescu two leading authorities on C++.

Monday, July 16, 2012

Four interesting questions - Part II

In a previous post I told, I am going to writing a series of articles on some interesting IT related questions and has already written one. This is the second one of that series. Its a C++ programming question. I found this question on a C/C++ answer script that was given for me to mark (From what I can remember no one had come up with a right answer).

The question was to write a C++ function that accepts any STL container and print all the members in it.

The point here is Standard Template Library containers are not derived from a common super class like in Java Collections. If it was the case we could simply write a function accepting that type. (In the case of Java most Generic containers with the exception of maps are derived from the Iterable interface). In Java we can write something similar to this one. (This will not handle maps but we are very close to it :))

public static
<T> void print(Iterable<T> iterable) {
for(T t : iterable) {
System.out.println(t);
}
}

How can we do this in C++ for the STL containers. The key for the answer is C++ Templates introduces a different paradigm of programming (based on static polymorphism). We can write a templated function.
But how?. We need to know a bit more about iterators. Iterators are the common way of accessing the elements in a STL container. Every STL container exposes its own iterator type through the public typedef, const_iterator and iterator. Although the actual iterators are of different types (read about how STL iterators are classified based on their contract) they have basic behaviors (overloaded operators that will do it). Based on these capabilties we can categorize iterators into a hierarchy. If we only use the operators of a fundamental iterator (i.e. forward Iterator
we can accomplish the former task).

The rest is about expressing that in valid C++ syntax. Here the typename key word is used to represent the partial type which is validated when the function is instantiated again for the actual container type. Note the typename keyword should be used before T otherwise it is interpreted as a static member const_iterator of class T.

template
<typename T> void print(T & con) {
typename T::const_iterator it = con.begin();
while(it != con.end()) {
std::cout << *it << std::endl;
++it;
}
}

Even a map can be passed as an argument provided that << operator is implemented for std::pair. The answer for the question is of few lines but it opens the way to a world of new possibilities. C++ templates are not only about writing containers of general types. Its much much more.

Friday, July 6, 2012

Four interesting questions - Part I

This series of posts discusses about four interesting IT technical questions, that I have come across. For some of you these will be not so interesting but for me they have all have a significance. When I first came across them, all of a sudden and needed to solve them (with out the aid of Internet). But later when doing general reading I found interesting further explanations.

01. What should a constructor of a immutable Java class do when it receives a reference argument (another object)? (The question was not that direct it contained a small code segment. This was a interview question.)

First a bit of background. A immutable object is a object that cannot be modified after it is created. A class defining such a class is a immutable class. Why do we need to create immutable objects in first place? Is there a valid reason for it. Such a object is costly, and we need to recreate if it needs a change. But immutable classes offer certain advantages over mutable ones the main one being the ease in how we can handle synchronization in multi threaded programs.

The point here is if we strait away assign the reference into a internal variable it leads to potential problems. If the passed object is not immutable, and we change it (this is possible since we have a reference for it) the compound object is changed. So we should consider creating a private copy for the object of such references so that we can be sure that we had not leaked a reference out.

It sounds simple, but this is not the only point we can get wrong. There are a good number of pitfalls when writing immutable classes. (Making every thing member private and final, not exposing mutator methods, stop inheriting from the class.. etc) This really applies to library quality code and if you really care about your code quality. A respected authority in Java, Joshua Bloch has dedicated a section in his famous book Effective Java on writing such classes (Item 15).

Sunday, June 17, 2012

When the big Oh matters : A case of O(n) vs. O(log n)

In algorithms classes we are thought to quantify algorithms based on big Oh values. But in practice we rarely write new algorithms and analyze them.

Recently, I was reading kernel source code and found a interesting algorithm.

The task here is to find the value of iterations a loop can execute in a jiffie. Provided that the total number of jiffies is available as a global variable, we can calculate it in the following manner.

As a new jiffie starts (lets say it t) enter a busy waiting loop of i iterations.
After the loop ends check whether we have passed the jiffie t.
If not do same with a increment until we pass the current jiffie.

The value of i when we out last jiffie t is the value we need. But if linux does so it will take a long time too boot. (This is a O(n) algorithm). The basic idea for improving this algorithm comes from binary search. If the value we are searching for is between 0 and 2^32 assume it is 2^31 (middle number) and test. From the test we know whether it is smaller than 2^31 or greater than it. Know we have a smaller range, and we choose the middle number of it and come up with a much smaller number.

The actual implementation uses the same idea with a little twist:

1. First find the value of n where the loops_per_jiffie is between 2^(n - 1) and 2^n. This is found by starting at 4096 and each time doubling the value. If the wait outlasts the time of jiffie on n = 18 we know that the it is between 2^17 and 2^18.

2. Find the lower order bits in the significance order using the same method. First assume each bit is 1. If the wait outlast a jiffie it should be 0. Likewise guess all the rest values.

I think this is a good example why computer science knowledge is important for programmers.