Tuesday, July 17, 2012

Four interesting questions - Part III

This is the third post in this series. This question deals with C/C++ concurrency. The code looked somewhat like the following, and the question was: is this Singleton class thread safe? If not, explain why and how to fix it; if so, explain why.

static Singleton * instance();   // declared inside the class, along with: static Singleton * pInstance;

Singleton * Singleton::instance() {
    if (pInstance == 0) {
        Lock lock;               // RAII guard: acquires a mutex, released at end of scope
        pInstance = new Singleton;
    }
    return pInstance;
}

Put another way: when the static instance() member function is called from two threads concurrently, will the singleton property of the class hold (i.e. will only a single instance be created)? The problem here is that when instance() runs in two threads concurrently, either thread can be preempted at arbitrary points (in the machine code this source is compiled to).

If you look closely you will see that the following can occur. With two threads, the first tests whether pInstance equals NULL and is then immediately preempted (gives up the CPU); the second thread is scheduled, also finds pInstance NULL, and goes on to create the instance. When the first thread resumes, it picks up where it stopped and creates a second Singleton object, violating the singleton property. So far so good.
The code above is wrong. But why would someone write such awkward code in the first place, with a lock inside a condition? It is not obvious what the author was trying to do unless you know the double-checked locking pattern. The idea here is that if you write something like,

static Singleton * instance();   // declared inside the class, as before

Singleton * Singleton::instance() {
    Lock lock;                   // acquired on every call, even after creation
    if (pInstance == 0) {
        pInstance = new Singleton;
    }
    return pInstance;
}

every call that accesses the singleton object must acquire the lock, even long after the instance exists (while writing it without any lock provides no thread safety at all). So the intention of the first code segment was to reduce locking: we only need the lock while creating. I think the desired answer for this question was:

static Singleton * instance();   // declared inside the class, as before

Singleton * Singleton::instance() {
    if (pInstance == 0) {        // first check, without the lock
        Lock lock;
        if (pInstance == 0) {    // second check, with the lock held
            pInstance = new Singleton;
        }
    }
    return pInstance;
}

where, after acquiring the lock, it checks again whether the pointer is still NULL; if another thread created the object in the meantime, the existing instance is simply returned.

Interestingly, I later learned that there is no correct way to write an answer to this problem in standard C++; the solution above may break on a standard-compliant C/C++ implementation. This is because the C and C++ standards (before C++11) define neither a concurrency memory model nor threading, although on most platforms the code above will not fail provided pInstance is declared volatile. One excellent article describing how this can break, "C++ and the Perils of Double-Checked Locking", is by Scott Meyers and Andrei Alexandrescu, two leading authorities on C++.

Monday, July 16, 2012

Four interesting questions - Part II

In a previous post I said I was going to write a series of articles on some interesting IT-related questions, and I have already written one. This is the second in the series. It is a C++ programming question. I found it on a C/C++ answer script that was given to me to mark (from what I can remember, no one had come up with a right answer).

The question was to write a C++ function that accepts any STL container and prints all the elements in it.

The point here is that Standard Template Library containers are not derived from a common superclass, as Java collections are. If they were, we could simply write a function accepting that type. (In Java, most generic containers, with the exception of maps, implement the Iterable interface.) In Java we can write something like the following. (It will not handle maps, but we are very close to it :))

public static <T> void print(Iterable<T> iterable) {
    for (T t : iterable) {
        System.out.println(t);
    }
}

How can we do this in C++ for the STL containers? The key to the answer is that C++ templates introduce a different paradigm of programming, based on static polymorphism: we can write a templated function. But how? We need to know a bit more about iterators. Iterators are the common way of accessing the elements of an STL container, and every STL container exposes its own iterator types through the public typedefs iterator and const_iterator. Although the actual iterators are of different types (read about how STL iterators are classified based on their contract), they share the same basic behaviours through overloaded operators. Based on these capabilities, iterators fall into a hierarchy of categories, and if we use only the operations of a fundamental category (a forward iterator), we can accomplish the task.

The rest is about expressing that in valid C++ syntax. Here the typename keyword marks a dependent type, which is validated only when the function template is instantiated for the actual container type. Note that typename must appear before T::const_iterator; otherwise the compiler interprets const_iterator as a static member of class T.

#include <iostream>

template <typename T>
void print(T & con) {
    typename T::const_iterator it = con.begin();
    while (it != con.end()) {
        std::cout << *it << std::endl;
        ++it;
    }
}

Even a map can be passed as an argument, provided that operator<< is implemented for std::pair. The answer to the question is only a few lines, but it opens the way to a world of new possibilities: C++ templates are not only about writing containers of general types. They are much, much more.

Friday, July 6, 2012

Four interesting questions - Part I

This series of posts discusses four interesting IT technical questions that I have come across. For some of you they will not be so interesting, but for me they all have a significance: when I first came across them, I had to solve them all of a sudden, without the aid of the Internet. Later, during general reading, I found interesting further explanations.

01. What should a constructor of an immutable Java class do when it receives a reference argument (another object)? (The question was not that direct; it contained a small code segment. This was an interview question.)

First a bit of background. An immutable object is an object that cannot be modified after it is created, and a class defining such objects is an immutable class. Why do we need immutable objects in the first place? Is there a valid reason? Such objects are costly: we must create a new one whenever a change is needed. But immutable classes offer certain advantages over mutable ones, the main one being how easily they can be shared between threads, with no synchronization to worry about.

The point here is that if we assign the reference straight into an internal field, it leads to potential problems: if the passed object is mutable and the caller later changes it (which is possible, since the caller still holds a reference to it), the compound object changes with it. So the constructor should make a private defensive copy of such arguments, so that we can be sure no outside reference to our state has leaked in.
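A minimal Java sketch of the defensive copy (the class and method names are illustrative, loosely following the Period example from Effective Java; java.util.Date is the classic mutable argument):

```java
import java.util.Date;

public final class Period {
    private final Date start;

    public Period(Date start) {
        // Copy the argument: storing the caller's reference would let the
        // caller mutate our supposedly immutable state later.
        this.start = new Date(start.getTime());
    }

    public Date start() {
        // Return a copy on the way out too, so no internal reference leaks.
        return new Date(start.getTime());
    }

    public static void main(String[] args) {
        Date d = new Date(1000L);
        Period p = new Period(d);
        d.setTime(9999L);                        // caller mutates the original object
        System.out.println(p.start().getTime()); // the Period is unaffected
    }
}
```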

It sounds simple, but this is not the only point we can get wrong; there are a good number of pitfalls when writing immutable classes (making every member private and final, not exposing mutator methods, preventing subclassing, etc.). This really applies to library-quality code, and to anyone who cares about code quality. A respected authority on Java, Joshua Bloch, has dedicated a section of his famous book Effective Java to writing such classes (Item 15).

Sunday, June 17, 2012

When the big Oh matters : A case of O(n) vs. O(log n)

In algorithms classes we are taught to quantify algorithms using big-O values. But in practice we rarely write new algorithms and analyse them.

Recently, I was reading kernel source code and found an interesting algorithm.

The task is to find the number of iterations a loop can execute in one jiffy. Provided the count of elapsed jiffies is available as a global variable, we can calculate it in the following manner.

As a new jiffy starts (let's call it jiffy t), enter a busy-waiting loop of i iterations.
After the loop ends, check whether we have passed jiffy t.
If not, repeat with an incremented i until we pass the current jiffy.

The value of i at which we first outlast jiffy t is the value we need. But if Linux did this, it would take a long time to boot: it is an O(n) algorithm. The basic idea for improving it comes from binary search. If the value we are searching for is between 0 and 2^32, guess 2^31 (the middle) and test. The test tells us whether the answer is smaller or greater than 2^31. Now we have a range half the size; choose its middle and keep narrowing, which takes only O(log n) tests.

The actual implementation uses the same idea with a little twist:

1. First find the value of n for which loops_per_jiffy lies between 2^(n - 1) and 2^n. This is found by starting at 4096 and doubling each time. If the wait first outlasts a jiffy at n = 18, we know the value is between 2^17 and 2^18.

2. Then fix the lower-order bits, from the most significant down. Tentatively set each bit to 1; if the resulting wait outlasts a jiffy, that bit must be 0. Guess all the remaining bits likewise.

I think this is a good example of why computer science knowledge is important for programmers.

Tuesday, June 15, 2010

Using apt-get Offline

As promised in this post, I am going to write about how to use apt-get offline. This is based on my experience as a student. After succeeding with apt-get I tried the same with yum and managed to do it on Fedora too. (I am not going to write about that, because now I rarely use Fedora.)

I used the following steps in Ubuntu (they can be used on any Debian-based system) to install software on an offline system.

1. Edit /etc/apt/sources.list to enable the repositories you need. The following link gives you details on repositories (https://help.ubuntu.com/community/Repositories/Ubuntu). Read it first if you want to know more about repos (you will be offline when doing the real work).

2. Create a list of URLs that will download repository meta data. These files contain a list of all software available on the repository and their dependencies.
apt-get -qq --print-uris update | awk '{print $1}' | sed "s/'//g" > download

I will not explain the command in detail, but after executing it, the file download will contain the list of URLs you need to fetch.

3. Download these files on an Internet-enabled machine (use wget on Linux, or any downloader on Windows, feeding it the URL list).

4. Copy the downloaded files into /var/lib/apt/lists/. This makes the offline computer aware of the contents of the repositories. (These days the files are bzip2-compressed to save space, and you may need to decompress them.)

5. Now generate the list of all dependent files on the offline Linux box using
apt-get install -qq --print-uris <package_to_install> | awk '{print $1}' | sed "s/'//g" > packages

6. Download these files too and copy them into /var/cache/apt/archives/.

7. Now that all the files are in place, install the package:
apt-get install <package_to_install>

You will only need to follow steps 5 through 7 after the initial work is done.

PS: I used this method as a university student, when mobile broadband was uncommon and expensive (at least for a university student). It seems that now this is affordable for most students. Still, these steps are useful if you want to speed up an installation by reusing cached .deb files.

Wednesday, June 9, 2010

Installing software offline in Linux

Three or four years back, as a university student, when mobile broadband was not yet available and Internet connections cost more, I started using Linux. The biggest challenge for me at that time was using it offline. Windows users can easily install a program offline just by clicking on an executable. Some Linux distributions ship a collection of DVDs of extra software, which at least allows old versions of some programs to be installed offline, but not Ubuntu (the Linux distribution I used at the time). It seemed there was no easy way to install new programs on Ubuntu offline.
There are several ways to install a program on Linux. The first is to get the source of the program and compile it yourself. This is not a practical solution for the general user at all. To compile a usable program you need to know which components the program has and how to enable or disable them. For example, compiling the Apache web server requires some knowledge of its features, deciding whether each should be built, and if so, in what mode (static or dynamic). Making matters worse, you will need to find the dependent libraries and install them beforehand. I would not recommend this method; use it only if you are making changes to the sources.
The second method is to use pre-compiled executables for Linux (or Unix in general) published by the program's developers, as in the case of the Sun JDK. The advantage is that most of the extra libraries needed are packed into the executable, so it will just work. The downside is that not all projects provide such executables, and the program does not integrate tightly with the system.
The final method is to use the distribution's package management system. Most Linux distributions have adopted such a system, with formats like Debian's .deb or Red Hat's .rpm as the mechanism for installing new software. Installing a .deb copies some files into your system, runs some initialization scripts, and, most importantly, records the details of the files added. These packages are structured as small units, each depending on many others, so the actual set of packages that must be installed to get something working depends on the current system state and can be tedious to resolve manually without a tool that does it automatically (such as apt-get or aptitude). Offline, these tools fail, since they cannot download the necessary files.
In the next post I will write how I did this in Ubuntu.

First Post

Though this is the first post on this blog, it is not my first blog post. I had a blog before, and after three or four posts I stopped blogging; I doubted whether I could provide any original content of value to others. Anyway, I have now decided to start fresh, and I think it is a wise decision. It is always worthwhile for individuals to keep logs of interesting views and work: we all learn from others' experience, and part of our development is due to this knowledge transfer.

Hope to keep you updated.