Archive for November, 2012

My Home-Based Web and Mail Server: Gone!

At home I had a server that hosted my company’s website and the mail server that processed mail directed at (me at) my company. A nice, small Windows 2008 server that was always on, never crashed, never had any other problems, and always correctly finished its backups; it just did its job. It was implemented on my old (2002, I think) Dell 360 workstation. The only problem we had with it was that it had to be close to the ADSL connection, which is next to our dinner table. The Dell 360 is a relatively silent computer, but you can hear it, and after a few years (for me; for some family members it was already after a few months) the noise gets annoying, and changes have to be made.

One of the joys of a home-based server on an old PC is that it is free: no costs are involved except for some electricity. So the challenge became to replace the home-based server with a solution that is also free, or almost free. That is what this blog post is really about: how can you get basic IT facilities for your home-based startup company (almost) for free? Home-based means, among other things, that internet access already exists.

So, let’s start with the costs you cannot avoid. You will have to register your company; in the Netherlands that will cost you about € 40,- a year. You also need (I suppose) a domain name to present your company on the internet. You don’t exist if you’re not on the internet, right? I registered a domain name at GoDaddy for less than $ 11,- a year.

The next step is to get basic services like e-mail, a calendar, and a website. There are many possible solutions, but since I am a Microsoft… hmm, yes indeed, what is my relation to Microsoft? Well, there are a few large software (development) platforms. There is the open source solution: Linux with C++, Java, and very much more; there is the Apple platform; the rising Google platform; and of course the Microsoft Windows platform with .Net, VC++, and also much more. There are other platforms, also large and respectable, but I’m not going to fill this blog post enumerating software development platforms to some degree of completeness. The point is: Microsoft Windows is the market leader, and that’s why I develop software to run on it. Therefore I have invested time in getting to know the platform and a variety of its products, and therefore I consume more Microsoft products than products from any other platform or supplier. So, if I need e-mail and calendar services, I turn to Microsoft.

It might not be well known, but at Live Domains you can customize the Live Services around your own domain name and give them your company’s look and feel. You can add 500(!) members to your domain, all with their own e-mail address, calendar, etc., all for free. You edit the DNS records (the MX records, in particular) at GoDaddy so that e-mail is indeed delivered to your address at the Live Services.

A web site is harder to get for free, especially if you do not want it to be littered with commercial messages that have nothing to do with your company. The solution I found is a free web site on Azure. See here. In fact, I have two sites there (how versatile): one Silverlight site and an HTML5 site that downloads videos from my (free) SkyDrive. But do we really need a web site? It seems a bit old school. You could also have a free blog at e.g. WordPress.com – yes, right where you are reading this now – and/or a Page on Facebook.

Clearly, the more you search for it, the more free services you find. If you are a developer, there are several possibilities to store your code safely. Open source code can go to GitHub, SourceForge, or of course Microsoft’s CodePlex. My company is at CodePlex, right here. For proprietary software you can use Team Foundation Service (free for up to 5 users – contributors per repository, I suppose).

A GPU Bilateral Filter Implementation

This post reports on a bilateral filter implementation that improves processing time from 32ms to 0.25ms.

Introduction

The Kinect (for Windows) depth data are subject to some uncertainty that comes with the sensor’s resolution. Depth estimates are expressed in millimeters, and typically, successive depth measurements by the Kinect of the same point vary by a fixed amount.

Consider the graphs below. The x-axis counts the number of measurements, the y-axis represents the distance measurements of a single point. The top graph shows connected dots, the lower graph shows just the dots.

The graphs show two tendencies. One is that the measured value is one unit above or one unit below the average practically all of the time; the second is that the average changes a bit before it stabilizes. Here we see it change from about 3.76m via 3.8m to about 3.84m.

If the Kinect depth data is projected onto an image this variation translates into a nervous jitter. Since I do not particularly care for a nervous jitter, I would like to stabilize the depth data a bit.

Stabilizing Kinect Depth Data – Temporal Approach

The Kinect for Windows SDK (1.6) contains a whitepaper on skeletal joint smoothing. The paper deals with the reduction of noise in the Kinect skeletal tracking system. This tracking system employs the same depth data, and therefore suffers from the same problem.

The proposed solution is to filter the data over time. The depth measurement z(x,y)(t) of a location (x, y) at time t can be averaged over a number of measurements in the past at the same location: z(x,y)(t-i) where i is in [1, n]. The suggestion is to take n not too large, say 5.
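
To make this concrete, here is a minimal sketch in plain C++ of such a temporal filter (my own illustration, not the SDK’s code; the flat frame layout and the plain, unweighted average are assumptions):

    #include <cstddef>
    #include <cstdint>
    #include <deque>
    #include <vector>

    // Sketch: average each depth pixel over the last n frames (here n = 5).
    // A depth frame is assumed to be a flat array of millimeter values.
    using DepthFrame = std::vector<std::uint16_t>;

    class TemporalDepthFilter
    {
    public:
        explicit TemporalDepthFilter(std::size_t n = 5) : m_n(n) {}

        DepthFrame Filter(const DepthFrame& current)
        {
            m_history.push_back(current);
            if (m_history.size() > m_n) m_history.pop_front();

            DepthFrame result(current.size());
            for (std::size_t i = 0; i < current.size(); ++i)
            {
                std::uint32_t sum = 0;
                for (const auto& frame : m_history) sum += frame[i];
                result[i] = static_cast<std::uint16_t>(sum / m_history.size());
            }
            return result;
        }

    private:
        std::size_t m_n;
        std::deque<DepthFrame> m_history;
    };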

Averaging can also include measurements from the future. This implies that one or two frames are added to the average before an image based on the depth data is rendered, hence there is a rendering latency equal to the number of ‘future’ frames included in the average. The advantage of considering the ‘future’ is that if the measured scene changes (or a player changes position, in skeletal tracking), another type of averaging can be applied, one that is better suited to changes and, e.g., puts a heavier weight on recent measurements.

I’ve experimented with temporal filtering, but the result was not satisfactory. The fast, nervous jitter just turns into a slower one that is even more disturbing, because the short periods of stability make changes seem more abrupt.

Stabilizing Kinect Depth Data – Spatial Approach

Another approach is not to average over measurements at the same location through time, but to average, within one frame, over several proximate measurements. A standard solution for this kind of filtering is the Bilateral Filter. The Bilateral Filter is generally attributed to Carlo Tomasi and Roberto Manduchi, but see this site, where it is explained that there were several independent discoveries.

The idea behind the Bilateral Filter is that the weight of a measurement in the average is a Gaussian function of both the spatial distance and the similarity (in color, intensity, or, as in our case, depth value). The similarity term prevents edges from being ‘averaged out’.
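
To make the weighting concrete, here is a brute-force sketch in C++ for a single-channel image (an illustration of the textbook filter, not the implementation discussed below; the parameter names and the image layout are my own assumptions):

    #include <cmath>
    #include <vector>

    // Brute-force bilateral filter sketch for a single-channel image with values in [0, 1].
    // Each output pixel is a weighted average of its neighborhood; the weight is the product
    // of a spatial Gaussian (distance) and a range Gaussian (similarity).
    std::vector<float> BilateralFilter(const std::vector<float>& img,
                                       int width, int height,
                                       float sigmaSpatial, float sigmaRange)
    {
        const int radius = static_cast<int>(2.0f * sigmaSpatial);
        std::vector<float> out(img.size());

        for (int y = 0; y < height; ++y)
            for (int x = 0; x < width; ++x)
            {
                const float center = img[y * width + x];
                float sum = 0.0f, weightSum = 0.0f;

                for (int dy = -radius; dy <= radius; ++dy)
                    for (int dx = -radius; dx <= radius; ++dx)
                    {
                        const int nx = x + dx, ny = y + dy;
                        if (nx < 0 || nx >= width || ny < 0 || ny >= height) continue;

                        const float value = img[ny * width + nx];
                        const float spatial = std::exp(-static_cast<float>(dx * dx + dy * dy) /
                                                       (2.0f * sigmaSpatial * sigmaSpatial));
                        const float range = std::exp(-(value - center) * (value - center) /
                                                     (2.0f * sigmaRange * sigmaRange));
                        sum += spatial * range * value;
                        weightSum += spatial * range;
                    }

                out[y * width + x] = sum / weightSum;
            }
        return out;
    }

The computational burden discussed below comes from these nested loops: every pixel visits a (possibly large) neighborhood of other pixels.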

The Bilateral Filter works well; the only drawback it has is its computational complexity: O(N^2), where N is the (large!) number of pixels in the image. So, several people have been working on fast algorithms to alleviate the computational burden. To me it seems that Ben Weiss provided a good solution, but it is not generally available. The solution by Frédo Durand and Julie Dorsey (2002), and the elaboration of this work by Sylvain Paris and Frédo Durand (2006), all from MIT, seems to be the leading approach, and it is generally available – both the theory and example software. Their method has a project site that is here.

In a nutshell, the method by Sylvain Paris and Frédo Durand reduces processing time by first down sampling the image, then applying a convolution to compute the averages, and finally scaling the image up again while clamping out-of-bounds values. So, in essence, it operates on a (cleverly) reduced version of the image.
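
A heavily simplified sketch of that idea (my own illustration of a grayscale ‘bilateral grid’; the real implementation uses proper Gaussians and linear interpolation when splatting and slicing, where this sketch uses nearest-neighbor cells and a 1-2-1 blur):

    #include <algorithm>
    #include <vector>

    // Sketch of the downsample / convolve / upscale scheme for a grayscale image
    // with values in [0, 1]. The image is splatted into a coarse 3D grid
    // (x, y, intensity), the grid is blurred, and the result is read back per pixel.
    std::vector<float> FastBilateralSketch(const std::vector<float>& img,
                                           int width, int height,
                                           float sigmaSpatial, float sigmaRange)
    {
        // Grid dimensions: x and y are downsampled by sigmaSpatial,
        // the intensity axis is quantized in steps of sigmaRange (+ padding).
        const int gw = static_cast<int>(width  / sigmaSpatial) + 3;
        const int gh = static_cast<int>(height / sigmaSpatial) + 3;
        const int gz = static_cast<int>(1.0f   / sigmaRange)   + 3;
        auto at = [&](int x, int y, int z) { return (z * gh + y) * gw + x; };

        std::vector<float> sum(gw * gh * gz, 0.0f);    // accumulated intensities
        std::vector<float> cnt(gw * gh * gz, 0.0f);    // accumulated weights (counts)

        // 1. Down sample ("splat"): accumulate every pixel in its grid cell.
        for (int y = 0; y < height; ++y)
            for (int x = 0; x < width; ++x)
            {
                const float v = img[y * width + x];
                const int gx = static_cast<int>(x / sigmaSpatial + 0.5f) + 1;
                const int gy = static_cast<int>(y / sigmaSpatial + 0.5f) + 1;
                const int gv = static_cast<int>(v / sigmaRange   + 0.5f) + 1;
                sum[at(gx, gy, gv)] += v;
                cnt[at(gx, gy, gv)] += 1.0f;
            }

        // 2. Convolve: a separable 1-2-1 blur along each of the three grid axes.
        auto blurAxis = [&](std::vector<float>& g, int dx, int dy, int dz)
        {
            std::vector<float> tmp(g);
            for (int z = 1; z < gz - 1; ++z)
                for (int y = 1; y < gh - 1; ++y)
                    for (int x = 1; x < gw - 1; ++x)
                        g[at(x, y, z)] = 0.25f * tmp[at(x - dx, y - dy, z - dz)] +
                                         0.50f * tmp[at(x, y, z)] +
                                         0.25f * tmp[at(x + dx, y + dy, z + dz)];
        };
        for (auto* g : { &sum, &cnt })
        {
            blurAxis(*g, 1, 0, 0);
            blurAxis(*g, 0, 1, 0);
            blurAxis(*g, 0, 0, 1);
        }

        // 3. Up scale ("slice"): read the filtered value back for every pixel and clamp.
        std::vector<float> out(img.size());
        for (int y = 0; y < height; ++y)
            for (int x = 0; x < width; ++x)
            {
                const float v = img[y * width + x];
                const int gx = static_cast<int>(x / sigmaSpatial + 0.5f) + 1;
                const int gy = static_cast<int>(y / sigmaSpatial + 0.5f) + 1;
                const int gv = static_cast<int>(v / sigmaRange   + 0.5f) + 1;
                const float filtered = cnt[at(gx, gy, gv)] > 0.0f
                                       ? sum[at(gx, gy, gv)] / cnt[at(gx, gy, gv)]
                                       : v;
                out[y * width + x] = std::min(1.0f, std::max(0.0f, filtered));
            }
        return out;
    }

The work per pixel no longer depends on the size of the spatial neighborhood; the heavy lifting happens on the small grid.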

I’ve downloaded and compiled the software – the really fast version with the truncated kernel – and it requires about 0.032s to process a ppm image of 640×480 pixels (grayscale values), where the spatial neighborhood is set to 16 (pixels) and the ‘similarity’ neighborhood is set to 0.1, so grayscale values that differ by more than 0.1 after transformation to a normalized double representation are not considered in the average. See the image below for a screen shot.

The processing time is, of course, computer dependent, but my pc is not really slow. Although 32ms is a fine performance, it is too slow for real-time image processing. The Kinect produces a frame 30 times per second, i.e. every 33ms, and we do not want to create a latency of about one frame just because of the Bilateral Filter.

GPU Implementation: C++ AMP

In order to improve on the processing time of this fast algorithm, I’ve written a C++ AMP program, inspired by the CPU implementation, that runs on the GPU instead of on the CPU. For information on C++ AMP, see here and here. What I think is great about C++ AMP is that it provides completely general access to general-purpose GPU computing. Having said that, I must also warn the reader that I do not master it to the degree that I could guarantee that my implementation of the Bilateral Filter in C++ AMP is representative of what can be achieved with C++ AMP.
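
To give an impression of what C++ AMP code looks like, here is a sketch of a brute-force bilateral pass as an AMP kernel (an illustration only, not my actual implementation, which follows the down sample / convolve / up scale scheme described above; names and parameters are my own):

    #include <amp.h>
    #include <amp_math.h>
    #include <vector>

    using namespace concurrency;

    // Sketch: a per-pixel bilateral pass expressed as a C++ AMP kernel.
    // The lambda marked restrict(amp) runs on the GPU, once per output pixel.
    void BilateralAmp(const std::vector<float>& src, std::vector<float>& dst,
                      int width, int height, float sigmaSpatial, float sigmaRange)
    {
        const int radius = static_cast<int>(2.0f * sigmaSpatial);
        dst.resize(src.size());

        array_view<const float, 2> input(height, width, src);
        array_view<float, 2> output(height, width, dst);
        output.discard_data();   // output is write-only, no need to copy it to the GPU

        parallel_for_each(output.extent, [=](index<2> idx) restrict(amp)
        {
            const int y = idx[0], x = idx[1];
            const float center = input[idx];
            float sum = 0.0f, weightSum = 0.0f;

            for (int dy = -radius; dy <= radius; ++dy)
                for (int dx = -radius; dx <= radius; ++dx)
                {
                    const int nx = x + dx, ny = y + dy;
                    if (nx < 0 || nx >= width || ny < 0 || ny >= height) continue;

                    const float value = input(ny, nx);
                    const float weight =
                        fast_math::expf(-static_cast<float>(dx * dx + dy * dy) /
                                        (2.0f * sigmaSpatial * sigmaSpatial)) *
                        fast_math::expf(-(value - center) * (value - center) /
                                        (2.0f * sigmaRange * sigmaRange));
                    sum += weight * value;
                    weightSum += weight;
                }

            output[idx] = sum / weightSum;
        });

        output.synchronize();    // copy the result back to dst
    }

The point of interest is parallel_for_each: the loop over all pixels is replaced by a kernel that the runtime schedules over the GPU’s threads.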

The result of my efforts is that the ppm image above can now be processed in a little over 1 ms. Consider the picture below, made with my ATI Radeon HD 5700 graphics card.

What you see here is a variety of timings of the computational phases. The top cycle takes 1.1ms, the middle one takes 1.19ms, and the bottom cycle takes 1.07ms. So, what is in the cycle?

1. The image is loaded onto the GPU, and data structures are initialized. If you want to know more about ‘warming up’ the data and the code, see here. Since this step takes 0.5 to 0.6 ms, it is obviously the bottleneck.

2. Down sampling the image to a smaller version takes around 0.1 ms.

3. Computing the convolution takes 0.35 ms. This is the real work.

4. Upscaling and clamping again takes about 0.1 ms.

A processing time of about 1 ms is satisfactory as a real-time processing time. Moreover, since we may assume the data is already in GPU memory (we need it there to render it to the screen), the GPU upload time does not count against an application of the Bilateral Filter in this context. So we may think of the processing time as being about 0.55 ms, which is absolutely fabulous.
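
As an aside: a common way to take this kind of timing in C++ AMP is to run the code once as a warm-up (which absorbs JIT compilation and the initial upload) and to wait for the accelerator before stopping the clock, since GPU work is asynchronous. A minimal sketch (my assumption of a typical setup, not the exact code behind the screen shots):

    #include <amp.h>
    #include <chrono>

    using namespace concurrency;

    // Sketch: time one C++ AMP phase in milliseconds. The warm-up run absorbs
    // JIT compilation, allocation, and upload; the wait() calls make sure the
    // asynchronous GPU work has actually finished before the clock stops.
    template <typename Phase>
    double TimePhaseMs(accelerator_view& view, Phase phase)
    {
        phase();            // warm-up run
        view.wait();

        const auto start = std::chrono::high_resolution_clock::now();
        phase();            // the measured run
        view.wait();
        const auto stop = std::chrono::high_resolution_clock::now();

        return std::chrono::duration<double, std::milli>(stop - start).count();
    }

    // Hypothetical use:
    //   accelerator_view view = accelerator().default_view;
    //   double ms = TimePhaseMs(view, [&] { RunBilateralFilter(view); });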

New Graphics Card

At about this time, I bought a new graphics card, an Asus NVIDIA GTX 690 (which, for the purposes of this application, yields the same results as a GTX 680, I know). This card was installed in my PC. OK, I didn’t buy a new motherboard, so data is still being uploaded over PCI-e 2.0 and not over PCI-e 3.0 x16 (but in time…). So, does this make a difference? Yes, it does. Look at the screen shot below.

I rearranged the timings a bit to get a better overview. We see that:

1. Data uploading and the warming-up process now take about 0.45 ms.

2. Filtering now takes about 0.25 ms.

From 32ms to 0.25ms. Most satisfying!