Posts Tagged ‘Silverlight’

PInvoking DirectX from Silverlight

Before moving on to Windows 8 development, I decided to write some legacy software. Well actually, this legacy software is perfectly up-to-date Windows 7 level software; tricks presented here will be useful for years to come. It’s just that Windows 8 (Consumer Preview) provides standard solutions to the problems solved here. This blog post discusses the use of a DirectX application, packaged as a DLL, by a Silverlight application, via PInvoke.

The problems tackled here stem from the desire to have Rich Internet Applications (RIAs) for Windows that use computational resources on the client computer: in particular DirectX for 3D-graphics, X3DAudio for 3D-audio, and also the GPU (Graphics Processing Unit – a powerful, highly parallel processor). Silverlight provides the facilities to write RIAs, but has a somewhat outdated 3D-graphics library: a subset of XNA – a managed wrapper for DirectX9 (but we want DirectX11, at least!). This Silverlight 3D-graphics library is also not very extensive; it lacks 3D-audio, for example.

On the other hand, Silverlight does provide facilities for interoperability with native code, e.g. by means of PInvoke: the invocation of native code in Dynamic Link Libraries (DLLs). PInvoke is here the bridge between Silverlight and DirectX code.

This blog post presents:

  • A sample DirectX11 application, and its transformation into a DLL to be used from Silverlight.
  • A Silverlight application that calls methods in the DLL.
  • How to install and uninstall the DLL, and how to manage its lifetime explicitly, so the DLL may be uninstalled by the Silverlight application itself.
  • Performance aspects of the Silverlight-DirectX application, and a comparison with a Silverlight application that uses the Silverlight 3D-graphics library for the same task.
  • Concluding remarks. For one thing, this application should have had 3D-audio to decisively mark the advantage of the approach presented here (but at some point, you just have to wrap up).

The DirectX 11 Sample Application

The DirectX 11 Tutorial05 sample application will serve as the application a user wants to run on his or her PC, and that uses resources already present on that PC. This DirectX application is the simplest application that contains some animation, and it also has a part – the small cube – that we can multiply in order to generate data for various performance loads.

To that end we transform it into a DLL with as much unnecessary functionality as possible stripped, and an adequate interface added, including the code to transfer the data we need in the Silverlight application. Let’s take a look at the main changes.

Minimizing Window Management Code

For starters, we do not need a window; we use the DirectX application only to compute the 3D-graphics we present in the Silverlight application. The wWinMain (application entry point) function now looks like this:

Sample code like above is entered into the text as pictures. If you would like to have the code, just leave a comment on this blog with an e-mail address and I will ship it to you.

The function has no “Windows” parameters any more, nor has it a main message loop. The InitWindow function has been reduced to:

We do need to create a window in order to create a swap chain, and only for that reason, so we keep it as simple and small as possible. Note that the wcex.lpfnWndProc is assigned the DefWindowProc. That is: the application has no WindowProc of its own.

Create Texture to be Used in Export

In order to export the 3D-graphics data, an additional texture (a texture is a pixel array) called g_pOutputImage is created in the InitDevice function:

This texture has usage “Staging”, so the CPU can access it, and we specified CPU access as “Read”. With these settings we can’t bind the texture to the DeviceContext anymore, so no BindFlags. Note that we cannot have a texture that the GPU writes to, and the CPU reads from. Had that been possible, we could have had a data structure that both DirectX and Silverlight could use directly. Since it is not, we will have to perform expensive copy operations. Alas.

A final change in this same function is that we do not release the pointer to the back buffer, but keep it alive in order to export the graphics data in the Render function.

Rendering 3D-Graphics

The Render function has a loop added so we can have multiple small cubes. The idea is to compute a World matrix for each additional small cube. That is, we have only one cube, but draw it multiple times at different locations. Like this:

and:

Converting and Exporting 3D-Graphics Data

Finally, we want to copy the 3D-graphics data into an array the Silverlight client has provided, so that the client can show it to the user. This is done like so:

The above is standard code, I obtained it around here (the direct link seems broken). The ConvertToARGB function, however, is a custom addition, replacing the memcpy call (more about that in the section on performance). ConvertToARGB converts the RGBA format of DirectX to the premultiplied (PM) ARGB format used in Silverlight. This PM ARGB format is considered legacy now. The conversion step is a real performance hit, as anyone can imagine. The function looks like this:

Essentially this ORs 4 integers: the first one is constructed by shifting the A (transparency) byte all the way to the left, then 3 integers are created by shifting the R, G and B bytes into place. This is a fast algorithm since shifting is a quick operation. I found it here. After the conversion, the pixels are in the correct format in an array that is owned by the Silverlight client application.
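Since the original function is only available as a picture, here is a minimal sketch of the shift-and-OR repacking described above. It is not the author's code. It assumes the back buffer uses a format like DXGI_FORMAT_R8G8B8A8_UNORM, so that a pixel read as a little-endian 32-bit integer looks like 0xAABBGGRR, and that the client expects 0xAARRGGBB. The names RgbaToArgb and ConvertToArgbSketch are hypothetical, and (as in the post) no actual premultiplication is performed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Repack one pixel from 0xAABBGGRR (assumed source layout) to 0xAARRGGBB.
// Alpha stays in the top byte; red and blue swap places.
inline std::uint32_t RgbaToArgb(std::uint32_t p) {
    const std::uint32_t a = (p >> 24) & 0xFFu;
    const std::uint32_t b = (p >> 16) & 0xFFu;
    const std::uint32_t g = (p >> 8) & 0xFFu;
    const std::uint32_t r = p & 0xFFu;
    return (a << 24) | (r << 16) | (g << 8) | b;
}

// Convert `count` pixels from `src` into the caller-owned array `dst`.
// Hypothetical helper: the real function also has to deal with the row
// pitch of the mapped staging texture, which is left out here.
void ConvertToArgbSketch(const std::uint32_t* src, std::uint32_t* dst,
                         std::size_t count) {
    assert(src != nullptr && dst != nullptr);
    for (std::size_t i = 0; i < count; ++i) {
        dst[i] = RgbaToArgb(src[i]);
    }
}
```

Every pixel is touched once per frame, which is why this step shows up so prominently in the CPU profile discussed in the performance section.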

The DLL Interface

The interface has the following methods:

And for performance measurements:

The above functions return an average time over the Render function, and an average time over the conversion and export respectively. Details will be discussed below. The

decoration results in a clean export of the function names. Without the decoration, the C++ compiler will add a number of tokens (among which at least a few like @#$%^&*) to the function name in order to make it unique. The problem with this is that you’ll have a hard time retrieving the actual function name for use in the Silverlight client.
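As an illustration of what such a clean export can look like, here is a small sketch that assumes the usual extern "C" technique for suppressing C++ name mangling. The DLL_API macro, the stub bodies and the return values are hypothetical; only the function names and parameter names are taken from the Silverlight declarations further on in this post.

```cpp
#include <cassert>

// extern "C" switches off C++ name mangling, so the function is exported
// under its plain name. __declspec(dllexport) is Windows-specific; the
// fallback only exists so that this sketch also compiles elsewhere.
#if defined(_WIN32)
#define DLL_API extern "C" __declspec(dllexport)
#else
#define DLL_API extern "C"
#endif

// Hypothetical placeholder for the averaged render time (milliseconds).
static double g_renderTimeAvMs = 0.0;

// Stub: the real Init creates the window, device and swap chain.
// Returning 0 for success is an assumption made for this sketch.
DLL_API int Init(int width, int height, const wchar_t* effectFilePath) {
    assert(effectFilePath != nullptr);
    return (width > 0 && height > 0) ? 0 : -1;
}

// Stub: writes the average render time into caller-provided storage.
DLL_API void GetRenderTimerAv(double* pArOut) {
    assert(pArOut != nullptr);
    *pArOut = g_renderTimeAvMs;
}
```

With an export like this, the client can refer to the function simply as Init, using the Cdecl calling convention.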

The Silverlight Client

General Architecture

The application has the following structure:

The App class is the application entry point (as usual). The Application_Startup event handler, depicted below,

first checks if the application is running out-of-browser (OOB). Running OOB is the intended normal use of this application. If so, a MainPage control is instantiated which will run the DirectX code. If the application is running in-browser, it still needs to be installed. Only after installation does the application have access to the file system – required to save and load the DLL – and to the GPU. The application requires Windows 7 or higher and bails out if an older Windows version or an Apple OS is found.

The install page offers to install the application on the user’s PC, as depicted below,

or tells the user that the application is already installed, and hints at ways to uninstall the application if so desired.

If the user installs the application, it starts running out of browser and shows the MainPage with the DirectX animation.

Installing, Uninstalling, and Managing DLL Lifetimes

Installing includes saving the DirectX application in the DLL to a file on the user’s PC. The DLL is packaged with the Silverlight application as a resource. For execution, the DLL has to be loaded in memory, or be present on the PC as a file. Saving the DLL to file is done with code after an example from the NESL application. We store the application at “<SystemDrive>ProgramDataRealManMonths PInvokeDirectXTutorial05”.

Once the DLL is saved to file we load it into memory using the LoadLibrary function from the kernel32.dll. The reason we manage the dll’s lifetime explicitly instead of implicitly by importing the dll, and calling its functions, is that we need to be able to explicitly remove the dll from memory when exiting the application, see below. Loading into memory requires a dll import declaration:

And a call of this function, in the MainPage_Loaded event handler:

Where DllPath is just the path specified above. Is that all? Yes, that’s all.

When the application is exited, we use the handleToDll to release the library with repeated calls to FreeLibrary. Declaration:

Then we call it in the Application_Exit event handler as follows:

The point is that each method we import from the dll increases the reference count. As long as the reference count is larger than zero we cannot unload the DLL, nor delete its file. Not being able to delete the file means we cannot properly uninstall the application – we would leave a mess. Once the ref count is zero, FreeLibrary unloads the library from memory.

The final question in this section is why we delete the dll file every time we exit the application, and create the file every time we start it up. The reason is that if we do not do that, and the user uninstalls the application from the InstallPage (running in-browser), the application does not have the permissions to access the file system, hence the DLL file will not be deleted. So, all these file manipulations are bound to the runtime of the application in order to have a clean install and uninstall experience for the user.

PInvoking the DirectX Functions

Now that the application can be installed, functions from the DirectX application interface can be declared and executed.

[DllImport(DLL_NAME, SetLastError = true, CallingConvention = CallingConvention.Cdecl)]
public extern static int Init(int width, int height,
    [MarshalAs(UnmanagedType.LPWStr)] String effectFilePath);

[DllImport(DLL_NAME, SetLastError = true, CallingConvention = CallingConvention.Cdecl)]
public extern static void Render([In, Out] int[] array);

[DllImport(DLL_NAME, SetLastError = true, CallingConvention = CallingConvention.Cdecl)]
public extern static int Cleanup();

[DllImport(DLL_NAME, SetLastError = true, CallingConvention = CallingConvention.Cdecl)]
public extern static void GetRenderTimerAv(ref double pArOut);

[DllImport(DLL_NAME, SetLastError = true, CallingConvention = CallingConvention.Cdecl)]
public extern static void GetTransferTimerAv(ref double pArOut);

We make a call to the Init function in the MainPage_Loaded event handler, calls to the DLL’s Render function in the local Render method, and a call to Cleanup in the Application_Exit event handler.

Calls to the timer functions are made when the user clicks the “Get Timing Av” button on the MainPage.

Debugging PInvoke DLLs

At times you may want to trace the flow of control from the Silverlight client application into the native code of the DLL. This, however, is not possible in Silverlight. Silverlight projects have no option to enable debugging native code. Manually editing the project file doesn’t help at this point. Now what?

A work around is to create a Windows Presentation Foundation (WPF) client. I did this for the current application. This WPF application does not show the graphics data the DirectX library returns, it just gets an array of integers.

To trace the flow of control into the DLL you need to uncheck “Enable Just My Code (Managed only)” at (in the menu bar) Tools | Options | Debugging, and to check (in the project properties) “Enable native code debugging” at Properties | Debug | Enable Debuggers.

If you now set a breakpoint in the native code and start debugging from the WPF application, program execution stops at your breakpoint.

Reactive Extensions

In order to have a stable program execution, the calls to the dll’s Render method are made on a worker thread. We use two WriteableBitmaps, one is returned to the UI thread upon entering the Silverlight method that calls the dll’s Render method, the other WriteableBitmap is then rendered to by the DLL. After rendering, the worker thread pauses to fill up a time slot of 16.67ms (60 fps).

Thread management and processing the indices that point into the WriteableBitmap array (implementation detail :-) ) is done using Reactive Extensions (RX). The idea is that the stream of indices the method returns is interpreted by RX as an Observable collection and ‘observed’ such that it takes the last index upon arrival, and uses the index to render the corresponding WriteableBitmap to screen. This results in elegant and clean code, as presented below.

The first statement creates an observable collection from a method that returns an IEnumerable. Note that ‘observing’ is on the UI thread (referred to by the ‘DispatcherScheduler’).

The SubscribeOn(ScheduleNewThread)-clause creates a new thread for the render process. The lambda expression defines the action if a new int (index) is observed.

Rendering on the worker thread proceeds as follows:

To stop rendering we just put IsRunning to “false”. And that’s it.
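The worker-side logic described in this section (two buffers, a 16.67 ms time slot, and an IsRunning flag) can be sketched in C++ as follows. This is only an illustration of the scheme, not the Silverlight/RX code from the pictures: RenderLoop, RemainingInSlot, publish and renderInto are hypothetical names, and the callbacks stand in for handing an index to the UI thread and for the call to the DLL’s Render function.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

using Clock = std::chrono::steady_clock;

// One frame slot at 60 fps, rounded to whole microseconds.
constexpr std::chrono::microseconds kFrameSlot{16667};

// Time left in the current slot; zero if rendering overran the slot.
inline std::chrono::microseconds RemainingInSlot(std::chrono::microseconds elapsed) {
    assert(elapsed.count() >= 0);
    return elapsed >= kFrameSlot ? std::chrono::microseconds{0}
                                 : kFrameSlot - elapsed;
}

// Alternates between buffer 0 and buffer 1 until isRunning becomes false.
// Each pass first publishes the index of the finished buffer, then renders
// into the other one, then sleeps for the rest of the frame slot.
void RenderLoop(std::atomic<bool>& isRunning,
                const std::function<void(int)>& publish,
                const std::function<void(int)>& renderInto) {
    int front = 0;
    while (isRunning.load()) {
        const auto start = Clock::now();
        publish(front);               // hand the finished buffer to the UI
        const int back = 1 - front;
        renderInto(back);             // fill the other buffer
        front = back;
        const auto elapsed =
            std::chrono::duration_cast<std::chrono::microseconds>(Clock::now() - start);
        std::this_thread::sleep_for(RemainingInSlot(elapsed));
    }
}
```

Setting the flag to false lets the loop finish its current pass and then return, which mirrors putting IsRunning to “false”.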

Performance

DirectX applications – by definition – have higher performance than .Net applications. However, if you pull out the data from a DirectX application and send it elsewhere, there is a performance penalty. You will be doing something like this:

CPU -> GPU -> CPU -> GPU -> Screen instead of CPU -> GPU -> Screen

The extra actions: copying data from the GPU to CPU accessible memory and converting to Premultiplied ARGB will take time. So the questions are:

  1. How much time is involved in these actions?
  2. Will the extra required time pose a problem?
  3. How does performance compare to the Silverlight 3D-graphics library?
  4. Are there space (footprint) consequences as well?

Before we dive into answering the questions, note that:

– The use of DirectX will be primarily motivated by the need to use features that are not present in the Silverlight 3D-Graphics / -Audio library at all. In such cases comparative performance is not at all relevant. Performance is relevant if the use of DirectX becomes prohibitively slow.

For the measurements I let the system run without fixed frequency; usually you would let the system run at a frequency of 60Hz, since this is fast enough to make animations fluid. At top speed, the frequency is typically around 110Hz. I found no significant performance differences between debug builds and release builds.

Visual Studio 11CP Performance analysis: Sampling

If we run a sampling performance analysis (this involves the CPU only), the bottleneck in the process becomes clear immediately: the conversion from RGBA to premultiplied ARGB (and I’m not even pre-multiplying) takes 96.5% of CPU time.

It is, of course, disturbing that the bulk of the time is spent in some stupid conversion. On the other hand, work done by the GPU is not considered here.

To investigate the contribution of the conversion further, I replaced the conversion by a memcpy call. Then we get a different color palette :-), like this:

But look, the frequency jumps up to 185 fps (roughly 70% more than the typical 110 fps with conversion). The analysis then yields:

That is: much improved results, but shoving data around is still the main time consumer. Note that the change of color palette by the crude reinterpretation of the pixel array is a problem we could solve at compile time, by pro-actively re-coloring the assets.

Compare to a Silverlight 5 3D-library application

Would the performance of our application hold up to the performance of a Silverlight application using the regular 3D-graphics library? To find out I transformed the standard Silverlight 3D-graphics starter application to a functional equivalent of our Silverlight-DirectX application, as depicted below – one large cube and 5 small cubes orbiting around it (yes, one small cube is hidden behind the large one).

If we click the “Get Timing Av” button, we typically get a “Client Time Average” (average time per Draw event handler call) of about 16.67 ms, corresponding to 60 fps. The time it takes to actually render the scene averages 3.3 ms. For the Silverlight-DirectX application (if we let it run at max frequency), this latter time is 0.8 ms without conversion, and 2.8 ms with conversion. So, the Silverlight-DirectX alternative can be regarded as quicker.

If we look at the footprint, we see that the Silverlight-DirectX application uses 1,880K of video memory, and has an image of 50,048K in the Task Manager. The regular Silverlight application uses 5,883K of video memory, and has a 37,708K image. Both run in SLlauncher. So, the regular Silverlight application uses more video memory, but has the smaller image in main memory.

Concluding Remarks

For one, it is feasible to use DirectX from Silverlight. PInvoke is a useful way to bridge the gap. This opens up the road to use of more, if not all, parts of the DirectX libraries. In the example studied here, the Silverlight-DirectX application is faster, but has a larger footprint.

We can provide the user with a clean install and uninstall experience that covers handling and lifetime management of the native dll.

Threading can be well covered with Reactive extensions.

There is a demo application here. This application requires the installation of the DirectX 11 and the Visual C++ 2010SP1 runtime packages (links are provided at the demo application site). I’ve kept these prerequisites separate, instead of integrating their deployment in the demo application installation the NESL way, mainly because the DirectX runtime package has no uninstaller.

If you would like to have the source code for the example program, just leave a comment on this blog requesting it; I’ll send it to you if you provide an e-mail address.

Glassy Buttons on a Magnifier

Glassy buttons, or other controls, have a great appeal because of their transparent and shiny character. They remind us of gems. In user interfaces, the transparent character is also of great use in case you need to see what is beneath the control.

Recently, I dug up the Silverlight Glass Button tutorial by Martin Grayson. This glass button not only has a very fine looking exterior, but also has sophisticated animated behavior. An overview of the steps to create such a button follows below.

Creating the Glassy Button

  1. Create project in Blend
  2. Set the root background to a gradient color that blends well with the color of the glass button.
  3. Create a button and start editing the template (empty template option)
  4. Add a Border control to the template: the button’s Outer Border
    1. Add a second border, within the outer border: the Inner Border. Set the backgroundColor to Black, and the Alpha to 0.5.
    2. Add Grid to Inner Border with a row divider to middle of the grid
    3. Add border to top row: the Shine. Remove the margins, set the V / H alignment to stretch, and set the background to linear gradient from top to bottom. Set the Black gradient stop to Alpha = 60%, color=white. Set Other gradient stop to Alpha = 20%.
  5. Add ContentPresenter to the center of the Button. Set it between Shine and Glow (see below). Set Foreground color to white (or other color that stands out).

This is the standard look of the button. Now add a blue Vista glow for the MouseOver event.

  1. Add another border, called Glow, to the Grid, RowSpan = 2. Width=Auto, H/V align = stretch. The Background is a RadialGradientBrush. Set the left gradient stop to a light blue (141, 189, 255) with alpha: 70%, and the right stop to the same color with alpha = 0. Expand the gradient beyond the bounds of the button to look like the rising sun in the top row. Set the glow behind the Shine. Set the opacity of the control to 0; the animation that will show it will be added later.
  2. Animate the MouseOver state transition by moving the Glow opacity in 0.33 s to 1.0, including revert.
  3. Animate the button Press state transition by hiding the Glow (Visibility:Hidden), setting the Opacity of the Shine to 40%, and setting the Background Border opacity from 80% to 50%.

That’s it. Of course, the color of the button is arbitrary, as is the color of the Shine and Glow. This idea could be extended by providing an extra bottom layer onto which you could project items. Possible items are an image, a video, a Dropshadow of the text in the button’s Content control, etc. Another extension might be to apply the inner shadow by Samuel Jack.

The Magnifier control

The glassy button has been used in a magnifier control. A magnifier can be used to enlarge pieces of the client window in order to study them with more precision. This specific magnifier has five buttons: enlarge what is under the magnifying glass, reduce it, reset it to normal, and then: flip the image horizontally or vertically (toggles). This last functionality may come in handy if you want to study rotated / mirrored images.

Note that this is the second time that we encounter the need for a jog / shuttle control: to be able to rotate the magnified image to an arbitrary angle, apart from mirroring, would be an improvement. In this specific application we would use the jog / shuttle control as a rim around the magnifying glass.

Okay, back to the Glassy Buttons. They fit nicely in this design since you can look through them when positioning the magnifier. The outer border of the magnifying glass has been transparently colored using a rainbow-like gradient so you can find it again, even in a multicolored image like the one in the example App.

In order to position the magnifier, you can drag it around by the rim. The functionality was added using the Expression Blend MouseDragElementBehavior. Although this behavior doesn’t work for Buttons, it works for controls that contain Buttons.

The magnifier uses two WriteableBitmaps: one to hold an image of the LayoutRoot, and one to hold the image of the magnifier. Copying is done using the Clear and the Blit function in the WriteableBitmap Extensions. Of course, before copying, the visible rectangle has to be determined, for situations in which the magnifier is only partly visible (off screen / covered).

A specific problem solved was how to initially get the bitmap of the LayoutRoot. If the image is loaded from the hosting web site, and is not present in the xap itself, the LayoutRoot bitmap remains empty until created after the first rendering. To be clear, the creation of the LayoutRoot bitmap is located in the SizeChanged event handler. So, we have the initial LayoutRoot bitmap created the first time the magnifier bitmap is created – code guarded by a Boolean flag :-(.

The magnification itself, as well as the horizontal and vertical flips, is done by means of a simple pixel shader.
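The shader itself is not shown in this post. As a rough illustration of what such a shader typically computes per pixel, here is a C++ sketch of the texture-coordinate mapping. It assumes normalized coordinates in the range 0..1 and scaling about the centre of the magnifier; the names Uv and MagnifierSample are hypothetical, and the real shader may well differ.

```cpp
#include <cassert>

// A normalized texture coordinate, both components in the range 0..1.
struct Uv {
    double u;
    double v;
};

// Maps an output coordinate of the magnifier to the source coordinate
// that should be sampled. A flip mirrors the coordinate around the
// centre line; zoom > 1 samples a smaller source region, which makes
// the content appear enlarged.
Uv MagnifierSample(Uv out, double zoom, bool flipH, bool flipV) {
    assert(zoom > 0.0);
    const double u = flipH ? 1.0 - out.u : out.u;
    const double v = flipV ? 1.0 - out.v : out.v;
    return {0.5 + (u - 0.5) / zoom, 0.5 + (v - 0.5) / zoom};
}
```

For example, with a zoom factor of 2 the right edge of the magnifier (u = 1) samples the source at u = 0.75, so only the middle half of the source is stretched over the whole glass.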

Glossy Button

The MS Expression Blend tutorials contain a tutorial on creating a glossy button. Well, I like shiny things, so I got into it. The tutorial is very practically oriented, so in this blog post we’ll start with some remarks on lighting practices that might create the illusion of a shiny object on your screen.

Lighting

With respect to lighting concepts, I will somewhat follow Frank D. Luna here. Factors at play in lighting are the directional and wavelength attributes of the light(s) involved, and the reflective properties of the material the light falls on.

Light

For people, colored light is a compound of Red, Green and Blue light in varying intensities. The relative intensities of these components define the colors we know.

Ambient lighting is lighting of an object by indirect light, light that has been reflected multiple times, and thus has no specific source. In Ambient lighting models, neither the position of the observer nor the position of the light source plays a role. You might think of ambient light as showing the color of an object (dependent on the color of the light, of course).

Diffuse lighting of an object describes the scattering reflections of light coming from a specific source. The smoother the surface, the less diffuse its reflection, the shinier it looks. In Diffuse lighting models, the position of the light source, but not the position of the observer is defined. We could say that diffuse lighting creates a gloss on smooth surfaces.

Specular lighting of an object involves directed light that reflects (mostly) in a specific direction – ‘a cone of reflection’. In Specular lighting both the position of the light source and the position of the observer are relevant – the observer might not see the specular reflection. We might say that specular lighting creates the shine on a smooth surface.

Light sources come in three flavors: parallel light sources, like the sun; point light sources, like a light bulb; and spotlights, like a flashlight. In 3D models, parallel light has a direction but no source location. Light from a point source has a source position and an intensity that falls off with the square of the distance from the source; it lights objects in all directions. Spotlights have a source position, a specific direction and also attenuation. According to Lambert’s Cosine Law, reflected light intensity is proportional to the cosine of the angle between the incoming light and the surface normal, so perceived reflected light intensity depends on both the angle and the distance.
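To make the combination of angle and distance concrete, here is a small sketch of diffuse lighting from a point light: Lambert’s cosine term multiplied by an inverse-square falloff. It is a simplified model under stated assumptions (the surface normal is already normalized, and `source` is the intensity at unit distance); the names Vec3 and PointLightDiffuse are hypothetical and not taken from Luna’s book.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct Vec3 {
    double x;
    double y;
    double z;
};

inline Vec3 Sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
inline double Dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Diffuse intensity received at `point` (unit surface normal `normal`)
// from a point light at `lightPos` with intensity `source` at distance 1.
double PointLightDiffuse(Vec3 point, Vec3 normal, Vec3 lightPos, double source) {
    const Vec3 toLight = Sub(lightPos, point);
    const double dist2 = Dot(toLight, toLight);
    assert(dist2 > 0.0);  // the light must not sit exactly on the surface point
    const double dist = std::sqrt(dist2);
    // Lambert: cosine of the angle between normal and light direction,
    // clamped to zero when the light is behind the surface.
    const double cosTheta = std::max(0.0, Dot(normal, toLight) / dist);
    // Inverse-square attenuation with distance.
    return source * cosTheta / dist2;
}
```

A light straight above the surface at distance 2 delivers a quarter of its unit-distance intensity; tilting it to 60 degrees from the normal at the same distance halves that again.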

Material

Material properties define which wavelengths will be absorbed and which will be reflected (and to what extent). This defines the color and smoothness properties. Rough material will reflect light diffusely in all directions. If subjected to a strong light source, it will glow brightly, but will have hardly any gloss, and no shine. Conversely a smooth object will have a clear gloss and almost act as a mirror for the specular light (for an observer in an adequate position).

Modeling lighting of a glossy button

In a DirectX or XNA application you can model lighting extensively, and the result will be quite realistic. This realism comes at the price of significant resources, which cannot be spent on ‘just’ a button, so the same effects will have to be simulated by other means. In this section we will first analyze the construction of the glossy button from the tutorial, and then add some more features – for fun and enlightenment.

The Glossy Button Tutorial starts out with an ellipse, the button, that has a gradient color (and a robust edge). The gradient thus covers both ambient and diffuse lighting. A second layer, the gloss, consists of an ellipse that has a white gradient color, running into transparency. It also has a mild blur effect. This ellipse covers about two thirds of the button. The final layer, the shine, is a third ellipse, a much smaller one that also has a gradient white color, moving into transparency. The result is pretty cool, isn’t it?

Had I already mentioned the drop shadow (bottom right)?

Well, although pretty cool, one may have some disturbing questions, like: ‘What exactly is the shape of this button?’, or, ‘Where does the light come from?’ The shape cannot be a sphere: for that, the gloss extends too far to the bottom, or alternatively, the shine is too close to the top. Also, it is weird that the drop shadow is bottom right, while the shine is in the middle. I also do not think that this button is very shiny; it doesn’t look like it has a top layer of glass, like really shiny things do (I will get into this really shiny stuff in another blog post). Finally, it is strange that the rim has the same color all around the button. So, the conclusion is that the ingredients may be there, but the recipe isn’t quite right.

I have tried to improve a bit on these shortcomings but in the eyes of the reader it might just as well have become worse, so read on ;- ).

The glossy button demo application

The demo application has a number of additional features, the simplest being a text on the button. Also, the size of the button is rather large here, but it can be set differently using Width and Height dependency properties without adversely affecting the visual properties or behavior of the control.

Color picker

The user can pick a color for the button using the color picker. The color you select is the bright color in the gradient. The application adds the dark end of the gradient itself. The rim is also set to the dark color derived from the selected color. The color picker control is part of the CodePlex Silverlight Contrib project.

Follow the mouse pointer

Although I’m not a fan of things on your desktop that follow your mouse pointer, I did provide the option here. It was either follow the mouse pointer or develop a kind of jog/shuttle control to move the three gradients over the surface of the button, and this is quicker. So what you see is that within a certain range around the button, the button gradient, the gloss and the shine all follow the mouse pointer. The button does seem to have a spherical shape when you move around.

Proximity color effect

A more experimental addition is that when the mouse pointer comes close to the button, light intensity increases. This means also that for some lighter colors, the color changes if the mouse pointer is over the button. This would agree with the common experience that harsh white light dims or removes color.

Mouse Pressed visual state

When the left mouse button is pressed, a simple animation increases the size of the gloss and shine.

Concluding remarks

After some pondering about some nagging dissatisfaction with the results I realized that a really shiny button also reflects the objects nearby. Reflection is what really makes the difference. So, if you really want to have shiny buttons, you are in for true 3D models in your user experience. Although that seems a far cry, the general use of 3D modeled user interfaces seems to gain momentum now. Just note tendencies like the use of the Kinect, the integration of Silverlight and XNA, the use of 3D in CSS3 and Html5, and, of course 3D video – without glasses even.

On the other hand, shiny buttons that do not reflect are already part of the Apple UX. In that case you see that the buttons are more like colored glass. These subjects, glassy buttons, and 3D user interfaces, will be subjects of upcoming blog posts.

The Windows 8, HTML 5, and Silverlight Rumor Circus

This blog post gives an overview of the recent wave of fear and anger across the internet concerning the future of .Net in Windows 8 (which could be released in Autumn 2012), and explains why it is all a storm in a teacup.

Where and when it started

It all started with the demo of Windows 8 at the AllThingsD Conference (June 1) by Julie Larson-Green and Steven Sinofsky, who mentioned that the applications presented were, and further applications could be, written in Html5 and JavaScript (version 1.8.5, together with CSS 3.0 called the HTML technology stack). Throughout the demo there was no mention of Silverlight or WPF. “This, What Has Not Been Said” tapped into already slumbering fears that HTML5 will outcompete Silverlight. Heated reactions followed in discussion groups. See e.g. this one, where the thread was closed by the moderator. Some people, clearly driven by a distinct dislike of the Microsoft company, stirred up the fire, as did a journalist of a respected medium.

What it is about

The fear mentioned above is the fear of many developers that costly investments of time and effort will become useless with the release of Windows 8. Of course, if there is only the risk that .Net software would be legacy at the release of Windows 8, investments in .Net and Silverlight software would stop immediately. Not only is there fear, but also dislike. Some developers express the opinion that the HTML5 tech(nology) stack is inferior to Silverlight, and also tedious to work with.

Does it seem justified?

I myself doubt this fear and uproar is justified. The HTML5 tech stack consists of ‘standards’ that are not finished, and have implementations that diverge across browsers, thus forcing web application developers to provide multiple implementations of the same functionality – who would like to pay for that? I know people that sell over the web and implement their web shop in HTML version Long.Ago to guarantee broad accessibility; this is what the Browser Wars have accomplished – people do not like to invest in new versions of the Html tech stack. Microsoft will not make itself dependent on these ‘standards’ it does not control.

Furthermore, Silverlight can do things that the HTML5 tech stack in itself cannot, or never will be allowed to do, if only for security reasons (see e.g. Microsoft’s refusal to support WebGL). Would Microsoft suddenly turn around, embrace this technology and replace its own? Unlikely.

However, Microsoft also didn’t move to take away the fears; they have the Build Conference in September 2011 at which they will tell more. Nevertheless, one might expect some indirect damage control, and it came quickly.

Damage control – differing opinions

First there was the blog post by David Burela referring to analysis of a leaked early Windows 8 build. The conclusion seems to be that software that will be built on top of Windows 8, will be built with .Net and Xaml. A particular group of applications, called ‘immersive’ applications (running within the Windows Shell) can be built using Html5 and JavaScript, much like some types of applications for Vista.

Mary Jo Foley of ZDNet goes further and publishes parts of correspondence with developers who have actually analyzed the early Windows 8 build. The picture that arises from that article is that an improved version of the .Net runtime (referred to as the Windows Runtime) will be central in Windows 8, and is programmable by Xaml in concert with a wide variety of programming languages, i.e. it is like Silverlight / WPF. Within Windows 8, XNA can be used for 3D graphics. Windows 8 apps built using ‘.Net’ should be easy to port to other devices – phones, tablets, etc. say by recompiling and compensating for the form factor. The Html5 apps might depend on the Windows Runtime as well (MD: this would explain the little understood remarks from Microsoft about native support of Html5). WPF and Silverlight may cease to exist as such in Windows 8, but the constituent technologies will be there.

In conclusion, it seems as if Microsoft is creating the facilities to build apps in Windows 8 using the HTML5 tech stack, as an addition to the .Net framework, rather than as a replacement. The motivation to do so, by the way, is to attract more, new developers to the platform. It does not seem to be the case that ‘immersive’ applications can be built only in Html5. It seems that ‘Immersive’ is just a namespace, defining an API that is required to build applications which run within the Windows 8 shell.

C++

OK, so software can be built using a variety of .Net languages, among which C++, and the runtime seems to be closer to the metal, thus providing higher performance, because intermediate OS layers have been removed.

But there is more. C++ seems to be in what is called a ‘renaissance’. More developers use it in order to gain higher performance, a new specification (C++0x) is on its way, and C++ was recently declared by Google to be the best high performance programming language.

For the next version of Visual Studio (also to be released in 2012) Microsoft announced the C++ AMP (Accelerated Massive Parallelism) library at the AMD Fusion Developer Summit. AMP promises to provide full C++ access to a heterogeneous set of processors and their memory models. That is to say: you write one program that executes both on a computer with a GPU, as well as on one without it. Note that GPUs are not considered to be restricted to rendering graphics. These people consider a GPU a broadly applicable parallel processor (and indeed, there exist graphics cards without a monitor connector). The demos shown by Sutter and Moth reflect awesome performance: over 1000 GFlops.

AMP aims at extreme scalability of single executables, from very simple hardware architectures of a single core processor with dedicated RAM, up to extreme scaling out in Cloud configurations. Sutter showed the heterogeneity aimed for by running an executable on a pc with a multi core CPU with onboard GPU that also had a dual separate GPU installed.

My guess is that all this nice stuff will also reflect on the ways software can be built with the evolution of .Net and Xaml on Windows 8.

Pixel Shader Based Expansion

Best animation performance in Silverlight 4 is obtained from the combination of procedural animation and a pixel shader, as reported in a previous blog. A pixel shader is not really meant to be used for spatial manipulation, but in Silverlight 4 vertex and geometry shaders are not available. Also, pixel shaders are limited to Model 2 shaders in Silverlight 4, and only limited data exchange is possible. The question is then, “how far can we push pixel shaders model 2”. Another previous blog post discussed a preliminary version of dispersion. This post is about Expansion. The effect is much like Dispersion, but the implementation is quite different – better if I may say so. Expansion means that a surface, in this case a playing video is divided up in blocks. These blocks then move toward, and seemingly beyond the edges of the surface.

The Pixel Shader

The pixel shader was created using Shazzam. The first step is to reduce the image within its drawing surface, thus creating room for expansion. Expansion is in fact just translation of all the blocks in distinct directions that depend on the location of the block relative to the center of the reduced image.

In the pixel shader below, parameter S is for scaling, reducing, the pixel surface. Parameter N defines the number of blocks along an axis. So if N = 20, expansion concerns 400 blocks. In the demo App I’ve set the maximum for N to 32, which results in 1024 blocks tops. Parameter D defines the distance over which the reduced image is expanded. If the maximum value for D is the number of blocks, all the blocks (if N is even), or all but the center block (N is odd) just ‘move off’ the surface.

The code has been commented extensively, so it should be self-explanatory. The big picture is that the reduced, centered and then translated location of the blocks is calculated. Then we test whether a texel is in a block, not in an inter-block gap. If the test is positive, we sample the unreduced, uncentered and untranslated image for a value to assign to the texel.

sampler2D input : register(s0);

// new HLSL shader

/// <summary>Reduces (S: Scales) the image</summary>
/// <minValue>0.01</minValue>
/// <maxValue>1.0</maxValue>
/// <defaultValue>0.3</defaultValue>
float S : register(C0);

/// <summary>Number of blocks along the X or Y axis</summary>
/// <minValue>1</minValue>
/// <maxValue>10</maxValue>
/// <defaultValue>5</defaultValue>
float N : register(C2);

/// <summary>Displaces (d) the coordinate of the reduced image along the X or Y axis</summary>
/// <minValue>0.0</minValue>
/// <maxValue>10.0</maxValue> // Max should be N
/// <defaultValue>0.2</defaultValue>
float D : register(C1);

float4 main(float2 uv : TEXCOORD) : COLOR
{
	/* Helpers */
	float4 Background = {0.937f, 0.937f, 0.937f, 1.0f};
	// Length of the side of a block
	float L = S / N;
	// Total displacement within a Period, an inter block gap
	float d = D / N;
	// Period
	float P = L + d;
	// Offset. d is subtracted because a Period also holds a d
	float o = (1.0f - S - D - d) / 2.0f;
	// Minimum coord value
	float Min = d + o;
	// Maximum coord value
	float Max = S + D + o;

	// First filter out the texels that will not sample anyway
	if (uv.x >= Min && uv.x <= Max && uv.y >= Min && uv.y <= Max)
	{
		// i is the index of the block a texel belongs to
		float2 i = floor( float2( (Max - uv.x ) / P , (Max - uv.y ) / P  ));

		// iM is a kind of macro, reduces calculations.
		float2 iM = Max - i * P;
		// if a texel is in a centered block,
		// as opposed to a centered gap
		if (uv.x >= (iM.x - L) && uv.x <= iM.x &&
		    uv.y >= (iM.y - L) && uv.y <= iM.y)
		{
			// sample the source, but first undo the translation
			return tex2D(input, (uv - o - d * (N -i)) / S );
		}
		else
			return Background;
	}
	else
		return Background;
}
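The coordinate arithmetic of this shader can be checked on the CPU. The following is a minimal sketch in Python, not part of the original demo: the function name `expansion_source_uv` is made up for illustration, and the default parameter values are taken from the Shazzam comments above. It returns the source coordinate a texel would sample, or `None` where the shader would return the background color.

```python
import math

def expansion_source_uv(u, v, S=0.3, N=5, D=0.2):
    """CPU model of the Expansion shader (hypothetical helper).

    Returns the (x, y) source coordinate sampled for output texel (u, v),
    or None where the shader returns the background color.
    """
    L = S / N                      # side length of a block
    d = D / N                      # inter-block gap
    P = L + d                      # period: one block plus one gap
    o = (1.0 - S - D - d) / 2.0    # offset that centers the expanded image
    lo = d + o                     # minimum coordinate covered by blocks
    hi = S + D + o                 # maximum coordinate covered by blocks

    # First filter out the texels that will not sample anyway
    if not (lo <= u <= hi and lo <= v <= hi):
        return None

    source = []
    for c in (u, v):
        i = math.floor((hi - c) / P)   # block index, counted from the maximum edge
        upper = hi - i * P             # upper edge of that block
        if not (upper - L <= c <= upper):
            return None                # texel lies in a gap between blocks
        # Undo the translation, then undo the reduction
        source.append((c - o - d * (N - i)) / S)
    return tuple(source)
```

With the default values the center of the surface maps back to the center of the video, while a texel such as (0.35, 0.5) falls in the gap between the first and second block and yields the background.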

Client Application

The above shader is used in an App that runs a video fragment, and can be explored at my App Shop. The application has controls for image size, number of blocks, and distance. The video used in the application is a fragment of “Big Buck Bunny”, an open source animated video, made using only open source software tools.

Animation

Each of the above controls can be animated independently. Animation is implemented using Storyboards for the slider control values. Hence you’ll see them slide during animation. The App is configured to take advantage of GPU acceleration, where possible. It really needs that for smooth animation. Also the maximum frame rate has been raised to 1000.

Performance Statistics

The animations run at about 225 FPS on my pc. This requires significant effort from the CPU: about 50% of the processor time. The required memory approaches 2.3Mb.

Pixel Shader Based Panning

Best animation performance in Silverlight 4 is obtained from the combination of procedural animation and a pixel shader, as reported in a previous blog. I know, a pixel shader is not really meant to be used for spatial manipulation. However, in Silverlight 4 vertex and geometry shaders are not available. Also, pixel shaders are limited to Model 2 shaders, and only limited data exchange is possible. The question is then, “how far can we push pixel shaders model 2”. Another previous blog post discussed a preliminary version of dispersion. This post is about Panning. Panning means here that the size and coordinates of a sub frame of, in this case a video, are changed.

The Pixel Shader

The pixel shader was created using Shazzam. The first step is to reduce the image within its drawing surface, thus creating a frame that works as a window through which parts of the video are visible. Panning is in fact just changing the coordinates of this window while correcting for the change in coordinates when sampling the source texture.

In the pixel shader below, BlockSize, BlockX and BlockY, all in [0, 1], define the panning window. The Bounds function centers the panning window for coordinate values of 0.5. In the main function, the ‘space’ variable denotes the available space to move the panning window around in. BlockH and BlockV each hold the start and end coordinate of the panning window, along the horizontal and the vertical axis respectively. We use these to filter input coordinates that sample the color texture. Other texels get assigned the background color of the demo application.

sampler2D input : register(s0);

// new HLSL shader

/// <summary>Size of Block</summary>
/// <minValue>0.0</minValue>
/// <maxValue>1.0</maxValue>
/// <defaultValue>1.0</defaultValue>
float BlockSize : register(C2);

/// <summary>Horizontal Block selection</summary>
/// <minValue>0.0</minValue>
/// <maxValue>1.0</maxValue>
/// <defaultValue>0.5</defaultValue>
float BlockX : register (C3);

/// <summary>Vertical Block selection</summary>
/// <minValue>0.0</minValue>
/// <maxValue>1.0</maxValue>
/// <defaultValue>0.5</defaultValue>
float BlockY : register (C4);

float2 Bounds(float coord, float space)
{
	// Scale coordinate to available space
	float blockStart = coord * space;
	return float2(blockStart, blockStart + BlockSize); // 2nd argument is block end
}

float4 main(float2 uv : TEXCOORD) : COLOR
{
	/* Helpers */
	// Background Color
	float4 Background = {0.937f, 0.937f, 0.937f, 1.0f};
	// Available space to move around
	float space = 1.0f - BlockSize;

	/* Define Block */
	float2 BlockH = Bounds(BlockX, space);
	float2 BlockV = Bounds(BlockY, space);

	// If uv in Block, sample
	if (uv.x >= BlockH.x && uv.x <= BlockH.y &&
	    uv.y >= BlockV.x && uv.y <= BlockV.y)
		return tex2D(input, uv);
	else
		return Background;
}
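To see how the Bounds function places the window, here is a small CPU-side sketch in Python. It is an illustration only; the names `bounds` and `panning_sample` are hypothetical, and the block size is passed in explicitly rather than read from a shader register.

```python
def bounds(coord, block_size):
    """Start and end of the panning window along one axis.

    coord in [0, 1] selects the position within the available space,
    so 0.5 centers the window.
    """
    space = 1.0 - block_size           # available space to move around in
    start = coord * space
    return start, start + block_size

def panning_sample(u, v, block_size=0.5, block_x=0.5, block_y=0.5):
    """Return the sampled coordinate, or None for the background color."""
    left, right = bounds(block_x, block_size)
    top, bottom = bounds(block_y, block_size)
    if left <= u <= right and top <= v <= bottom:
        return (u, v)                  # sampled without any coordinate change
    return None
```

For a block size of 0.5, a coordinate of 0.0 gives the window [0.0, 0.5], 0.5 gives [0.25, 0.75], and 1.0 gives [0.5, 1.0]. The sample coordinate itself is not corrected, which is why the window reveals a different part of the video as it moves.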

Client Application

The above shader is used in an App that runs a video fragment, and can be explored at my App Shop. The application has controls for panning window size, X-coordinate and Y-coordinate of the panning window on the video surface. The video used in the application is a fragment of “Big Buck Bunny”, an open source animated video, made using only open source software tools.

Animation

Each of the above controls can be animated independently. Animation is implemented using Storyboards for the slider control values. Hence you’ll see them slide during animation. The App is configured to take advantage of GPU acceleration, where possible. It really needs that for smooth animation. Also the maximum frame rate has been raised to 1000.

Performance Statistics

The animations run at about 270 FPS on my pc. This requires significant effort from both the GPU and the CPU, each about 35% of the processor time. The required memory approaches 2.2Mb.

Pixel Shader Based Translation

Best animation performance in Silverlight 4 is obtained from the combination of procedural animation and a pixel shader, as reported in a previous blog. I know, a pixel shader is not really meant to be used for spatial manipulation. However, in Silverlight 4 vertex and geometry shaders are not available. Also, pixel shaders are limited to Model 2 shaders, and only limited data exchange is possible. The question is then, “how far can we push pixel shaders model 2”. Another previous blog post discussed a preliminary version of dispersion, which is fairly complicated. This post is about translation. Translation means here that the coordinates in the 2-dimensional plane of a reduced image, in this case a video, are changed.

The Pixel Shader

The pixel shader was created using Shazzam. The first step is to reduce the image within its drawing surface, thus creating the room required for translation. Translation is in fact just changing the coordinates of the reduced images while correcting for the change in coordinates when sampling the source texture.

In the pixel shader below, the if statement does the filtering. Left and Right are parameterized offset and cutoff values. The offset ‘Left’ is also used to correct for the displacement in sampling. Sampling is done by the ‘tex2D’ function.

sampler2D input : register(s0);

// new HLSL shader

/// <summary>Reduces the image</summary>
/// <minValue>0.01</minValue>
/// <maxValue>1.0</maxValue>
/// <defaultValue>0.33</defaultValue>
float Scale : register(C0);

/// <summary>Changes the coordinate of the reduced image along the X axis</summary>
/// <minValue>0.0</minValue>
/// <maxValue>1.0</maxValue>
/// <defaultValue>0.5</defaultValue>
float OriginX : register(C1);

/// <summary>Changes the coordinate of the reduced image along the Y axis</summary>
/// <minValue>0.0</minValue>
/// <maxValue>1.0</maxValue>
/// <defaultValue>0.5</defaultValue>
float OriginY : register(C2);

float4 main(float2 uv : TEXCOORD) : COLOR
{
	float4 Background = {0.937f, 0.937f, 0.937f, 1.0f};
	float2 Origin = {OriginX, OriginY};
	// Reduce nr. of computations
	float halfScale = 0.5 * Scale;
	float2 Left = Origin - halfScale;
	float2 Right = Origin + halfScale;

	// Filter out texels that do not sample (the correct location)
	if (uv.x >= Left.x && uv.x <= Right.x && uv.y >= Left.y && uv.y <= Right.y)
		return tex2D(input, (uv - Left) / Scale);
	else
		return Background;
}
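The correction applied when sampling can be verified with a few lines of Python. This is a sketch under the assumption that a plain function on (u, v) is an adequate stand-in for the shader; `translation_source_uv` is a made-up name.

```python
def translation_source_uv(u, v, scale=0.33, origin_x=0.5, origin_y=0.5):
    """CPU model of the Translation shader (hypothetical helper).

    Returns the source coordinate sampled for texel (u, v),
    or None where the shader returns the background color.
    """
    half_scale = 0.5 * scale
    left = (origin_x - half_scale, origin_y - half_scale)    # offset
    right = (origin_x + half_scale, origin_y + half_scale)   # cutoff

    # Filter out texels outside the reduced, translated image
    if left[0] <= u <= right[0] and left[1] <= v <= right[1]:
        # Undo the translation, then undo the reduction
        return ((u - left[0]) / scale, (v - left[1]) / scale)
    return None
```

For a scale of 0.5 centered at (0.5, 0.5), the texel (0.25, 0.25) samples the source at (0, 0) and (0.75, 0.75) samples it at (1, 1), so the whole video is shown at half size.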

Client Application

The above shader is used in an App that runs a video fragment, and can be explored at my App Shop. The application has controls for video surface reduction, translation along the X-axis, and translation along the Y-axis. The video used in the application is a fragment of “Big Buck Bunny”, an open source animated video, made using only open source software tools.

Animation

Each of the above controls can be animated independently. Animation is implemented using Storyboards for the slider control values. Hence you’ll see them slide during animation. I didn’t use procedural animation since, as discussed in the aforementioned previous blog post, it is hindered by the implementation of the MediaElement.

The App is configured to take advantage of GPU acceleration, where possible. It really needs that for smooth animation. Also the maximum frame rate has been raised to 1000.

Performance Statistics

The animations run at about 250 FPS on my pc. This requires significant effort from both the GPU and the CPU. The required memory reaches a little over 3.5Mb.

Spatial Transformation by Pixel Shader

As reported in a previous blog, best animation performance in Silverlight 4 is obtained from the combination of procedural animation and a pixel shader. A pixel shader is meant to be used for the manipulation of colors and transparency, resulting also in lighting and structural effects that can be expressed pixel wise.

A pixel shader is not really meant to be used for spatial manipulation – spatial information is hard to come by, but see e.g. this discussion on the ShaderEffect.DdxUvDdyUvRegisterIndex property. Vertex shaders and geometry shaders exist to filter, scale, translate and rotate vertices and topological primitives respectively. Another limitation of the use of shaders in Silverlight 4 is that only model 2 pixel shaders can be used, so that only 64 arithmetic slots are available. Finally, you can apply only one pixel shader to a UIElement.

In order to explore the limitations on spatial manipulation and the number of arithmetic slots, I decided to build a Silverlight 4 application that does some spatial manipulation of a playing video by means of procedural animation and a pixel shader. Well, this is also my first encounter with pixel shaders, so I gathered that a real challenge would show me many sides of pixel shaders (and indeed, no disappointments here).

The manipulation consists of two steps. First the video surface is reduced in size in order to create some space for the second step. In the second step, the video is divided up in a (large) number of rectangles which drift apart – disperse, while reducing in size. The second step is animated.

To be frank, I’m not very excited about the results so far. It definitely needs improvement, but I need some time and a well defined starting point for further elaboration.

The completed application

You can explore the completed application, called ‘Dispersion Demo’ in my App Shop. The application is controlled by 4 parameters:

  1. Initial Reduction. This parameter reduces the video surface. The higher the value, the greater the reduction.
  2. Dispersion. The Dispersion parameter controls the amount of dispersion of the blocks. If you set both Initial Reduction and Dispersion to 1, you can see the untouched video.
  3. Resolution. The Resolution is a measure for the number of blocks along both the X-axis, and the Y-axis.
  4. Duration. Controls the duration of the dispersion animation.

The application also has three buttons to Start, Stop, and Pause the animation of the Dispersion. The idea is that you first select an initial Reduction and a Resolution and then click Start to animate the Dispersion.

The video used in the application is a fragment of “Big Buck Bunny”, an open source animated video, made using only open source software tools (and an astonishing amount of talent).

Performance statistics

The use of some procedural animation and the pixel shader results on my pc in a frame rate of about 198 FPS, and a footprint of about 2Mb video memory. This is good (though ~400FPS would be exciting), and leaves plenty of time for other manipulations of the video image.

The pixel shader

The pixel shader was created using Shazzam, an absolutely fabulous tool, and perfect for the job. The pixel shader shown below centers the image or video snapshot and reduces it. Then it enlarges it again according to the Dispersion factor. The farther away from the center, the larger the gaps are between the blocks displaying video, and the smaller the blocks themselves are.

sampler2D input : register(s0);

// new HLSL shader

/// <summary>Reduces image to starting point.</summary>
/// <minValue>1.0</minValue>
/// <maxValue>20.0</maxValue>
/// <defaultValue>10.0</defaultValue>
float Scale : register(C0);

/// <summary>Expands / enlarges the image.</summary>
/// <minValue>1</minValue>
/// <maxValue>20</maxValue>
/// <defaultValue>6</defaultValue>
float Dispersion : register(C1); //Dispersion should never get bigger than Scale!!!

/// <summary>Measure for number of blocks</summary>
/// <minValue>100</minValue>
/// <maxValue>800</maxValue>
/// <defaultValue>200</defaultValue>
float Resolution : register(C2);

float4 main(float2 uv : TEXCOORD) : COLOR
{
       float ratio = Dispersion / Scale;
       float offset = (1 - ratio) / 2;
       float cutoff = 1 - offset;
       float center = 0.5;
       float4 background = {0.937f, 0.937f, 0.937f, 1.0f}; // Same as demo program's Grid

       float2 dispersed = (uv - offset)  / ratio;
       float2 limit = 4 * (abs(uv - center) * (Dispersion - 1) / (Scale - 1)) - 1;

       if(uv.x >= offset && uv.x <= cutoff && uv.y >= offset && uv.y <= cutoff)
       {
             if (sin(dispersed.x * Resolution) > limit.x &&
                 sin(dispersed.y * Resolution) > limit.y)
             {
                    return tex2D(input, dispersed);
             }
             else return background;
       }
       else return background;
}

In the code above, the outer “if” does the scaling and centering. The sine function and the limit variable take care of the gaps and block size. See the graph below: the limit rises to cut off increasingly many values of the sine, thus creating smaller blocks and larger gaps.
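The interplay of the sine and the limit can be explored outside the shader. The Python sketch below mirrors the shader arithmetic one axis at a time; the names `gap_limit` and `dispersion_source_uv` are hypothetical, the defaults come from the Shazzam comments above, and the sketch assumes Scale is larger than 1 (at exactly 1 the limit expression divides by zero).

```python
import math

def gap_limit(c, scale=10.0, dispersion=6.0):
    """Threshold the sine must exceed at coordinate c; -1 at the center."""
    return 4 * (abs(c - 0.5) * (dispersion - 1) / (scale - 1)) - 1

def dispersion_source_uv(u, v, scale=10.0, dispersion=6.0, resolution=200.0):
    """CPU model of the Dispersion shader (hypothetical helper).

    Returns the sampled source coordinate, or None for the background.
    """
    ratio = dispersion / scale
    offset = (1 - ratio) / 2
    cutoff = 1 - offset

    # Outer test: scaling and centering
    if not (offset <= u <= cutoff and offset <= v <= cutoff):
        return None

    source = []
    for c in (u, v):
        dispersed = (c - offset) / ratio
        # Inner test: the sine must rise above the limit, else this is a gap
        if math.sin(dispersed * resolution) <= gap_limit(c, scale, dispersion):
            return None
        source.append(dispersed)
    return tuple(source)
```

At the center the limit is -1, so nearly every texel samples the video. Toward the edges the limit rises (about -0.33 at coordinate 0.8 with the defaults), so a larger part of each sine period is cut off and the gaps widen.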

Evaluation of limitations

Although this is a solution for the challenge set, it is not the solution. This solution uses scaling for dispersion, but what you really would like to have is a translation per block of pixels that disperse. Needing this reveals another limitation: in Silverlight 4 you cannot provide your pixel shader with an array of structs that define your blocks layout, hence it is very hard to identify the block a certain pixel belongs to, and thus what its translation vector is.

At this point I had the idea to code the block layout into a texture which is made available to the pixel shader (you can have up to three extra textures (PNG images) in your pixel shader). To create such a lookup map you have to create a writeable bitmap with the values you want, then convert this to a PNG image as Andy Beaulieu shows, using Joe Stegman’s PNG encoder. The hard part will be the encoding (and decoding in the shader). You have at your disposal 4 unsigned bytes, and you will have to encode 2D translation vectors that also consist of negative numbers. The encoding / decoding will involve a signed – unsigned conversion, and the encoding of shorts (16 bit integers) as 2 bytes. This will, of course, be a feasible but tedious chore.
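One possible shape of that encoding is sketched below in Python. It is an assumption, not something I have built: it uses an offset of 32768 for the signed to unsigned conversion and stores the high byte first, and the names `encode_vector` and `decode_vector` are made up. In the shader the four channels would arrive as floats in [0, 1], so decoding there would first have to scale each channel back to 0..255.

```python
def encode_vector(dx, dy):
    """Pack a 2D translation vector of signed 16 bit integers
    into 4 unsigned bytes (e.g. the R, G, B, A channels of one texel)."""
    for value in (dx, dy):
        if not -32768 <= value <= 32767:
            raise ValueError("component does not fit in 16 bits: %r" % value)
    ux = dx + 32768                    # signed -> unsigned conversion
    uy = dy + 32768
    return (ux >> 8, ux & 0xFF, uy >> 8, uy & 0xFF)

def decode_vector(r, g, b, a):
    """Inverse of encode_vector: rebuild the signed components."""
    dx = ((r << 8) | g) - 32768
    dy = ((b << 8) | a) - 32768
    return (dx, dy)
```

The zero vector becomes (128, 0, 128, 0), and any vector within the 16 bit range survives a round trip through encode and decode.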

A better move would be to invest some effort in getting acquainted with Silverlight 5 Beta’s integration with XNA, by which also vertex shaders come available.

As far as the limit of 64 arithmetic slots is concerned, it seems that one absolutely must hit it. But then, it always seems possible to rethink and simplify your code, thus reducing its size. This limitation turned out to be not very prohibitive.

Procedural animation

The above pixel shader is integrated in the Silverlight application using the C# code generated by Shazzam. Then when implementing procedural animation, another limitation came up concerning manipulating videos.

The initial plan was to display the video using a collection of WriteableBitmaps that could be manipulated using the high performance extensions from the WriteableBitmapEx library. However, it turns out that to take a snapshot from a MediaElement, you have to create a new WriteableBitmap per snapshot. This is expensive since we now have to create at least about 30 WriteableBitmaps per second. Indeed, running the Dispersion application (at 198 FPS) sets the CPU load to about 50% on my pc. This is not really what we are looking for.

A way out is to implement the abstract MediaStreamSource class, both for the vision part as well as for the sound part of the video to be exposed. By the looks of it, this seems to be quite a challenge. However, Pete Brown has provided examples for both video and sound. And there is also the CodePlex project that provides the ManagedMediaHelpers. So, we add this challenge to the list.

Another experiment might be to see if the XNA sound and video facilities can be accessed when exploiting the Silverlight – XNA integration in Silverlight 5.

Conclusions

The challenge set turned out to be very instructive. I’ve learned a lot about writing pixel shaders, though there will be much more to learn. No doubt about that.

It is absolutely true that spatial manipulation is hard in pixel shaders :-), the main cause being the extraordinary effort required to provide the shader with sufficient data, or sufficiently elaborate data. The current shader needs improvement; the next challenge in writing shaders will address Silverlight 5 (beta) and XNA.

I noticed that you can achieve a lot in pixel shaders using trigonometry, and (no doubt) the math of signals.

Finally, in order to really do procedural animation on surfaces playing parts of a video, you seem to need to either implement MediaStreamSource or use the XNA media facilities.

Silverlight massive animation performance

As it turns out, Storyboard animations in Silverlight have limited performance capability. Presumably this system has been designed for ease of use and developer / designer productivity. If you want to create massive amounts of animations, like for instance in particle systems, you soon hit the performance limits of the rendering, graphics, animation subsystem.

Hey, that’s interesting!

Of course, now we want to know what the performance limits are, and how we can get around them. When I first hit the aforementioned performance limits, I had no clue as to how to improve performance. This article discusses some articles I found on the World Wide Web concerning the subject. Great stuff. Some solutions found are about 20 times faster than others, and current developments in Silverlight 5 seem to promise to take it a step further.

Growing trees

The first article encountered was How I let the trees grow by Peter Kuhn. He describes how he ran into performance problems creating a tree that grows by splitting branches into smaller branches, terminating in leaves. At some point he finds his software trying to render over 20k paths, which is ‘massive’ enough to create performance problems. The solution is found in the use of the WriteableBitmapEx CodePlex project. The WriteableBitmapEx contains (among others) a fast Blit operation for copy operations (claimed to be 20-30 times faster than the standard Silverlight operation – I believe it). You can draw on Bitmaps that are not shown yet, thus prepare images for the screen, and then quickly shove them into vision when ready. The (in browser – IE9) solution presented performs well.

What we do not get from this article are clear figures about standard Silverlight performance and improved performance. So let’s discuss another article.

Procedural animations

The WriteableBitmapEx CodePlex project contains a reference to Advanced Animation: Animating 15,000 Visuals in Silverlight by Eric Klimczak. He tells us that if we want to animate ~50 objects concurrently, we need additional performance measures over Storyboards and Timelines. The main performance measures he employs extend the ones mentioned above with: “procedural animations”.

In Procedural Animation within the context of Silverlight you code an Update() and Draw() loop that is driven by the System.Windows.Media.CompositionTarget.Rendering event. Essentially, you now code the new position, color, or any other attribute, in the Update() method, and Blit it to the render target in the Draw() method – thus putting it on screen.

This works very well! Eric Klimczak has provided source code with his article, among which a program that animates moving particles that respond quickly to mouse actions (in browser).
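The Update() / Draw() structure described above can be sketched in a few lines. The Python below is an illustration of the pattern only, not Eric Klimczak's code: the `Particle` class, the bouncing rule and the plain byte buffer standing in for a WriteableBitmap are all assumptions, and `on_rendering` stands in for the handler of the per-frame Rendering event.

```python
from dataclasses import dataclass

@dataclass
class Particle:
    x: float
    y: float
    vx: float
    vy: float

def update(particles, dt, width, height):
    """Update step: compute the new position of every particle."""
    for p in particles:
        p.x += p.vx * dt
        p.y += p.vy * dt
        # Bounce off the edges and keep the particle on the surface
        if p.x < 0 or p.x >= width:
            p.vx = -p.vx
            p.x = min(max(p.x, 0), width - 1)
        if p.y < 0 or p.y >= height:
            p.vy = -p.vy
            p.y = min(max(p.y, 0), height - 1)

def draw(particles, width, height):
    """Draw step: write every particle into a fresh pixel buffer."""
    buffer = bytearray(width * height)
    for p in particles:
        buffer[int(p.y) * width + int(p.x)] = 255
    return buffer

def on_rendering(particles, dt, width, height):
    """Called once per frame: first update, then draw."""
    update(particles, dt, width, height)
    return draw(particles, width, height)
```

The point of the pattern is that no per-object Storyboard or Timeline exists; all state lives in plain data and one handler per frame does the work.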

For 3000 particles the program renders at ~200 frames per second (FPS), tops, and 15.000(!) particles are rendered at a still pleasant 36 – 46 FPS. I’ve used the Silverlight FPS counter for all Silverlight programs in this article in order to get comparable measurements. See the fps counter in the status bar of the IE screenshot below.
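A quick way to compare the two measurements is to convert them to particle updates per second. The Python sketch below only multiplies the figures quoted above; the function name is made up for illustration.

```python
def particle_updates_per_second(particles, fps):
    """Number of particle positions computed and drawn per second."""
    return particles * fps

print(particle_updates_per_second(3000, 200))    # 600000
print(particle_updates_per_second(15000, 36))    # 540000
print(particle_updates_per_second(15000, 46))    # 690000
```

Both cases land in the same range of roughly 540,000 to 690,000 updates per second, which suggests (but does not prove) that the per-particle work on the CPU is what limits the frame rate.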

Curiously, there is no maxFrameRate setting in his code. About this maxFrameRate setting the Silverlight documentation writes: “A value of approximately 60 frames-per-second should be reliable on all platforms. The 1000 and 30000 frames-per-second range is where the maximum frame rate could differ between platforms”. So, the obvious step is to set the maxFrameRate to 1000 – both in code as well as in the html host, which showed a factor 3 performance increase compared to the original article software, for the 3K particle case (screenshot above). The Silverlight Documentation also states that the enableGPUAcceleration setting doesn’t work for the WriteableBitmap, so I skipped that one.

It seems to me that this approach solves most problems. However, procedural animation – a gaming software approach – opens the door to other, even more apt approaches.

Note that this approach does not employ the GPU. All rendering is done using the CPU.

Pixel shaders

An approach that takes performance a step further is Silverlight 3 WriteableBitmap Performance Follow-Up by René Schulte. In this article a number of approaches are compared. All approaches yield comparable results, except the pixel shader approach, which yields a factor ~20 better performance compared to the WriteableBitmap (WOW!).

How does it work? UIElement descendants have an Effect property. You can create custom Effects using a pixel shader written in HLSL (a .fx file) which you compile using e.g. fxc.exe – the DirectX HLSL compiler, or Shazzam. The compiled shader effect is loaded as a resource by a descendant of the ShaderEffect class. The article by René Schulte uses a custom derived class thereby showing how to transfer data into the shader during program execution. The loaded shader should be attached to the UIElement’s Effect Property. The shader will be executed for each pixel to be rendered. This gives you great control over the UIElement. You can modify many attributes of each pixel, for instance color and opacity. Do not forget that dropshadows are implemented as shaders, so you can also duplicate the UIElement visual.

According to René Schulte, the program / pixel shader is not executed on the GPU. That may have been true for Silverlight 3, but in Silverlight 4 it is absolutely possible to put the GPU to work. So, with a bit of tweaking the code here and there we find a maximum performance of >450 FPS.

I’ve registered the GPU invocation for specific tasks using the Catalyst utility of my graphics card, see the fields ‘GPU Clock’ and ‘Memory Clock’ at the bottom of the screen shot below. Regular values are 157 and 300 respectively.

Silverlight 5 Beta and XNA

Recently (April 13th 2011), Silverlight 5 Beta was released. It includes the DrawingSurface control which is a gateway into XNA functionality. A little experimenting reveals that like XNA the default drawing frequency is at 60 FPS, and you can’t seem to get it up by recurring calls to the OnDraw() event handler.

In Silverlight the frequency is raised as described above. In XNA the default of 60 FPS can be lifted by setting both the Game object’s ‘IsFixedTimeStep’ property and the GraphicsDeviceManager’s ‘SynchronizeWithVerticalRetrace’ property to false.

From the MIX demo video it is clear that performance is very good, albeit at 60 FPS within Silverlight. The performance step is ‘made’ by the shaders and realized on the GPU. It is currently not clear to me how to measure that performance, so this exercise ends here for now.

Non Silverlight performance

What performance can we expect? Is Silverlight slow, despite the extra tricks? What is the promise hidden in Silverlight 5? We now know that for demanding graphics we can turn to the integration of Silverlight with XNA. XNA, in turn is built upon DirectX.

Below you’ll find a screenshot of a DirectX11 particle demo. For 16K particles (reminiscent of the 15K in the above particle demo) we see a performance of ~620 FPS (not measured with the same frame counter as with the other programs, however), immediately requiring maximum performance from the GPU. For 8K particles performance rises to ~1175 FPS.

One conclusion I would like to draw here is that this performance is of the same order as the performance of the pixel shader used as a custom Effect. So, we may conclude that the real performance enhancement lies with the use of shaders.

Will this performance hold up in XNA? Yes, a particle simulation in XNA (from the XNA community, with a small adaptation) brings us a ~1000 FPS performance, see screenshot.

Conclusion

The above is an exploration of techniques and approaches to realize massive animation performance in Silverlight. It is not a methodical, comparative study. A more rigorous investigation into performance (of what exactly?) might be the subject of a later article that builds on the findings presented here.

Here we have learnt that in order to have massive animation in Silverlight we use the WriteableBitmap, the Blit operation from the WriteableBitmapEx Codeplex project, Procedural Animation programming, and pixel shaders (do not forget the enableGPUAcceleration setting, when applicable). We have seen that the exposure of XNA, built on DirectX, in Silverlight will most likely bring us further performance improvements.

Today we can have a very powerful massive animation performance of around 400-500 FPS, and the future is bright.

Rebuilding the App Shop

During the past couple of weeks I’ve rebuilt my App Shop, the portal that backs up posts in this blog with demo applications, where I keep solutions to problems I’ve built for the community, and other items of my portfolio. The App Shop is now a pure Silverlight application; you can access it only if you have Silverlight 4+ installed on your computer. This blog and the Silverlight community pages are the broadly accessible entrance to my work and portfolio.

So, anything worth mentioning about this rebuilt portal? Well, I like to think so. If one builds a portal like this, there are a number of requirements that need careful, integrated implementation. These requirements, concerning Compositionality, Navigation, MVVM implementation, and Blendability, will be discussed below.

Requirements

Compositionality

Reasons why you want your Silverlight application to be composed of small components that are downloaded selectively, only if they are required (and cached after that), are abundant. The web server that hosts the App Shop, for instance, is located at my home. Measurements show that the upload speed from my home to an Internet backbone is 0.8 Mbit/s. So if you request the portal to show an App, that App just has to be small, or streamed, in order to provide an acceptable download experience.
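To put a number on that: 0.8 Mbit/s is about 100 kilobytes per second, so every megabyte an App weighs costs the visitor roughly ten seconds. A small helper makes the arithmetic explicit; the class and method names are made up, and the estimate ignores protocol overhead, latency and other traffic on the line.

```csharp
using System;

public static class DownloadEstimate
{
    // Seconds needed to send sizeInBytes over a link of linkBitsPerSecond.
    // Idealized: no protocol overhead, no latency, no competing traffic.
    public static double Seconds(long sizeInBytes, double linkBitsPerSecond)
    {
        if (linkBitsPerSecond <= 0)
            throw new ArgumentOutOfRangeException("linkBitsPerSecond");

        return sizeInBytes * 8.0 / linkBitsPerSecond;
    }
}
```

For example, `DownloadEstimate.Seconds(1000000, 800000)` returns 10, and a 250,000 byte xap file comes out at 2.5 seconds.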

Despite the slow upload connection to the World Wide Web, visitors of the portal should have a first view of the portal quickly. So, the portal has to start from a small initial component.

The App Shop also has to be easily extensible, where easy also means ‘without recompilation’. Since this is an App Shop, you might expect that Apps will be added to it on a regular basis. This whole portal is about the Apps, and about the portal itself only in as far as it is going to be a showcase of a portal that I could build, so it should not take much effort to add Apps to it.

A final compositionality requirement is that shared assemblies, such as the Silverlight System assemblies, should be downloaded only once, then reused by further downloaded components.

Navigation

Navigation requires a master page and components one can load into the uniform environment the Master provides. In an extensible portal, you also want the navigation functionality to be extensible without compilation.

Blendability

Blendability means Editable in MS Expression Blend. This is a requirement I’ve learned from Laurent Bugnion. It was a major design theme in his MVVM-Light toolkit, and he is right. Blend is a fine design environment for Silverlight applications. It definitely allows you to scale up your design results. That is: while working in Blend, you generate an enormous and amorphous amount of XAML, so much one might wonder if it is feasible to do this all by hand, or in Visual Studio. Examination of the XAML code will show you in such cases that it is not obviously or overly bloated – some even call it efficient code.

So, being able to use Blend allows you to create designs at least one level higher up than if the design of your software system precluded its use.

Implements MVVM

One way to preclude the use of Blend is to implement MVVM (Model View ViewModel) in a clumsy way. Connecting a ViewModel to a View as a resource that acts as the DataContext is one right way.

To be honest, I’ve had my doubts about this requirement. Implementing MVVM adds much extra complexity, just to be able to test the code automatically. Nevertheless, it’s good to have this technique in your developer’s toolbox. So, that’s why we have MVVM in this not very complex application.

Realization

MEF

Or, the Managed Extensibility Framework. This is really cool stuff. MEF is a very versatile compositionality framework. It works by attributing types as exports, which are subsequently discovered by MEF. Instantiations of exported types are inserted at variable declarations that are attributed as imports. For a really good introduction to MEF for Silverlight, see the Hello MEF blog post series by Glenn Block.

Notably, a component that imports a class doesn’t need to know anything about the imported class. The import is guaranteed to implement a contract (interface) as specified by the importing class, or MEF will not insert it as an import. Components that are attributed as Exports can depend on Imports themselves. Developing software using MEF is a bit like Primordial Soup programming. You specify the interfaces or dependencies between the components and objects without wondering for a single moment how the required objects get at the desired locations.
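A minimal sketch of the export / import mechanism described above could look like this. The names IApp, ParticleDemoApp, Master and Composer are made up for illustration and are not taken from the App Shop source; the sketch uses a plain CompositionContainer over the current assembly, whereas a Silverlight application would more typically call CompositionInitializer.SatisfyImports.

```csharp
using System.ComponentModel.Composition;
using System.ComponentModel.Composition.Hosting;

// The contract: all the importer knows about whatever gets plugged in.
public interface IApp
{
    string Title { get; }
}

// A part that offers itself to MEF under the IApp contract.
[Export(typeof(IApp))]
public class ParticleDemoApp : IApp
{
    public string Title
    {
        get { return "Particle demo"; }
    }
}

// A component that needs an IApp but never names a concrete class.
public class Master
{
    [Import]
    public IApp CurrentApp { get; set; }
}

public static class Composer
{
    // Discovers the exports in this assembly and fills in Master's import.
    public static Master Compose()
    {
        var catalog = new AssemblyCatalog(typeof(Composer).Assembly);
        // The container is left undisposed so the composed parts stay alive.
        var container = new CompositionContainer(catalog);

        var master = new Master();
        container.ComposeParts(master);
        return master;
    }
}
```

After `Composer.Compose()`, the CurrentApp property holds a ParticleDemoApp instance, although Master contains no reference to that class. Note that a single-valued import like this one requires exactly one matching export; with several Apps exported under the same contract, an ImportMany on a collection would be the usual choice.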

Given the way MEF works, it can be applied in IoC / DI (Inversion of Control / Dependency Injection) scenarios, for pure compositionality of course, for navigation in Silverlight, and for the implementation of MVVM. This last application of MEF relieves one of the task of manually maintaining the relation between a View and a ViewModel, while maintaining Blendability.

MVVM

As indicated above, MVVM is a very important software development pattern. Its main advantage is that it allows for automated unit testing, hence test driven development. A developer should have mastered this technique. Implementing MVVM is usually supported by a toolkit. The MVVM-Light toolkit by Laurent Bugnion is at the time of writing the dominant MVVM toolkit.

The MVVM-Light Toolkit is popular and respectable, but after having done some trials with it, I didn’t like it after all. The “why” lies mainly with the global ViewModel locator, an implementation of the Service Locator pattern. It is global and, as noticed by others, introduces a separate source of maintenance effort. John Papa and Glenn Block have provided an alternative based on MEF, which I have used. The MVVM-Light toolkit does, however, provide a facility to handle events generated by the View, in the ViewModel. This facility is the RelayCommand, an addition to the MVVM-Light Toolkit based on work by Josh Smith.

Ok, only ButtonBase descendants have commands. For events from other controls you can use the EventTrigger from the Interactivity library, e.g. for the Loaded event of the UserControl, like so:

<i:Interaction.Triggers>
    <!-- Runs the ViewModel's NavigateTo command when the UserControl raises Loaded -->
    <i:EventTrigger EventName="Loaded">
        <i:InvokeCommandAction Command="{Binding NavigateTo}" CommandParameter="NextPage"/>
    </i:EventTrigger>
</i:Interaction.Triggers>

The NavigateTo command referenced in the code above is implemented as a RelayCommand in the ViewModel. The road from the ViewModel to the View is always by DataBinding.
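On the ViewModel side this could look roughly as follows. To keep the sketch self-contained it uses a stripped-down stand-in, SimpleRelayCommand, rather than the toolkit's own RelayCommand; the ViewModel name and the LastTarget property are also made up for illustration.

```csharp
using System;
using System.Windows.Input;

// Simplified stand-in for a relay command: it forwards Execute to a delegate.
// It is always executable and therefore never raises CanExecuteChanged.
public sealed class SimpleRelayCommand : ICommand
{
    private readonly Action<object> execute;

    public SimpleRelayCommand(Action<object> execute)
    {
        if (execute == null) throw new ArgumentNullException("execute");
        this.execute = execute;
    }

    public event EventHandler CanExecuteChanged
    {
        add { }
        remove { }
    }

    public bool CanExecute(object parameter)
    {
        return true;
    }

    public void Execute(object parameter)
    {
        execute(parameter);
    }
}

// Hypothetical ViewModel exposing the command the View binds to.
public class CatalogueViewModel
{
    public CatalogueViewModel()
    {
        // The CommandParameter from the XAML ("NextPage") arrives here as an object.
        NavigateTo = new SimpleRelayCommand(target => LastTarget = target as string);
    }

    public ICommand NavigateTo { get; private set; }

    // Records the requested page; a real ViewModel would trigger navigation here.
    public string LastTarget { get; private set; }
}
```

Because the View only knows the command by its binding path, the ViewModel can be exercised in a unit test by calling `NavigateTo.Execute("NextPage")` directly, without any View at all.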

Blendability

Attaching a ViewModel to a View as a resource preserves Blendability: the possibility to edit the View in Expression Blend, and finding in Blend that the ViewModel is attached as the DataContext and is accessible as such. When using a MEF ViewModel locator following the article by Papa and Block, Blendability is also preserved. Moreover, one of the goodies of using MEF is that the style defined for the App Shop automatically permeates through to the Catalogue and its items, to the Master window, and to all the Apps loaded into the Master. Of course, locally specified styles are preserved.

Architecture of the App Shop

So, now that we know about the requirements and the technology used, what does the structure of the App Shop look like?

The portal consists of a main window that holds the logo, which acts as the Home button, and some small buttons, collectively referred to as the “small prints” at the bottom that concern legal aspects, contact information, and maintenance of visitor profiles (to be implemented 🙂 ).

This main window holds at any time one of two components:

– The App Shop Catalogue which presents the available Apps

– The App Master Window that provides a uniform context for all Apps in the shop.

Graphically:

Typically the user selects an App from the catalogue, by clicking on its icon, and the portal navigates to the Master window which loads the selected App. Information concerning the selected App is stored in the application level resource dictionary during transition, which (indeed) is abused as a kind of session variable, much like the Session in ASP.Net.

The Catalogue consists of a number of buttons, each holding an image and a caption identifying the App it represents. The images are loaded from a directory at the web server, the information about the Apps is loaded from an XML file, also at the web server.

To deploy an App, it suffices to add a catalog record to the XML file, to add an image to the image directory, and to add the App to the App directory. No recompilation of the portal is required. The App itself needs to have one or more MEF “Export” attributes added, so it does have to be rebuilt.

When an App is selected from the catalogue, the Catalogue is unloaded, the Main window loads the Master, and the Master loads the App. The App is retrieved from the web server at that point, and not before. Apps that have been retrieved once are cached by the web browser, as are shared libraries. When preparing an App for use in the portal, care is taken not to copy libraries into the xap file that are already present at the client.
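Reading such a catalog file could be sketched with LINQ to XML as below. The element and attribute names (App, Title, Image, Xap) are assumptions made for this example; the actual schema of the App Shop's XML file is not shown in this post.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// One record of the (hypothetical) catalog file.
public class CatalogEntry
{
    public string Title { get; set; }
    public string Image { get; set; }
    public string Xap { get; set; }
}

public static class CatalogReader
{
    // Turns the downloaded XML text into catalog entries, one per <App> element.
    // Missing attributes simply yield null properties.
    public static List<CatalogEntry> Parse(string xml)
    {
        return XDocument.Parse(xml)
            .Descendants("App")
            .Select(app => new CatalogEntry
            {
                Title = (string)app.Attribute("Title"),
                Image = (string)app.Attribute("Image"),
                Xap = (string)app.Attribute("Xap")
            })
            .ToList();
    }
}
```

With a file shaped like `<Catalog><App Title="Particles" Image="particles.png" Xap="Particles.xap"/></Catalog>`, adding an App then amounts to adding one more App element, and the portal picks it up the next time it downloads the file.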

Quirks

The Binding System

The blog post by Papa and Block mentions a bug in the binding system and provides a workaround. The bug has not been resolved yet at the time of writing this article. The description of the bug is incorrect, by the way. The text states that DataBinding will occur at most 6 times. In reality, DataBinding will occur exactly 6 times for each binding (where 1 time would suffice).

Deployment

Although Apps can be added to the App Shop without rebuilding and redeploying the entire portal, I find myself doing just that nevertheless. The reason is that I like to keep the portal and Apps source code together in a single solution.

And Now…

Well, now that it is easy to maintain and extend the portal, it is time to add Apps, of course, to load the portal into Expression Blend and give it a much more sophisticated design and User Experience, and to add some more technical features. To be continued!