programming

Qt and OpenGL 3.3

Submitted by mimec on 2012-03-31

Some time ago I stumbled upon a great e-book on OpenGL programming: Learning Modern 3D Graphics Programming. The best thing about it is that it teaches the modern approach to graphics programming, based on OpenGL 3.3 with programmable shaders, and not the "fixed pipeline" known from OpenGL 1.x which is now considered obsolete. I already know a lot about vectors, matrices and all the basics, and I have some general idea about how shaders work, but this book describes everything in a very organized fashion and it allows me to broaden my knowledge.

When I first learned OpenGL over 10 years ago, it was all about a bunch of glBegin/glVertex/glEnd calls and that's how Grape3D, my first 3D graphics program, actually worked. Fraqtive, which also has a 3D mode, used the incredibly advanced technique of glVertexPointer and glDrawElements, which dates back to OpenGL 1.1.

A lot has changed since then. OpenGL 2.0 introduced shaders, but they were still closely tied to the fixed pipeline state objects, such as materials and lights. The idea was that shaders could be used when supported to improve graphical effects, for example by using per-pixel Phong lighting instead of Gouraud lighting provided by the fixed pipeline. Since many graphics cards didn't support shaders at that time, OpenGL would gracefully fall back to the fixed pipeline functionality, and everything would still be rendered correctly.

Nowadays all decent graphics cards support shaders, so in OpenGL 3.x the entire fixed pipeline became obsolete and using shaders is the only "right" way to go. There is even a special mode called the "Core profile" which enforces this by disabling all the old style API. This means that without a proper graphics chipset the program will simply no longer work. I don't consider this a big issue. All modern games require a chipset compatible with DirectX 10, so why should a program dedicated to rendering 3D graphics be any different? Functionally OpenGL 3.3 is more or less the equivalent of DirectX 10, so it seems like a reasonable choice.

I was happy to learn that Qt supports the Core profile, only to discover that it's not actually working because of an unresolved bug. Besides, the article mentions that "some drivers may incur a small performance penalty when using the Core profile". This sounds like a major WTF to me, because the whole idea of the Core profile was to simplify and optimize things, right? Anyway I decided to use OpenGL 3.3 without enforcing the Core profile for now, but to try to implement everything as if I was using that profile.

Another problem that I faced is that my laptop is three years old, and even though its graphics chipset is pretty good for that time (NVIDIA Quadro NVS 140M), I discovered that the OpenGL version was only 2.1. I couldn't find any newer drivers from Lenovo, so I installed the latest generic drivers from NVIDIA and now I have OpenGL 3.3. Yay! So I modified my Descend prototype to use shaders 3.30 and QGLBuffer objects (which are wrappers for Vertex Buffer Objects and Index Buffer Objects), but I will write more about it in the next post.

Adaptive Tessellation

Submitted by mimec on 2012-03-12

The Descend project is now officially reactivated and yesterday I committed the current version of the prototype into SVN repository. The UI is very basic, but it does its job of drawing a 3D surface based on mathematical equations. So far it consist of three important parts:

  • A compiler and interpreter of Misc, a programming language designed for calculating geometry with high performance. You can read more about it in the previous post and I will keep returning to it as it's quite an interesting subject.
  • An adaptive tessellator which is described below in more details.
  • A vertex and pixel shader that perform per-pixel Phong shading, which looks nice on the Barbie-pink surface :).

A parametric surface is described by a function V(p, q), where V is the vector describing the location of a point in 3D space and p and q are parameters (usually in [0, 1] or some other range). Obviously the surface consists of an infinite number of points, so we must calculate a number of samples and join the resulting points into triangles. This process is called tessellation. If the triangles are small enough, the Phong shading will create an illusion that the surface is smooth and curved.

The only difficulty is to determine what does "small enough" mean. Some surfaces are flat or almost flat and need just a few triangles to look good. Other surfaces are very curved and require thousands of triangles. In practice most surfaces are flatter is some areas and more curved in other areas. Take a sphere for example. It's curvature is the same everywhere, but we must remember that our samples are not distributed uniformly on its surface. Imagine a globe: meridians are located much closer to each other near the poles than near the equator. So in practice the distance between two samples located near the equator is greater and the surface needs to be divided into more triangles. This way, the size of all triangles will be more or less the same. Without adaptive tessellation, triangles would be closely packed near the pole and very large near the equator.

The tessellation algorithm works by first calculating four points at the corners of the surface. Wait, where does a sphere have corners? Just unwrap it mentally into a rectangular map, transforming it from (x, y, z) space into the (p, q) space. This gives us a square divided diagonally into two triangles. Then we calculate a point in the middle of the diagonal and divide each triangle into two smaller triangles. This process can be repeated recursively until the desired quality is reached.

How to measure the quality? The simplest method is to calculate the distance between the "new" point and the line that we are attempting to divide. The greater the distance, relatively to the length of the line, the more curved the surface. If this distance is smaller than some threshold value, we simply assume that the point lays on the line and discard it. The smaller the threshold, the more accurate the tessellation and the more triangles we get.

Unfortunately there are situations when this gives wrong results. If the curvature of the surface between two points resembles a sinusoid, then the third point in between appears to be located very near the line drawn between those two points. The tessellation algorithm will assume that the surface is not curved in this area. This produces very ugly artifacts.

So I came up with a method which produces much more accurate results. In order to render the surface with lighting, we need to calculate normal vectors at every point. For the Phong shading to look nice, those normals must be calculated very accurately. So two more points are calculated at a very small distance from the original one and the resulting triangle is used to calculate the normal. Note that the angle between normals is a very accurate measure of the curvature. An algorithm which compares the angle between the normals of two endpoints and the normal of the "new" point with a threshold angle can handle situations like the above much better. It's also more computationally expensive, because we must calculate three samples before we can decide if the point is rejected or not.

Of course this method can also be fooled in some specific cases, but in combination with the first one it works accurately in most cases. Experimentation shows that the threshold angle of 5° gives excellent results for every reasonable surface I was able to come up with.

In practice we also have to introduce the minimum and maximum number of divisions. Up to a certain point we simply keep dividing the grid into smaller triangles without even measuring the curvature, because otherwise the results would be very inaccurate. And since the curvature may be infinite in some points, we also must have some upper limit.

Final notes: Adaptive tessellation of parametric surfaces is the subject of many PhD dissertations and my algorithm is very simplistic, but it's just fast and accurate enough for the purposes of Descend. Also it should not be confused with adaptive tessellation of displacement mapping, which is a different concept.

Misc

Submitted by mimec on 2012-03-04

A long period since my last post indicates that a lot has happened. My son Adam was born on January 11th at about 3 a.m. He's changing a lot, already smiles, speaks various vowels and tries to catch objects and hold them in his hand. It's a great experience to be a father, although it requires a lot of patience. Well, it's never a bad time to learn something new, and I personally find it very inspiring :).

In the meantime I released an update for WebIssues, changed the layout of my web sites a bit and also made some improvements of Saladin (you can expect a new version soon). I also made a pretty good progress with my novel. I will write more about it shortly, but it's generally a story about hackers, magic and virtual worlds. The nature of my day job is rather destructive for my imagination (and for the amount of time I can spend on other things), but it turns out that I'm still able to use it and I have some interesting ideas from time to time.

I'm also very proud that after so many years I successfully revived the idea of Descend. In a nutshell, it's the successor of a program for drawing parametric surfaces which I wrote over 10 years ago. Last year I wrote an adaptive tessellator and I made a working prototype of the program using Qt's built-in JavaScript engine. It worked nicely, but it wasn't very fast. Qt uses the JavaScriptCore engine from WebKit, which is currently developed by Apple and powers the Safari browser. It is one of the fastest JS engines, with a register based vritual machine and built-in JIT compiler (you can find more interesting information about it in this post). However, JavaScript is an object oriented, dynamically typed language, best suited for creating complex AJAX-based applications. It's not optimized for number crunching.

So I designed my own language, which is statically typed and has only four types: boolean, float, 4D vector and 4x4 matrix. I called it Misc, although it doesn't have much to do with the old Misc engine that I wrote 7 years ago (which was dynamically typed and supported complex types like objects and arrays). I wrote a relatively simple compiler which produces bytecode and doesn't perform any fancy optimizations, and a stack based virtual machine with dedicated instructions for vector and matrix operations, trigonometric functions, etc. As soon as I got it working, I performed some benchmarks.

I created three versions of Descend: one based on JavaScriptCore, another one using the new Misc engine, and the third one with equations hard-coded in C++. All three versions were compiled using Visual C++ 2008 with full optimizations and link-time code generation. I measured the total time of generating a deformed sphere which consisted of 43,000 vertices and required calculating 135,000 samples (the tesselator needs 3 samples per vertex in order to accurately calculate normal vectors).

The results are very encouraging: the version using Misc is not only 5 times faster than the one based on JavaScriptCore, but also just 3.5 times slower than the one using native code! Obviously this doesn't mean that Misc code is almost as fast as native code, but the overhead added by the Misc interpreter is not very big, compared to the cost of the tessellator itself and the cost of calculating the trigonometric functions (which is the same for all three versions). In contrast, the overhead of JavaScriptCore is much more noticeable.

Surely there's still some room for improvement, but I doubt I would be able to squeeze more than a few CPU cycles without making the code significantly more complex. So I'd rather start working on the UI to make it possible to draw something more than a single surface. I will publish the first version once it's usable, which may take a few months depending on how much time I will be able to dedicate to it.

Status bar and elided label

Submitted by mimec on 2011-11-13

Many applications use a QStatusBar to display various information at the bottom of the window. The good thing about this widget is that it can contain various other widgets, not only labels, but also buttons, progress bars, etc. But in the most common case the status bar simply contains a few labels, and lays them out in a horizontal bar layout.

Everything is fine when the information displayed in the status bar is short and the window is large enough. However one thing that we should remember about the QLabel is that it can be expanded as necessary, but in the default configuration it cannot shrink — unless it has word wrapping or auto-scaling (in case of graphic content) enabled, its preferred size is also the minimum size. The result is that the window must be at least as wide as the total widths of all status labels, and the user cannot make it smaller. But in some cases the status text can be arbitrarily long — take a web browser which displays the URL of the hovered link in the status bar, and we cannot make any assumptions about the maximum length of the link.

Qt has a powerful mechanism of layouts which offers various solutions to this problem. One solution is to set the horizontal size policy of the label to Ignored. In that case both the preferred size and the minimum size is ignored. That's fine when the status bar contains only a single label. But in many cases, there are several additional labels on the right, that should only take as much space as they need, but should also shrink when there is not enough space. The solution is to keep the Preferred size policy, but also force the label to have zero minimum width by using setMinimumWidth. (Edited on 2012-03-23: actually it must be a small value greater than zero; setting minimum width to zero has no effect.)

There is still a minor issue — when the QLabel shrinks, it doesn't elide the text (by appending ... to indicate that the text was truncated), but simply clips it. The QLabel can be extended to support elided text quite easily, and there are many ways to do that. When I first did that, based on some code that I found, it worked nicely, until I placed a few such labels side by side in a status bar. I couldn't get the layout to behave correctly, and finally I found that this problem was related to the mechanism of eliding text.

So in case you need to put together an elided label on your own: do this by simply overriding the paintEvent, and do not try to mess with setText or resizeEvent, because that may break the automatically calculated preferred size of the label. Here's a code snippet:

void ElidedLabel::paintEvent( QPaintEvent* /*e*/ )
{
    QPainter painter( this );
    drawFrame( &painter );

    QRect cr = contentsRect();
    cr.adjust( margin(), margin(), -margin(), -margin() );

    QString fullText = text();

    if ( fullText != m_lastText || cr.width() != m_lastWidth ) {
        m_elidedText = fontMetrics().elidedText( fullText, Qt::ElideRight,
            cr.width() );
        m_lastText = fullText;
        m_lastWidth = cr.width();
    }

    QStyleOption opt;
    opt.initFrom( this );

    style()->drawItemText( &painter, cr, alignment(), opt.palette, isEnabled(),
        m_elidedText, foregroundRole() );
}

Note that this code caches the elided text for better performance, using the m_elidedText, m_lastText and m_lastWidth member variables. Also note that it always uses Qt::ElideRight — you can customize this if you need. Of course this will only work in case of a single-line, plain text with no special formatting.

Locking an SQLite database

Submitted by mimec on 2011-10-07

Today I will return to the topic of SQLite once again. This time I will discuss why and how to lock an SQLite database for exclusive access to make sure that it's not modified by other processes.

Of course one of the goals of SQLite, like most database engines, is concurrency - i.e. the ability for multiple processes to read and write data to the same database. As we can read in the documentation, "in order to maximize concurrency, SQLite works to minimize the amount of time that EXCLUSIVE locks are held".

That makes perfect sense if we use SQLite with a web application, which should handle as many concurrent requests as possible. But there are some scenarios when we might want to keep the database locked for exclusive access. For example, an SQLite database is often used as a cache for various data. If we run two instances of a program and both use the same cache, it may sometimes lead to strange results. Sometimes it's just easier to prevent such situation than to try to handle it. As you can guess, that's the situation I came across in the WebIssues client.

One way to achieve this is to simply use one large transaction throughout the lifetime of the connection. The documentation advises such solution when the database is used as the application file format. However, sometimes we want all modifications to be committed as soon as possible, for example to prevent losing data on a crash, while still keeping an exclusive lock as long as the connection is open.

Fortunately, SQLite has a useful pragma statement called locking_mode. If we set it to EXCLUSIVE, the obtained locks are never released until the connection is closed. However, the name of this setting is slightly misleading, because it doesn't really obtain an exclusive lock, it just ensures that the lock is never released. So perhaps "persistent" locking mode would be a better name.

In order to lock the database, we need to perform the following sequence of commands:

PRAGMA locking_mode = EXCLUSIVE
BEGIN EXCLUSIVE
COMMIT

The BEGIN EXCLUSIVE command actually locks the database. It begins an empty transaction, which doesn't change anything, but ensures that the exclusive lock is acquired on the database. If the database is already locked by another process, this command will fail with the SQLITE_BUSY error.

Note that the standard Qt driver for SQLite sets the busy timeout to 5 seconds by default, so the application will freeze for a while before reporting an error. We can change it using the QSQLITE_BUSY_TIMEOUT connection option, or when using a custom driver, by simply changing the parameter of sqlite3_busy_timeout to a smaller value.