programming

Serialization in Qt - part 2

In the previous post I wrote about support for different file formats in Qt and the pros and cons of using QDataStream and a binary format. As I promised, today I will provide some more code. I will also start discussing various issues related to backward and forward compatibility of data files.

For now I will focus on simple cases like storing application settings or simple data like a list of bookmarks. I'm assuming that all serialized data have value-type semantics; i.e. they are stored and copied by value, not by pointer. A lot of classes in Qt are value types, including strings and all kinds of containers (as long as they store value types, not pointers). Also Qt makes it easy to create complex and efficient value types by using QSharedData and the copy on write mechanism, but that's an entirely different story.

In our example I will use the following simple Bookmark class:

class Bookmark
{
public:
    Bookmark();
    Bookmark( const Bookmark& other );
    ~Bookmark();

    const QString& name() const { return m_name; }
    // ... other getters/setters

    Bookmark& operator =( const Bookmark& other );

    friend QDataStream& operator <<( QDataStream& stream, const Bookmark& bookmark );
    friend QDataStream& operator >>( QDataStream& stream, Bookmark& bookmark );

private:
    QString m_name;
    QUrl m_url;
};

There is a default constructor, a copy constructor and an assignment operator; that's all that's needed for a value type. Thanks to this, we can store our objects in a container like QList. The two overloaded shift operators provide support for serialization. There isn't even any need to use Q_DECLARE_METATYPE, unless we need to put the bookmark in a QVariant or use it with asynchronous signal/slot connections.

The implementation of the shift operators is straightforward:

QDataStream& operator <<( QDataStream& stream, const Bookmark& bookmark )
{
    return stream << bookmark.m_name << bookmark.m_url;
}

QDataStream& operator >>( QDataStream& stream, Bookmark& bookmark )
{
    return stream >> bookmark.m_name >> bookmark.m_url;
}

This is serialization in its true sense: the bytes of the name string are directly followed by the bytes of the URL in the data stream. Without knowing the exact sequence of data, it's not possible to determine if the next byte is part of a string or an integer, and whether the string is part of the bookmark or some other structure. This means that extra care must be taken to read the data in exactly the same order as it was written.
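To make the point about ordering concrete, here is a minimal Qt-free sketch of a length-prefixed byte stream. The helper names (putU32, putString, getU32, getString) are hypothetical and the layout is only an illustration; the real QDataStream wire format differs in detail.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Append a 32-bit unsigned value in big-endian byte order.
void putU32( std::string& out, std::uint32_t value )
{
    for ( int shift = 24; shift >= 0; shift -= 8 )
        out.push_back( static_cast<char>( ( value >> shift ) & 0xFF ) );
}

// Append a string as a 32-bit length followed by its raw bytes.
void putString( std::string& out, const std::string& text )
{
    putU32( out, static_cast<std::uint32_t>( text.size() ) );
    out += text;
}

// Read a 32-bit big-endian value at pos and advance pos.
std::uint32_t getU32( const std::string& in, std::size_t& pos )
{
    assert( pos + 4 <= in.size() );
    std::uint32_t value = 0;
    for ( int i = 0; i < 4; i++ )
        value = ( value << 8 ) | static_cast<unsigned char>( in[ pos++ ] );
    return value;
}

// Read a length-prefixed string at pos and advance pos.
std::string getString( const std::string& in, std::size_t& pos )
{
    std::uint32_t length = getU32( in, pos );
    assert( pos + length <= in.size() );
    std::string text = in.substr( pos, length );
    pos += length;
    return text;
}
```

If a string is written first and the reader asks for an integer first, getU32 happily returns the length prefix of the string. Nothing in the bytes themselves says that this is wrong.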

While this is certainly efficient and makes the code extremely simple, there is a big problem when something needs to be changed or added. Let's suppose that a newer version of our application needs to support hierarchical bookmarks. We add a QList<Bookmark> m_children member to the Bookmark class and modify the implementation of the operators:

QDataStream& operator <<( QDataStream& stream, const Bookmark& bookmark )
{
    return stream << bookmark.m_name << bookmark.m_url << bookmark.m_children;
}

QDataStream& operator >>( QDataStream& stream, Bookmark& bookmark )
{
    return stream >> bookmark.m_name >> bookmark.m_url >> bookmark.m_children;
}

The QList automatically takes care of serializing all the items it contains by recursively calling the shift operator, so it might appear that nothing else needs to be done. But what happens if the user upgrades the application from the older version, and the new version tries to read the bookmarks file created by that previous version? It will expect the list of child bookmarks just after the URL, but actually it's some entirely different, random data. Attempting to interpret it as something it's not will give unexpected results and might even result in a crash.

The solution is to include a version tag in the stream, so that we can conditionally skip some fields when reading a file created by an older version of our application. For example:

QDataStream& operator <<( QDataStream& stream, const Bookmark& bookmark )
{
    return stream << (quint8)2 << bookmark.m_name << bookmark.m_url << bookmark.m_children;
}

QDataStream& operator >>( QDataStream& stream, Bookmark& bookmark )
{
    quint8 version;
    stream >> version >> bookmark.m_name >> bookmark.m_url;
    if ( version >= 2 )
        stream >> bookmark.m_children;
    return stream;
}

Note, however, that this would only work if the first version of the application also included the version tag in the stream! Otherwise we would attempt to read the first byte of the string as the version, again leading to unexpected results and a potential crash. The lesson from this exercise is to think about this up front and always plan for change.

Using a version tag is a good and universal solution, but it only provides backward compatibility: we can use it to correctly read files created by an older version. What happens if our application attempts to read a file created by a newer version of itself? We cannot predict what changes will be made in the future, so there's not much we can do to handle such a situation. We can just close the stream and perhaps throw an exception to prevent the application from crashing.

Forward compatibility usually doesn't matter when it comes to simple configuration files. But what if the file is actually an important document that we need to send to someone else, who might have a slightly older version of the application? One solution would be to use a different format, like XML, but forward compatibility can also be achieved when using the QDataStream. I will write more about it in the next post.

Usually it makes more sense to write the version tag just once at the beginning of the file than to include it with each object. It's also a good idea to write a random "magic" value in the file header, to ensure that the file is really what we think it is. I use a class similar to the following one in my own applications to handle all this automatically:

class DataSerializer
{
public:
    DataSerializer( const QString& path ) :
        m_file( path )
    {
    }

    ~DataSerializer()
    {
    }

    bool openForReading()
    {
        if ( !m_file.open( QIODevice::ReadOnly ) )
            return false;

        m_stream.setDevice( &m_file );
        m_stream.setVersion( QDataStream::Qt_4_6 );

        qint32 header;
        m_stream >> header;

        if ( header != MagicHeader )
            return false;

        qint32 version;
        m_stream >> version;

        if ( version < MinimumVersion || version > CurrentVersion )
            return false;

        m_dataVersion = version;

        return true;
    }

    bool openForWriting()
    {
        if ( !m_file.open( QIODevice::WriteOnly | QIODevice::Truncate ) )
            return false;

        m_stream.setDevice( &m_file );
        m_stream.setVersion( QDataStream::Qt_4_6 );

        m_stream << (qint32)MagicHeader;
        m_stream << (qint32)CurrentVersion;

        m_dataVersion = CurrentVersion;

        return true;
    }

    QDataStream& stream() { return m_stream; }

    static int dataVersion() { return m_dataVersion; }

private:
    QFile m_file;
    QDataStream m_stream;

    static int m_dataVersion;

    static const int MagicHeader = 0xF517DA8D;

    static const int CurrentVersion = 1;
    static const int MinimumVersion = 1;
};

It's basically a wrapper over a file with an associated data stream. It ensures that both the magic header and the version are correct when opening the file for reading, and writes those values when opening it for writing. The CurrentVersion constant should be incremented every time something is added or changed in the serialization code of any class. The MinimumVersion constant allows us to drop support for some really old versions, especially when the data format has changed too much. The dataVersion static method makes it easy to check the actual version when reading data from the stream:

QDataStream& operator >>( QDataStream& stream, Bookmark& bookmark )
{
    stream >> bookmark.m_name >> bookmark.m_url;
    if ( DataSerializer::dataVersion() >= 2 )
        stream >> bookmark.m_children;
    return stream;
}

Note that the version is stored in a global variable, so this code is not thread safe or re-entrant, but so far it's been enough for me in all situations. More elaborate solutions may be created if necessary.
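One possible direction for such a more elaborate solution, shown here only as a sketch and assuming a C++11 compiler, is to keep the version in a thread-local variable and set it through a scope guard. The names currentDataVersion and DataVersionScope are hypothetical and not part of the class above.

```cpp
#include <cassert>

// Hypothetical replacement for a single global version: each thread gets its
// own copy, so readers running on different threads do not interfere.
inline int& currentDataVersion()
{
    thread_local int version = 0;
    return version;
}

// Sets the version for the current thread and restores the previous value
// when the scope ends, so nested readers do not leak their version.
class DataVersionScope
{
public:
    explicit DataVersionScope( int version ) :
        m_previous( currentDataVersion() )
    {
        currentDataVersion() = version;
    }

    ~DataVersionScope()
    {
        currentDataVersion() = m_previous;
    }

    DataVersionScope( const DataVersionScope& ) = delete;
    DataVersionScope& operator =( const DataVersionScope& ) = delete;

private:
    int m_previous;
};
```

A reader would create a DataVersionScope right after validating the file header, and the stream operators would call currentDataVersion() instead of a static method. This makes the code safe across threads and for nested reads, at the cost of still relying on hidden state.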

You may also notice that the DataSerializer explicitly sets the version of the data format to Qt_4_6. This is important, because Qt also has a similar versioning mechanism for its own serialization format. This may cause problems when data is read and written using different versions of the Qt libraries. Here we just enforce compatibility with the minimum supported version of Qt, in this case 4.6. Alternatively, we could store the Qt version in the header along with our internal version and verify both when opening the file.
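The alternative of storing both versions could be validated roughly as follows. This is a Qt-free sketch: the FileHeader layout, the HeaderStatus enumeration and the checkHeader function are assumptions made for illustration, and the constants only mirror the ones used by the class above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical header layout: magic value, application data version and the
// stream format version that was used to write the file.
struct FileHeader
{
    std::uint32_t magic;
    std::int32_t dataVersion;
    std::int32_t streamVersion;
};

enum class HeaderStatus { Ok, NotOurFile, UnsupportedData, UnsupportedStream };

const std::uint32_t MagicHeader = 0xF517DA8D;
const std::int32_t MinimumVersion = 1;
const std::int32_t CurrentVersion = 2;

// maxStreamVersion stands for the newest stream format that the library we
// are running with can read.
HeaderStatus checkHeader( const FileHeader& header, std::int32_t maxStreamVersion )
{
    if ( header.magic != MagicHeader )
        return HeaderStatus::NotOurFile;
    if ( header.dataVersion < MinimumVersion || header.dataVersion > CurrentVersion )
        return HeaderStatus::UnsupportedData;
    if ( header.streamVersion > maxStreamVersion )
        return HeaderStatus::UnsupportedStream;
    return HeaderStatus::Ok;
}
```

With Qt, the three header fields would have to be read with a fixed stream version first. After a successful check, the stored stream version could be passed to QDataStream::setVersion before reading the rest of the file.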

In the next post I will write more about the forward compatibility problem and about serializing complex hierarchies of objects with cross references.

Filed under: Blog

Serialization in Qt - part 1

Almost all applications need to store some data and be able to read it later, whether it's a document file or just some application settings. The data can be anything from a few integers to a complex hierarchy of objects. Although the Qt framework doesn't have built-in serialization support in the same sense as, for example, .NET or Java, it provides at least three mechanisms that can make storing and reading data easier:

  • QSettings - the standard Qt way of storing application settings. It supports both a variation of the INI file format and platform-specific storage, e.g. the Windows Registry.
  • QDomDocument - along with other classes from the QtXml module, it provides support for XML files.
  • QDataStream - can be used to read and write binary files.

Each solution has its advantages and disadvantages. The XML format is sometimes considered the only "right" way to store any kind of data. While it certainly has many advantages, the markup adds a lot of overhead, and being text based, it's not very suitable for storing data that is binary in nature. The INI format is perhaps more compact, but it's still text based and (arguably) human readable. Although it is possible to store anything that can be wrapped in a QVariant, for example a QImage, reading and writing such data is not very efficient (it has to be serialized in binary format and then converted to an escaped textual representation). Also, such an INI file is no longer human readable, not to mention editable. That makes the benefit of using an INI file over a plain binary file questionable.

Personally I use QSettings only in two situations:

  • For manipulating Registry settings in a more comfortable way than by directly using the Windows API (for example to register a custom file extension).
  • For reading auxiliary configuration files that are rarely changed, but can be altered by the user in certain situations. For example, I store the list of available languages in an INI file. Because the list is not hard-coded, new translations can be created or installed without having to recompile the whole application.

Support for XML files is nice if we need to handle one of the numerous existing file formats that are based on XML, for example SVG, RSS or OpenDocument. However, I personally don't see much point in basing a new, custom file format on XML. Unless it needs to be embedded in or mixed with other XML based file formats, or processed with an XSLT processor, using a binary format is usually a better idea. Sometimes XML based formats are seen as more "open", whatever that means, but from a technical point of view that's completely irrelevant. There are numerous examples of open, well documented binary formats.

A more reasonable argument is that XML based formats are more flexible, because new attributes and tags can be added without affecting compatibility with older and newer versions of the software. With some additional effort, this can also be achieved when using a binary format. I will write more about this topic in one of the next posts.

Another concern is the binary compatibility of data across various platforms. QDataStream nicely takes care of it by ensuring proper endianness. We just have to use types like qint32 instead of the standard C++ types when reading from/writing to the stream, to ensure that data always has the same size. On the other hand, in the case of XML it would be necessary to ensure that numeric precision is not lost when converting values to/from text.
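The precision issue with text formats is easy to reproduce. The following standard C++ sketch (the helper names toText and toRoundTripText are made up for this example) formats a double with a given number of significant digits. Only the version that uses max_digits10 is guaranteed to restore the same value.

```cpp
#include <cassert>
#include <iomanip>
#include <limits>
#include <sstream>
#include <string>

// Format a double with the given number of significant digits.
std::string toText( double value, int digits )
{
    std::ostringstream out;
    out << std::setprecision( digits ) << value;
    return out.str();
}

// Format a double with enough digits to restore exactly the same value.
std::string toRoundTripText( double value )
{
    return toText( value, std::numeric_limits<double>::max_digits10 );
}
```

For example, 0.30000000000000004 written with the default six digits becomes "0.3", which parses back to a different double. A binary stream that stores the eight bytes of the value has no such problem.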

The advantage of binary serialization is that it's very simple, fast and memory efficient. There is no additional overhead of parsing the XML markup, storing the entire DOM tree in memory, etc. It also requires much less code to be written. Manipulating the DOM tree is cumbersome and not very elegant, and using the more efficient SAX-style interface is even more difficult.

In the simplest case, the application settings can be represented by a single QVariantMap object (equivalent to QMap<QString, QVariant>). This is basically the same as what QSettings provides, except that the latter uses additional prefixes to emulate a hierarchy of groups. Note that almost anything can be a variant, including custom types, and even another QVariantMap. This makes it easy to create complex, nested data structures that can be saved and loaded back using a few lines of code:

QVariantMap settings;

MyClass instance;
settings.insert( "Key", QVariant::fromValue( instance ) );

QFile file( "settings.dat" );
file.open( QIODevice::WriteOnly ); // the stream is written to, so the file must be writable

QDataStream stream( &file );
stream << settings;

In order for a custom type to be serializable, it only has to implement the << and >> operators taking the data stream object. In addition, to be able to embed the custom type in a QVariant, it must be declared as a metatype using the Q_DECLARE_METATYPE macro and registered using the qRegisterMetaTypeStreamOperators function. I will post an example in the next article.

When reading settings back, it's important to remember about default values. Although defaults can be used when reading the values, it's often better to initialize default values which are missing from the map at startup, just after reading the configuration file. This way the default value is only provided once, and not everywhere it's used.
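The idea of filling in missing defaults once at startup can be sketched like this. The example uses std::map with string values as a stand-in for QVariantMap, and the function name applyDefaults is hypothetical.

```cpp
#include <cassert>
#include <map>
#include <string>

typedef std::map<std::string, std::string> SettingsMap;

// Copy every default whose key is missing from settings; existing values win.
void applyDefaults( SettingsMap& settings, const SettingsMap& defaults )
{
    // std::map::insert leaves an element untouched if the key already exists.
    settings.insert( defaults.begin(), defaults.end() );
}
```

Note that QMap::insert replaces the value of an existing key, so a Qt version of this function would need an explicit contains() check before inserting each default.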

Note that we don't always have to use QVariant to serialize data. If we want to have a file which stores just a list of bookmarks, we can simply serialize a QList<Bookmark>. All we need is the pair of << and >> operators. There is no need to declare a metatype; the type is static, so it doesn't have to be dynamically resolved upon deserialization. Also note that the Bookmark could even contain a nested list of child bookmarks.


Tooltips for truncated items in a QTreeView

It is quite common in various applications to display tooltips for truncated items in list views and tree views. Such functionality was present in Qt 3, but in Qt 4 the application, or rather the model, is fully responsible for providing the tooltip using the Qt::ToolTipRole and such automatic behavior no longer exists. You can obviously return the same text for both Qt::DisplayRole and Qt::ToolTipRole, but then tooltips are shown for all items, whether they are truncated or not. It doesn't look very good.

It's surprisingly hard to find a solution. The best I could find was this thread on the qt-interest mailing list. It suggests subclassing the view and overriding the tooltip event. I felt that there must be a better way, so I looked into the source code of QAbstractItemView. It turned out that since Qt 4.3, handling tooltips (and various other help events) is delegated to... the item delegate.

The definition of a custom item delegate may look like this:

class AutoToolTipDelegate : public QStyledItemDelegate
{
    Q_OBJECT
public:
    AutoToolTipDelegate( QObject* parent );
    ~AutoToolTipDelegate();

public slots:
    bool helpEvent( QHelpEvent* e, QAbstractItemView* view, const QStyleOptionViewItem& option,
        const QModelIndex& index );
};

Notice that the helpEvent method is a slot. It should be a virtual method; however adding a new virtual method to an existing class would break binary compatibility with earlier versions of the Qt library, so instead this method is invoked dynamically using the slots mechanism.

In order to check if the given item is truncated or not, we simply have to compare its visual rectangle (which can be retrieved from the view) with the size hint (provided by the item delegate itself). The full code of the helpEvent method looks like this:

bool AutoToolTipDelegate::helpEvent( QHelpEvent* e, QAbstractItemView* view,
    const QStyleOptionViewItem& option, const QModelIndex& index )
{
    if ( !e || !view )
        return false;

    if ( e->type() == QEvent::ToolTip ) {
        QRect rect = view->visualRect( index );
        QSize size = sizeHint( option, index );
        if ( rect.width() < size.width() ) {
            QVariant tooltip = index.data( Qt::DisplayRole );
            if ( tooltip.canConvert<QString>() ) {
                QToolTip::showText( e->globalPos(), QString( "<div>%1</div>" )
                    .arg( Qt::escape( tooltip.toString() ) ), view );
                return true;
            }
        }
        if ( !QStyledItemDelegate::helpEvent( e, view, option, index ) )
            QToolTip::hideText();
        return true;
    }

    return QStyledItemDelegate::helpEvent( e, view, option, index );
}

If the item is truncated, the display text is retrieved and displayed as a tooltip. Otherwise the default handler is called, so a custom tooltip may be displayed. If you want, you may reverse this behavior and only display the automatic tooltip if there is no custom one, or remove the call to the default handler if there are no custom tooltips.

Also notice that the text is wrapped in a <div> tag. That's in case the text is really long. When HTML text is passed to the tooltip, it is automatically wrapped into multiple lines if necessary. Otherwise the entire text would be displayed as a single line which may not fit on the screen. The Qt::escape function replaces any special characters with HTML entities to ensure that the text is displayed correctly.
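For readers who want to see what the escaping step amounts to, here is a Qt-free sketch. The functions escapeHtml and toolTipMarkup are stand-ins written for this example; they are not Qt's implementation and only handle the four most common special characters.

```cpp
#include <cassert>
#include <string>

// Replace the characters that have a special meaning in HTML with entities.
std::string escapeHtml( const std::string& plain )
{
    std::string result;
    result.reserve( plain.size() );
    for ( char ch : plain ) {
        switch ( ch ) {
            case '<': result += "&lt;"; break;
            case '>': result += "&gt;"; break;
            case '&': result += "&amp;"; break;
            case '"': result += "&quot;"; break;
            default: result += ch; break;
        }
    }
    return result;
}

// Wrap the escaped text in a <div> so that a rich text tooltip can wrap it.
std::string toolTipMarkup( const std::string& plain )
{
    return "<div>" + escapeHtml( plain ) + "</div>";
}
```

Without the escaping, an item named "a < b" would be interpreted as the start of a tag and part of the text could disappear from the tooltip.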

All we have to do to enable automatic tooltips for a view is to assign our delegate to it:

    view->setItemDelegate( new AutoToolTipDelegate( view ) );

Note that it will also work for other kinds of views, not only QTreeView.


Loading OpenGL functions

I already wrote about using OpenGL 3.3 with Qt applications, using new style shaders and helper classes for handling shader programs and buffers. But there is one more important thing to do before we can start writing OpenGL 3.3 applications with Qt. The problem is that usually functions and constants from OpenGL 3.3 won't be available even if we have the appropriate libraries and drivers. That's simply how OpenGL works and we have to work around this limitation.

The qgl.h header, which is used by all other headers from the QtOpenGL module, includes <GL/gl.h> (or its equivalent, depending on the platform). However, on Windows this standard header is always compatible with version 1.1 of OpenGL (even if you have the latest Platform SDK), and on systems using recent versions of Mesa (including most Linux distributions) it's compatible with version 1.3. To have all the new symbols from version 3.3, you need to include <GL/glext.h>, but that doesn't help much either. First, this header is not available on Windows. Second, it only defines typedefs for function pointers that you have to retrieve by yourself using a platform-specific function, because they are not directly exported by the OpenGL library as is the case with most other APIs. And even if they were, they may not be available on some platforms, depending on the actual version and available extensions, and you may still want your code to work without some of them.

There are some existing libraries that attempt to solve this problem by automatically loading those functions behind the scenes. The most popular ones are GLEW and GL Load (which is a part of the Unofficial GL SDK). They are cool but both are relatively huge (well over 2 MB of header files and source code) for the simple task of loading a few dozen functions. They include a bunch of extensions which are not part of the OpenGL 3.3 core profile. They are also meant to completely replace <GL/gl.h>, and although they work with Qt, it's not an elegant solution.

Qt itself also has a rather funny approach to this problem. All classes that require 2.0+ functionality (shaders, buffers, etc.) use an internal header, qglextensions_p.h. It works in a somewhat similar way to those libraries. It defines the function pointer types and constants and then defines macros which replace canonical function names with appropriate entries in an internal structure which is stored in the QGLContext. Obviously we cannot rely on it because it's internal, and besides it only defines a small set of functions and constants which are directly used by Qt.

There is also a public class QGLFunctions which is part of the API, though it's not internally used by Qt. It takes a completely different approach and instead of using macros, it's a class with methods of the same name as canonical OpenGL functions. The recommended way to use it is to inherit this class in each class that needs to use those functions. It seems like a bit of WTF to me. Even worse, it only covers OpenGL ES 2.0 which is fine for embedded applications, but not enough for a desktop application targeting OpenGL 3.3.

As you can probably guess I came up with a custom solution. The idea is that it only needs to add symbols not already defined in <GL/gl.h>, assuming that it's compatible with at least OpenGL 1.1. It also only covers the OpenGL 3.3 core profile without any additional extensions or features removed from the core profile (though those defined by <GL/gl.h> will still be available). It consists of a header file which is basically a slightly stripped version of gl3.h from the official OpenGL Registry. I basically removed everything pre-1.2 and post-3.3 and some other unnecessary stuff. Another header defines a structure holding all function pointers and all the necessary macro definitions, and a single source file contains code that initializes this structure using a QGLContext, which takes care of retrieving function pointers in a cross-platform way.
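The general shape of such a function pointer table can be sketched without any OpenGL headers. Everything in this example is hypothetical: the GLFunctions structure holds only two simplified entries, and the resolver is a plain callback that stands in for whatever retrieves function addresses in the real code (in Qt 4, QGLContext::getProcAddress could play that role).

```cpp
#include <cassert>
#include <cstring>

// Generic function pointer type; converting between function pointer types
// and back is well defined, so the table can be filled through it.
typedef void ( *GenericProc )();

// Hypothetical resolver callback that looks up a function by name.
typedef GenericProc ( *ProcResolver )( const char* name );

// Simplified stand-ins for two entries of the table.
typedef unsigned int ( *CreateShaderProc )( unsigned int type );
typedef void ( *DeleteShaderProc )( unsigned int shader );

struct GLFunctions
{
    CreateShaderProc createShader;
    DeleteShaderProc deleteShader;
};

// Fill the table; returns false if any function could not be resolved.
bool loadFunctions( GLFunctions& table, ProcResolver resolve )
{
    table.createShader = reinterpret_cast<CreateShaderProc>( resolve( "glCreateShader" ) );
    table.deleteShader = reinterpret_cast<DeleteShaderProc>( resolve( "glDeleteShader" ) );
    return table.createShader != nullptr && table.deleteShader != nullptr;
}

// Fake implementations and resolvers used only to exercise the loader.
unsigned int fakeCreateShader( unsigned int type ) { return type + 1; }
void fakeDeleteShader( unsigned int ) {}

GenericProc fakeResolver( const char* name )
{
    if ( std::strcmp( name, "glCreateShader" ) == 0 )
        return reinterpret_cast<GenericProc>( fakeCreateShader );
    if ( std::strcmp( name, "glDeleteShader" ) == 0 )
        return reinterpret_cast<GenericProc>( fakeDeleteShader );
    return nullptr;
}

GenericProc emptyResolver( const char* ) { return nullptr; }
```

Macros that map the canonical names to the entries of the table would then let the rest of the code call the functions as if they were exported directly.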

The size of all three files is a mere 120 kilobytes. Some day I may publish them as a separate mini-library, but for now you can find them in the SVN repository of Descend.


QGLShaderProgram and QGLBuffer

In the previous article I wrote that using modern OpenGL (i.e. version 3.0 and above) is possible, although the core profile cannot be used yet. I also mentioned this article which briefly describes how to use the core profile, although in fact this example will also work in the default compatibility mode. In this mode we can use both the fixed pipeline and shaders, but I will focus on the "modern" approach.

Qt has a handy class called QGLShaderProgram which wraps the OpenGL API related to shaders. A big advantage of this class is that it supports all classes related to 3D graphics provided by Qt, such as QVector3D and QMatrix4x4, as well as basic types like QColor. This way we don't have to worry about converting those types to OpenGL types. Internally this class is little more than a GLuint storing the handle of the shader program and most of its methods are simple wrappers around functions like glUniform3fv, so it's very lightweight.

Note, however, that shaders work in quite a different way depending on the version of the GLSL specification. By default version 1.20 is assumed, so your shaders can access all information known from the fixed pipeline - vertex position, normal, texture coordinates, transformation matrices, lighting parameters, etc. Things change dramatically when you put the following declaration at the beginning of the shader:

#version 330

Any attempt to access these built-in uniforms and attributes will result in an error. It means that you have to pass all information using explicitly declared uniforms and attributes. For example, to define the world-to-camera transformation matrix, you could use the following code:

    QMatrix4x4 view;
    view.translate( 0.0, 0.0, -CameraDistance );
    view.rotate( m_angle, 1.0, 0.0, 0.0 );
    view.rotate( m_rotation, 0.0, 0.0, 1.0 );
    m_program.setUniformValue( "ViewMatrix", view );

This is not only much more elegant than a series of calls to glMatrixMode, glLoadIdentity, glRotate etc., but also faster and more flexible. The vector and matrix classes provided by Qt are really handy; the authors of this class even thought about the normalMatrix method that calculates the transposed inverse (or was it inversed transpose?) for transforming normal vectors.
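As a side note, the transposed inverse and the inverse of the transpose are the same matrix, so either name works. The following sketch computes it for a plain 3x3 matrix; the Matrix3 structure and the free function normalMatrix are written for this example and are not Qt's implementation.

```cpp
#include <cassert>
#include <cmath>

// Row-major 3x3 matrix, a stand-in for the upper-left part of a 4x4 matrix.
struct Matrix3
{
    double m[ 3 ][ 3 ];
};

// Compute the transposed inverse of a; returns false if a is singular.
bool normalMatrix( const Matrix3& a, Matrix3& result )
{
    const double ( &m )[ 3 ][ 3 ] = a.m;

    // Cofactors of a, using cyclic indices so the signs come out right. The
    // inverse is the transposed cofactor matrix divided by the determinant,
    // so the transposed inverse is simply cofactors / determinant.
    double c[ 3 ][ 3 ];
    for ( int i = 0; i < 3; i++ ) {
        for ( int j = 0; j < 3; j++ ) {
            int i1 = ( i + 1 ) % 3, i2 = ( i + 2 ) % 3;
            int j1 = ( j + 1 ) % 3, j2 = ( j + 2 ) % 3;
            c[ i ][ j ] = m[ i1 ][ j1 ] * m[ i2 ][ j2 ] - m[ i1 ][ j2 ] * m[ i2 ][ j1 ];
        }
    }

    // Expand the determinant along the first row.
    double det = m[ 0 ][ 0 ] * c[ 0 ][ 0 ] + m[ 0 ][ 1 ] * c[ 0 ][ 1 ] + m[ 0 ][ 2 ] * c[ 0 ][ 2 ];
    if ( std::fabs( det ) < 1e-12 )
        return false;

    for ( int i = 0; i < 3; i++ ) {
        for ( int j = 0; j < 3; j++ )
            result.m[ i ][ j ] = c[ i ][ j ] / det;
    }
    return true;
}
```

For a pure rotation the result equals the original matrix, which is why the difference only shows up when the model is scaled non-uniformly.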

Similarly, uniforms can be used to pass lighting parameters, materials, blending information and many more things which are not possible to achieve using the fixed pipeline. When it comes to attributes, the QGLShaderProgram offers a bunch of functions for passing single values to attributes (which are not very useful in most cases) and for passing arrays of various types. However this is not recommended, because OpenGL knows nothing about the contents of these arrays and it cannot assume that they don't change between executions of the shader or between successive frames.

A much better approach is to use the setAttributeBuffer method in connection with the QGLBuffer class. Internally this method is a wrapper for glVertexAttribPointer just like the attribute array methods, but it makes the code much more readable as it explicitly states that vertex buffers are used. In addition there's no need to cast the offset to a pointer because Qt will do that for us.

The QGLBuffer class is also a very thin wrapper around a GLuint representing the vertex buffer object (or index buffer or pixel buffer object). Unlike QGLShaderProgram it's a value type (it doesn't make sense to copy a program anyway), so we can share buffers without having to worry about tracking and releasing them when they are no longer needed.

In order to use the QGLBuffer, we need to create it and fill it with data; then we can bind it with the attributes of the shader program. By using appropriate offset and stride, we can easily bind multiple attributes to a single buffer; usually all attributes of a single vertex would be stored together, followed by the remaining vertices. Don't forget about calling enableAttributeArray for each attribute. We can also use another instance of QGLBuffer to store the indexes.
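The offsets and stride for such an interleaved buffer can be derived from a plain structure. In this sketch the Vertex layout, the AttributeLayout helper and the attribute name in the comment are assumptions made for illustration; the three numbers correspond to the offset, tupleSize and stride arguments of QGLShaderProgram::setAttributeBuffer.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical vertex layout: all attributes of one vertex are stored together.
struct Vertex
{
    float position[ 3 ];
    float normal[ 3 ];
    float texCoord[ 2 ];
};

// Byte offset, number of components and stride of one attribute.
struct AttributeLayout
{
    int offset;
    int tupleSize;
    int stride;
};

const int VertexStride = static_cast<int>( sizeof( Vertex ) );

const AttributeLayout PositionLayout = { static_cast<int>( offsetof( Vertex, position ) ), 3, VertexStride };
const AttributeLayout NormalLayout = { static_cast<int>( offsetof( Vertex, normal ) ), 3, VertexStride };
const AttributeLayout TexCoordLayout = { static_cast<int>( offsetof( Vertex, texCoord ) ), 2, VertexStride };

// With a bound buffer, each layout would be passed on like this:
// program.setAttributeBuffer( "Position", GL_FLOAT, PositionLayout.offset,
//     PositionLayout.tupleSize, PositionLayout.stride );
```

Using offsetof and sizeof instead of hand-written numbers keeps the bindings correct when a member is added to or removed from the vertex structure.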

When everything is set up like this, the rendering is a matter of binding the program and both buffers to the context and calling glDrawElements. In more complex scenarios we can use multiple vertex array objects to store the bindings between vertex buffers and attributes. But since we're not using the core profile, OpenGL will create an implicit vertex array object for us.

We can also use uniform buffer objects to simplify passing lots of uniforms to multiple programs. Although Qt doesn't support them at the moment, there is a simple hack which allows us to abuse QGLBuffer. If you look at the declaration of this class you will notice that the values of the enumeration defining the type of a buffer are the same as the corresponding target constants in OpenGL. So we could simply pass GL_UNIFORM_BUFFER as the type of the buffer - I haven't tested it yet, but it should work.
