Blog

Serialization in Qt - part 2

Submitted by mimec on 2012-07-09

In the previous post I wrote about support for different file formats in Qt and the pros and cons of using QDataStream and a binary format. As I promised, today I will provide some more code. I will also start discussing various issues related to backward and forward compatibility of data files.

For now I will focus on simple cases like storing application settings or simple data like a list of bookmarks. I'm assuming that all serialized data have value-type semantics; i.e. they are stored and copied by value, not by pointer. A lot of classes in Qt are value types, including strings and all kinds of containers (as long as they store value types, not pointers). Also Qt makes it easy to create complex and efficient value types by using QSharedData and the copy on write mechanism, but that's an entirely different story.

In our example I will use the following simple Bookmark class:

class Bookmark
{
public:
    Bookmark();
    Bookmark( const Bookmark& other );
    ~Bookmark();

    const QString& name() const { return m_name; }
    // ... other getters/setters

    Bookmark& operator =( const Bookmark& other );

    friend QDataStream& operator <<( QDataStream& stream, const Bookmark& bookmark );
    friend QDataStream& operator >>( QDataStream& stream, Bookmark& bookmark );

private:
    QString m_name;
    QUrl m_url;
};

There is a default constructor, a copy constructor and an assignment operator, which is all that's needed for a value type. Thanks to this, we can store our objects in a container like QList. The two overloaded shift operators provide support for serialization. There isn't even any need to use Q_DECLARE_METATYPE, unless we need to put the bookmark in a QVariant or use it with asynchronous signal/slot connections.

The implementation of the shift operators is straightforward:

QDataStream& operator <<( QDataStream& stream, const Bookmark& bookmark )
{
    return stream << bookmark.m_name << bookmark.m_url;
}

QDataStream& operator >>( QDataStream& stream, Bookmark& bookmark )
{
    return stream >> bookmark.m_name >> bookmark.m_url;
}

This is serialization in its true sense: the bytes of the name string are directly followed by the bytes of the URL in the data stream. Without knowing the exact sequence of data, it's not possible to determine if the next byte is part of a string or an integer, and whether the string is part of the bookmark or some other structure. This means that extra care must be taken to read the data in exactly the same order as it was written.
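To make the point about ordering concrete, here is a minimal Qt-free sketch that uses only the standard library. It does not reproduce the way QDataStream encodes a QString or a QUrl; the writeField() and readField() helpers and the one-byte length prefix are assumptions made purely for illustration. It only shows that the stream carries no field names, so reading the fields in the wrong order silently yields wrong data.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: writes a one-byte length followed by the raw bytes.
// (Illustration only; limited to fields shorter than 256 bytes.)
void writeField( std::ostream& out, const std::string& value )
{
    assert( value.size() < 256 );
    out.put( static_cast<char>( value.size() ) );
    out.write( value.data(), static_cast<std::streamsize>( value.size() ) );
}

// Hypothetical helper: reads one field written by writeField().
std::string readField( std::istream& in )
{
    const int length = in.get();
    if ( length == std::istream::traits_type::eof() )
        return std::string();
    std::string value( static_cast<std::size_t>( length ), '\0' );
    in.read( &value[ 0 ], length );
    return value;
}

struct Bookmark
{
    std::string name;
    std::string url;
};

// Writes the name first, then the URL; nothing else goes into the stream.
std::string serialize( const Bookmark& bookmark )
{
    std::ostringstream out;
    writeField( out, bookmark.name );
    writeField( out, bookmark.url );
    return out.str();
}

// Reads the fields in the same order as serialize() wrote them.
Bookmark deserialize( const std::string& bytes )
{
    std::istringstream in( bytes );
    Bookmark bookmark;
    bookmark.name = readField( in );
    bookmark.url = readField( in );
    return bookmark;
}

// Reads the fields in the wrong order; nothing in the stream can detect this.
Bookmark deserializeSwapped( const std::string& bytes )
{
    std::istringstream in( bytes );
    Bookmark bookmark;
    bookmark.url = readField( in );
    bookmark.name = readField( in );
    return bookmark;
}
```

With deserializeSwapped() the call succeeds, but the name ends up holding the URL and vice versa. With fields of different types the result would be garbage rather than a simple swap.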

While this is certainly efficient and makes the code extremely simple, there is a big problem when something needs to be changed or added. Let's suppose that a newer version of our application needs to support hierarchical bookmarks. We add a QList<Bookmark> m_children member to the Bookmark class and modify the implementation of the operators:

QDataStream& operator <<( QDataStream& stream, const Bookmark& bookmark )
{
    return stream << bookmark.m_name << bookmark.m_url << bookmark.m_children;
}

QDataStream& operator >>( QDataStream& stream, Bookmark& bookmark )
{
    return stream >> bookmark.m_name >> bookmark.m_url >> bookmark.m_children;
}

The QList automatically takes care of serializing all the items it contains by recursively calling the shift operator, so it might appear that nothing else needs to be done. But what happens if the user upgrades the application from the older version, and the new version tries to read the bookmarks file created by that previous version? It will expect the list of child bookmarks just after the URL, but actually it's some entirely different, random data. Attempting to interpret it as something it's not will give unexpected results and might even result in a crash.

The solution is to include a version tag in the stream, so that we can conditionally skip some fields when reading a file created by an older version of our application. For example:

QDataStream& operator <<( QDataStream& stream, const Bookmark& bookmark )
{
    return stream << (quint8)2 << bookmark.m_name << bookmark.m_url << bookmark.m_children;
}

QDataStream& operator >>( QDataStream& stream, Bookmark& bookmark )
{
    quint8 version;
    stream >> version >> bookmark.m_name >> bookmark.m_url;
    if ( version >= 2 )
        stream >> bookmark.m_children;
    return stream;
}

Note, however, that this would only work if the first version of the application also included the version tag in the stream! Otherwise we would attempt to read the first byte of the string as the version, again leading to unexpected results and a potential crash. The lesson from this exercise is to think about this up front and always plan for change.

Using a version tag is a good and universal solution, but it only provides backward compatibility: we can use it to correctly read files created by an older version. What happens if our application attempts to read a file created by a newer version of itself? We cannot predict what changes will be made in the future, so there's not much we can do to handle such a situation. We can just close the stream and perhaps throw an exception to prevent the application from crashing.
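A rough sketch of such a check, again written with the standard library only so that it compiles without Qt. The constants, the HeaderStatus enum and the helper names are hypothetical. The idea is simply to read a header first and report a file from a newer version as a separate case, so the application can stop before interpreting any data and tell the user to upgrade.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical constants for illustration.
const std::uint32_t MagicHeader = 0xF517DA8Du;
const std::uint32_t MinimumVersion = 1;
const std::uint32_t CurrentVersion = 2;

enum HeaderStatus { HeaderOk, NotOurFile, TooOld, TooNew };

// Writes a 32-bit value in big-endian byte order.
void putU32( std::ostream& out, std::uint32_t value )
{
    for ( int shift = 24; shift >= 0; shift -= 8 )
        out.put( static_cast<char>( ( value >> shift ) & 0xFF ) );
}

// Reads a big-endian 32-bit value; returns false if the stream ends early.
bool getU32( std::istream& in, std::uint32_t& value )
{
    value = 0;
    for ( int i = 0; i < 4; i++ ) {
        const int byte = in.get();
        if ( byte == std::istream::traits_type::eof() )
            return false;
        value = ( value << 8 ) | static_cast<std::uint32_t>( byte );
    }
    return true;
}

void writeHeader( std::ostream& out, std::uint32_t version )
{
    putU32( out, MagicHeader );
    putU32( out, version );
}

HeaderStatus checkHeader( std::istream& in )
{
    std::uint32_t magic = 0;
    std::uint32_t version = 0;
    if ( !getU32( in, magic ) || magic != MagicHeader )
        return NotOurFile;
    if ( !getU32( in, version ) )
        return NotOurFile;
    if ( version < MinimumVersion )
        return TooOld;
    if ( version > CurrentVersion )
        return TooNew;   // written by a newer release: stop before reading any data
    return HeaderOk;
}

// Convenience wrappers for trying the functions on an in-memory buffer.
std::string headerBytes( std::uint32_t version )
{
    std::ostringstream out;
    writeHeader( out, version );
    return out.str();
}

HeaderStatus checkBytes( const std::string& bytes )
{
    std::istringstream in( bytes );
    return checkHeader( in );
}
```

Whether the caller reacts to TooNew with an error message or an exception is a design choice; the important part is that no field is read after the check fails.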

Forward compatibility usually doesn't matter when it comes to simple configuration files. But what if the file is actually an important document that we need to send to someone else, who might have a slightly older version of the application? One solution would be to use a different format, like XML, but forward compatibility can also be achieved when using the QDataStream. I will write more about it in the next post.

Usually it doesn't make sense to include the version tag with each object, but just once at the beginning of the file. It's also a good idea to write a random "magic" value in the file header, to ensure that the file is really what we think it is. I use a class similar to the following one in my own applications to handle all this automatically:

class DataSerializer
{
public:
    DataSerializer( const QString& path ) :
        m_file( path )
    {
    }

    ~DataSerializer()
    {
    }

    bool openForReading()
    {
        if ( !m_file.open( QIODevice::ReadOnly ) )
            return false;

        m_stream.setDevice( &m_file );
        m_stream.setVersion( QDataStream::Qt_4_6 );

        qint32 header;
        m_stream >> header;

        if ( header != MagicHeader )
            return false;

        qint32 version;
        m_stream >> version;

        if ( version < MinimumVersion || version > CurrentVersion )
            return false;

        m_dataVersion = version;

        return true;
    }

    bool openForWriting()
    {
        if ( !m_file.open( QIODevice::WriteOnly | QIODevice::Truncate ) )
            return false;

        m_stream.setDevice( &m_file );
        m_stream.setVersion( QDataStream::Qt_4_6 );

        m_stream << (qint32)MagicHeader;
        m_stream << (qint32)CurrentVersion;

        m_dataVersion = CurrentVersion;

        return true;
    }

    QDataStream& stream() { return m_stream; }

    static int dataVersion() { return m_dataVersion; }

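private:
    QFile m_file;
    QDataStream m_stream;

    // version of the data currently being read or written (shared by all instances)
    static int m_dataVersion;

    // the cast is needed because the literal doesn't fit in a signed 32-bit integer
    static const qint32 MagicHeader = (qint32)0xF517DA8D;

    static const int CurrentVersion = 1;
    static const int MinimumVersion = 1;
};

// a static data member must be defined in exactly one source file
int DataSerializer::m_dataVersion = 0;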

It's basically a wrapper over a file with an associated data stream. It ensures that both the magic header and the version are correct when opening the file for reading, and writes those values when opening it for writing. The CurrentVersion constant should be incremented every time something is added or changed in the serialization code of any class. The MinimumVersion constant allows us to drop support for some really old versions, especially when the data format has changed too much. The dataVersion static method makes it easy to check the actual version when reading data from the stream:

QDataStream& operator >>( QDataStream& stream, Bookmark& bookmark )
{
    stream >> bookmark.m_name >> bookmark.m_url;
    if ( DataSerializer::dataVersion() >= 2 )
        stream >> bookmark.m_children;
    return stream;
}

Note that the version is stored in a global variable, so this code is not thread safe or re-entrant, but so far it's been enough for me in all situations. More elaborate solutions may be created if necessary.
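One possible direction for such a more elaborate solution is to pass the version along with the stream instead of keeping it in a global variable. The sketch below is only an illustration of that idea, not the approach used in my applications: it is Qt-free, it reads whitespace-separated text rather than binary data to stay short, and the Reader struct, readBookmark() and the childCount field (standing in for the list of children) are all hypothetical.

```cpp
#include <sstream>
#include <string>

// Hypothetical reader context: the stream plus the version of the data in it.
struct Reader
{
    std::istream& in;
    int dataVersion;
};

struct Bookmark
{
    std::string name;
    std::string url;
    int childCount = 0;   // stands in for the list of children added in version 2
};

// Reads one record; the version comes from the context, so two readers
// working on files of different versions do not interfere with each other.
bool readBookmark( Reader& reader, Bookmark& bookmark )
{
    reader.in >> bookmark.name >> bookmark.url;
    if ( reader.dataVersion >= 2 )
        reader.in >> bookmark.childCount;
    return !reader.in.fail();
}

// Convenience wrapper for trying the function on an in-memory string.
Bookmark parse( const std::string& text, int version )
{
    std::istringstream in( text );
    Reader reader = { in, version };
    Bookmark bookmark;
    readBookmark( reader, bookmark );
    return bookmark;
}
```

The price is that the reading functions can no longer be plain operator >> overloads taking only the stream, which is why the global variable is the more convenient choice as long as only one file is processed at a time.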

You may also notice that the DataSerializer explicitly sets the version of the data format to Qt_4_6. This is important, because Qt also has a similar versioning mechanism for its own serialization format. This may cause problems when data is read and written using different versions of Qt libraries. Here we just enforce compatibility with the minimum supported version of Qt, in this case 4.6. Alternatively, we could store the Qt version in the header along with our internal version and verify both when opening the file.

In the next post I will write more about the forward compatibility problem and about serializing complex hierarchies of objects with cross references.

Serialization in Qt - part 1

Submitted by mimec on 2012-07-05

Almost all applications need to store some data and be able to read it later, whether it's a document file or just some application settings. The data can be anything from a few integers to a complex hierarchy of objects. Although the Qt framework doesn't have built-in serialization support in the same sense as, for example, .NET or Java, it provides at least three mechanisms that can make storing and reading data easier:

  • QSettings - the standard Qt way of storing application settings. It supports both a variation of the INI file format and platform specific storage, e.g. the Windows Registry.
  • QDomDocument - along with other classes from the QtXml module, it provides support for XML files.
  • QDataStream - can be used to read and write binary files.

Each solution has its advantages and disadvantages. The XML format is sometimes considered the only "right" way to store any kind of data. While it certainly has many advantages, the markup adds a lot of overhead, and being text based, it's not very suitable for storing data that is binary in its nature. The INI format is perhaps more compact, but it's still text based and (arguably) human readable. Although it is possible to store anything that can be wrapped in a QVariant, for example a QImage, reading and writing such data is not very efficient (it has to be serialized in binary format and then converted to an escaped textual representation). Also, such an INI file is no longer human readable, not to mention editable. That makes the benefit of using an INI file over a plain binary file questionable.

Personally I use QSettings only in two situations:

  • For manipulating Registry settings in a more comfortable way than by directly using the Windows API (for example to register a custom file extension).
  • For reading auxiliary configuration files that are rarely changed, but can be altered by the user in certain situations. For example, I store the list of available languages in an INI file. Because the list is not hard-coded, new translations can be created or installed without having to recompile the whole application.

Support for XML files is nice if we need to handle one of the numerous existing file formats that are based on XML, for example SVG, RSS or OpenDocument. However, I personally don't see much point in basing a new, custom file format on XML. Unless it needs to be embedded or mixed with other XML based file formats, or processed with an XSLT processor, using a binary format is usually a better idea. Sometimes XML based formats are seen as more "open", whatever that means, but from a technical point of view that's completely irrelevant. There are numerous examples of open, well documented binary formats.

A more reasonable argument is that XML based formats are more flexible, because new attributes and tags can be added without affecting compatibility with older and newer versions of the software. With some additional effort, this can also be achieved when using a binary format. I will write more about this topic in one of the next posts.

Another concern is the binary compatibility of data on various platforms. QDataStream nicely takes care of it by ensuring proper endianness. We just have to use types like qint32 instead of the standard C++ types when reading from/writing to the stream, to ensure that data always has the same size. On the other hand, in the case of XML it would be necessary to ensure that numeric precision is not lost when converting values to/from text.
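For readers who haven't dealt with byte order before, the following Qt-free sketch shows what "ensuring proper endianness" amounts to. QDataStream uses big-endian byte order by default; the encodeBigEndian() and decodeBigEndian() functions below are hypothetical helpers that mimic that behavior for a single fixed-size integer and are not part of Qt.

```cpp
#include <array>
#include <cstdint>

// Encodes a 32-bit value as four bytes, most significant byte first,
// independent of the byte order of the machine running the code.
std::array<unsigned char, 4> encodeBigEndian( std::int32_t value )
{
    const std::uint32_t bits = static_cast<std::uint32_t>( value );
    return { { static_cast<unsigned char>( bits >> 24 ),
               static_cast<unsigned char>( ( bits >> 16 ) & 0xFF ),
               static_cast<unsigned char>( ( bits >> 8 ) & 0xFF ),
               static_cast<unsigned char>( bits & 0xFF ) } };
}

// Rebuilds the value from the four bytes produced by encodeBigEndian().
std::int32_t decodeBigEndian( const std::array<unsigned char, 4>& bytes )
{
    const std::uint32_t bits = ( static_cast<std::uint32_t>( bytes[ 0 ] ) << 24 )
        | ( static_cast<std::uint32_t>( bytes[ 1 ] ) << 16 )
        | ( static_cast<std::uint32_t>( bytes[ 2 ] ) << 8 )
        | static_cast<std::uint32_t>( bytes[ 3 ] );
    return static_cast<std::int32_t>( bits );
}
```

Using a fixed-width type such as std::int32_t (or qint32 in Qt) matters for the same reason: a plain int or long may have a different size on another platform, and then the reader would consume the wrong number of bytes.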

The advantage of binary serialization is that it's very simple, fast and memory efficient. There is no additional overhead of parsing the XML markup, storing the entire DOM tree in memory, etc. It also requires much less code to be written. Manipulating the DOM tree is cumbersome and not very elegant, and using the more efficient SAX-style interface is even more difficult.

In the simplest case, the application settings can be represented by a single QVariantMap object (equivalent to QMap<QString, QVariant>). This is basically the same as what QSettings provides, except that the latter uses additional prefixes to emulate a hierarchy of groups. Note that almost anything can be a variant, including custom types, and even another QVariantMap. This makes it easy to create complex, nested data structures that can be saved and loaded back using a few lines of code:

QVariantMap settings;

MyClass instance;
settings.insert( "Key", QVariant::fromValue( instance ) );

QFile file( "settings.dat" );
file.open( QIODevice::WriteOnly );   // the settings are written here, so the file must be writable

QDataStream stream( &file );
stream << settings;

In order for a custom type to be serializable, it only has to implement the << and >> operators taking the data stream object. In addition, to be able to embed the custom type in a QVariant, it must be declared as a metatype using the Q_DECLARE_METATYPE macro and registered using the qRegisterMetaTypeStreamOperators function. I will post an example in the next article.

When reading settings back, it's important to keep default values in mind. Although defaults can be supplied each time a value is read, it's often better to fill in the values that are missing from the map at startup, just after reading the configuration file. This way each default value is only provided once, and not everywhere it's used.
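A minimal sketch of that "fill in the defaults once" step, using a std::map of strings as a stand-in for QVariantMap so that it compiles without Qt. The applyDefaults() function and the key names used to try it out are hypothetical.

```cpp
#include <map>
#include <string>

typedef std::map<std::string, std::string> Settings;

// Adds every default whose key is missing; values already loaded are kept.
void applyDefaults( Settings& settings, const Settings& defaults )
{
    // std::map::insert leaves existing keys untouched
    settings.insert( defaults.begin(), defaults.end() );
}
```

If the loaded map contains only a "Language" entry and the defaults contain "Language" and "FontSize", the loaded language is preserved and only "FontSize" is added. The rest of the application can then read both keys without repeating the default values.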

Note that we don't always have to use QVariant to serialize data. If we want to have a file which stores just a list of bookmarks, we can simply serialize a QList<Bookmark>. All we need is the pair of << and >> operators. There is no need to declare a metatype; the type is static, so it doesn't have to be dynamically resolved upon deserialization. Also note that the Bookmark could even contain a nested list of child bookmarks.

Impressions from Minecraft

Submitted by mimec on 2012-06-07

Before I get to the point, just a few updates. I just finished refreshing the components available on this website, so now they're finally all up to date. I also published some recent photos of Adam. He's growing and changing so quickly that it's hard to keep up :). Last but not least, I released version 1.0.2 of WebIssues some time ago with some minor fixes and improvements.

Why do I write about Minecraft? It's simply impossible to avoid it. I had a short adventure with the free Classic version in January, when Adam was still at the hospital. I knew I shouldn't even think about the full version, because I'd be stuck with it for a long time. I managed to forget about it until recently, when I bought Max Payne III and was looking for an online review. Then I accidentally found that an Xbox version of Minecraft is available. I didn't even finish playing Max Payne (which is great, by the way, almost as much as the original version was 12 years ago) and bought Minecraft. It's not hard to guess that it cost me a few nights with hardly any sleep.

I think that there are already a few dissertations about the Minecraft phenomenon. A world made of blocks that you can freely dig and move around is simply every geek's dream. Games like Max Payne have great graphics and action, but it's sometimes annoying that you can't just get off the linear path and venture into the Sao Paulo favelas. Games like GTA don't solve the problem - you can go anywhere you want, but there's not much interaction with the world other than shooting people and running them over. In Minecraft there are no limits. You are a god in a digital world. Add to it a few zombies and retro style graphics and you have a recipe for success.

The Xbox version of Minecraft is still quite a bit behind the PC version. It lacks enchanting, potions, a lot of biomes (there is only forest and desert, at least in my world) and structures like villages and strongholds. Hopefully they will add these features soon, but even as it is now, it's a lot of fun to play. The size of the world is limited to 1024x1024 blocks, but it's more than enough for a single player or even a few players. It took me quite a while to travel around the entire world using a boat. Also, there are still a lot of bugs. It happened to me once that after saving the game deep underground and loading it again, I was moved to the surface, surrounded by three creepers. It was quite annoying :).

I could just as well buy the PC version, but I like to separate playing from working, so I prefer not to have any games on my laptop :). Not to mention the effect of having a big screen and surround sound. Mouse and keyboard are probably more comfortable for this type of game, but you can get used to the pad. At the moment I'm a little bit burnt out, and I can definitely use a break. Fortunately in a few days we're going to Germany to visit my wife's sister. Besides, the Euro Championship is more important at the moment :). But Minecraft is definitely the kind of game that you want to keep returning to, especially when new updates are released. Keep up the good work, Mojang!

Simple XML-based UI builder for Qt4

Submitted by mimec on 2012-05-28
XMLUI

Introduction

This library provides a tool strip widget, replacing the classic menu bar and toolbars, and facilities for defining and merging the layout of actions from multiple components, using simple XML files.

Tool strips have several advantages over the traditional menu bar and toolbars. Unlike menus, all of the most commonly used actions can remain accessible with a single mouse click, while it is still possible to put less commonly used actions in popup menus attached to tool buttons. On the other hand, actions can be logically grouped and visually distinguished much better than in a traditional toolbar, which is simply a long row of similar looking icons.

The XmlUi library also provides a set of classes which simplify building the tool strip and popup menus. The layout can be defined using XML files which allows changing them easily without modifying the code. They also allow the layout of actions to be merged from multiple components, which is most useful in applications which embed various types of views or custom plug-ins.

The first version of XmlUi was inspired by the KXMLGUI classes from the KDE libraries. Later the tool strip widget was added and a simplified version of the Windows Modern Style was incorporated to provide a better look and feel for tool strips and menus. If you prefer to use traditional menu bar and toolbars, you can use the older version of XmlUi.

Documentation

You can find the full documentation for this article at doc.mimec.org/articles/xmlui/. It is also included in the source package.

History

2.1 (2012-05-28)

  • fixed toolstrip appearance on Mac OS X
  • added the execMenu() and toolStrip() helper functions
  • display shortcut of default menu item in button's tooltip when available

2.0 (2011-12-19)

  • added the new toolstrip control
  • integrated the modern Windows style

1.1 (2009-11-23)

  • added: support for toolbar buttons with menus
  • added: styling splitters in main windows
  • fixed: improved appearance of styled tab widgets
  • fixed: painting undocked toolbars

1.0 (2008-06-23)

  • initial version

Downloads

This code can be freely used and modified in both open source applications (including GPL) and commercial applications.

Reflections from POP diaries

Submitted by mimec on 2012-05-09

Recently I came across Jordan Mechner's blog and the news that he just found the original source code of Prince of Persia on some old floppy disks, after it had been lost for 22 years. That made me think about the time when I first played POP; I was no more than 10 years old and it was one of the first computer games I had ever seen. It was about that time when I started thinking that computers are fun and that I want to learn programming and create games myself.

I wonder if I also still have some floppy disks from the Amiga 500 (and later the Amiga 1200) hidden somewhere, with old pieces of code written by me. The oldest program that I wrote which survived to this day is called Polyglot. I wrote it in 1997 (being 15 years old) under the nickname "CompLex". It is still available in the Aminet archives, although only in binary form. I no longer have the source code. Maybe it still exists on the hard drive which I damaged many years ago by mounting it with screws that were too long, which caused a short-circuit :). The oldest source code which I still have is Grape3D, written almost 12 years ago. It's almost completely unreadable, with lots of bitwise operations, pointer math, abbreviated variable names and literally zero comments, but it remains a really ingenious work of art that would be hard for me to match today.

I also read Jordan's diaries from making POP in the late 80s and early 90s. They are really interesting and also quite inspiring. They also reminded me that I kept a diary between 1999 and 2007. It was mostly dedicated to various frustrations caused by my social life (or the lack of it), girls (or the inability to meet any), and general uncertainty of what I should do and what awaits me in the future. There are few mentions of the programs that I was writing at that time, because I deliberately avoided that topic. Anyway, from the perspective of a decade, life doesn't seem as bad as it used to, but it's definitely not getting any easier. It's just running much faster.

Jordan wrote a lot about his dilemma whether to write computer games or movie scripts. It's quite similar to the problem I currently have, trying to reconcile writing open source programs and the novel that I'm working on. I guess that's just the problem of people that are too creative :). There are a few major differences, though: Jordan had royalties from Karateka, and I need a full time job for living and for paying my loans; he was 21 when he started and I already turned 30 and have a wife and a kid to look after. So I'm not in a great position to disappear for half a year and write a bestseller book, or to invest in starting my own software company.

I really can't complain about my job, but I can't imagine working as an "outsourced" developer for the rest of my life, and being paid by the hour and not by the actual value of what I create. This is actually kind of frustrating and counter-productive, because the better and more efficiently I work, the less I get paid for it. There are some ideas on the horizon how to change, or at least improve, this situation. Perhaps I will finally be able to make some profit from the countless hours I spent on WebIssues. But so far, the only way I can do something to make me feel more accomplished is to pull all-nighters. I'm even doing it now writing this post. That's also not something I want to do for the rest of my life. Living from WebIssues royalties, travelling and writing books sounds much better.

Another lesson from Jordan's diaries is that even if you do a great job, there are still many things that may go wrong. Poor marketing decisions almost sank Prince of Persia, even though it was getting excellent reviews. I fear the same may happen to the commercial version of WebIssues. I know the value of this project; it can successfully compete with other applications, and the competition in this sector of the market, both open source and commercial, is very strong. But being able to make a profit from it is a completely different story. Of course, the only way to find out is to take the chance, and I will do it, but until I see some serious action going on, I will remain moderately enthusiastic about it.

Oh, and by the way, a new version of WebIssues is coming out probably next week. I'm just waiting for one Mac related bug to be fixed. And in the meantime I'm making some last minute improvements.