
Requirements of and pitfalls in Windows Phone 7 serialization

Published on Wednesday, December 8, 2010 1:50:00 PM UTC in Programming

Last time I blogged about how I like the data contract serializer for persisting data in tombstoning situations on the Windows Phone. Since then, I have been contacted by a few developers who had problems with some of the implications of that kind of serialization. This is partly because designing your types for serialization is not as trivial as it seems at first, and partly because of some very misleading information you'll run into when you receive errors during serialization. So I decided to put together a list of requirements and pitfalls you have to take into consideration when you want to use the data contract serializer. I'll start with some rather trivial things and proceed to less obvious details later on. Some of these issues also run contrary to good object-oriented practices. I'm trying to point them out so you can decide whether they're acceptable to you or not. Please note that a lot of the following is not specific to Windows Phone, but also applies to Silverlight and .NET.

Your types need to be public

Types decorated with the data contract attribute must be public. The error message you receive for non-public types is not very helpful. For example:

[DataContract]
internal class Data
{

}

If you try to serialize this internal class, you receive a security exception without any further explanation or details about what is wrong. Make sure all your classes that are part of the data contract are public. Another solution to this is to expose internal types to the assembly the data contract serializer sits in. This can be done with an assembly-scoped attribute (for example in your AssemblyInfo.cs file):

[assembly: InternalsVisibleTo("System.Runtime.Serialization")]

Please note that this is not recommended though, and in Silverlight the documentation of this attribute also says:

This API supports the .NET Framework infrastructure and is not intended to be used directly from your code.

It works however, and if you don't want to expose your types due to good practices or other design decisions, you should give it a try.

Your data members need to be public

When you decorate a property or a field of your types with the data member attribute, it has to be publicly accessible. Unfortunately, when you try to include a private member, the error message you receive is completely misleading. Example:

[DataContract]
public class Data
{
    [DataMember]
    private string _demo;
}

This will lead to an error message that says:

The data contract type 'WP7SerializerTest.Data' is not serializable because it is not public. Making the type public will fix this error. Alternatively, you can make it internal, and use the InternalsVisibleToAttribute attribute on your assembly in order to enable serialization of internal members - see documentation for more details. Be aware that doing so has certain security implications.

Clearly the type already is public, and the real cause of the error is the non-public field that is marked with the data member attribute. In fact, the error message is the one I would've expected in the first example above. In this situation, it's simply wrong. The solution to the problem is obvious for this simple example, but I can see how people pull their hair out when they have a huge class with lots of members and don't know how to solve the problem based on that error message.

From a design point of view this can be a very bad requirement. I often work with private or protected properties that really are not meant to be visible or even manipulated from the outside, and exposing them to the public just for serialization is hard to justify for a lot of developers.

Your data members need public setters and getters

If you decorate properties with the data member attribute, they need to have both a public getter and setter on Windows Phone. Depending on the property, you will either receive an error during serialization or when you try to deserialize it:

[DataContract]
public class Data
{
    private string _demo;

    [DataMember]
    public string Demo
    {
        get
        {
            return _demo;
        }
    }

    public Data()
    {
        _demo = "Blubb";
    }
}

A missing setter will be detected during the process of serialization:

No set method for property 'Demo' in type 'WP7SerializerTest.Data'.

[DataContract]
public class Data
{
    [DataMember]
    public string Demo
    {
        get;
        private set;
    }

    public Data()
    {
        Demo = "Blubb";
    }
}

This on the other hand will result in an error during deserialization, due to the limited trust level:

The data contract type 'WP7SerializerTest.Data' cannot be deserialized in partial trust because the property 'Demo' does not have a public setter.

Properties without setters often are calculated properties that solely depend on other properties. The solution then is simple: you don't include those in the data contract. The second case however can really cause moral conflicts. Having a non-public setter often is good design in scenarios where you want to prevent a property from being set externally. Removing that restriction can require a lot of discipline, especially when multiple developers are working with the same code. When a critical property can be changed from the outside, your data may end up with inconsistencies that introduce subtle errors to your application.
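For the first case, calculated properties, here is a minimal sketch of what leaving a property out of the contract can look like. The Person type, its members and the RoundTrip helper are made up for illustration and are not taken from a real project:

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;

[DataContract]
public class Person
{
    [DataMember]
    public string FirstName { get; set; }

    [DataMember]
    public string LastName { get; set; }

    // Calculated from other members: no [DataMember] attribute,
    // so the serializer ignores it and no setter is needed.
    public string FullName
    {
        get { return FirstName + " " + LastName; }
    }
}

public static class Program
{
    // Hypothetical helper: serializes to memory and reads the object back.
    public static Person RoundTrip(Person source)
    {
        var serializer = new DataContractSerializer(typeof(Person));
        using (var stream = new MemoryStream())
        {
            serializer.WriteObject(stream, source);
            stream.Position = 0;
            return (Person)serializer.ReadObject(stream);
        }
    }

    public static void Main()
    {
        Person copy = RoundTrip(new Person { FirstName = "Ada", LastName = "Lovelace" });

        // FullName is recomputed from the restored members.
        Console.WriteLine(copy.FullName); // Ada Lovelace
    }
}
```

Since FullName is derived from the two serialized members, nothing is lost by keeping it out of the contract.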

Your constructors will never be called

This is something where a lot of people trip and fall. It's not obvious unless you know it or read it in the documentation, and it can result in those moments where you debug an error and want to scream "impossible!" because what you see is, well, simply not possible. Except with serialization, which makes the impossible possible. Take this example:

[DataContract]
public class Data
{
    private string _demo;

    public Data()
    {
        _demo = "Blubb";
    }
}

One would assume that after the creation of a Data object, "_demo" has the value "Blubb", which is true in normal circumstances. However, when that type is deserialized, the constructor is never called. That means that when you use the data contract serializer and pull one of these objects from isolated storage for example, the value of "_demo" will be null, which is the default value of the string type.

A solution to this is to move all the initialization logic to a separate method, and call that method both from the constructor (for normal instantiation) and manually after deserialization. If you want to make sure the object is not initialized multiple times, you can add a simple flag to your class that you check before and set after running the initialization code.
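The approach described above could look roughly like the following sketch. The Initialize method, the _initialized flag, the Name member and the RoundTrip helper are names I picked for this illustration:

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;

[DataContract]
public class Data
{
    // Neither field is part of the contract, so both hold their default
    // values (false and null) right after deserialization.
    private bool _initialized;
    private string _demo;

    [DataMember]
    public string Name { get; set; }

    public string Demo
    {
        get { return _demo; }
    }

    public Data()
    {
        Initialize();
    }

    // Safe to call repeatedly: the flag turns later calls into no-ops.
    public void Initialize()
    {
        if (_initialized)
        {
            return;
        }

        _demo = "Blubb";
        _initialized = true;
    }
}

public static class Program
{
    // Hypothetical helper: serializes to memory and reads the object back.
    public static Data RoundTrip(Data source)
    {
        var serializer = new DataContractSerializer(typeof(Data));
        using (var stream = new MemoryStream())
        {
            serializer.WriteObject(stream, source);
            stream.Position = 0;
            return (Data)serializer.ReadObject(stream);
        }
    }

    public static void Main()
    {
        Data copy = RoundTrip(new Data { Name = "Test" });
        Console.WriteLine(copy.Demo == null); // True: the constructor did not run

        copy.Initialize();                    // manual call after deserialization
        Console.WriteLine(copy.Demo);         // Blubb
    }
}
```

The flag itself is deliberately not a data member. If it were serialized, it would come back as true and the manual Initialize() call would be skipped.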

Your references won't be restored automatically

One of my favorites. It can introduce everything from very subtle errors to complete failure, and especially the first category can be really disheartening. Take the following example, which I've created after an actual case I came across in a real project:

[DataContract]
public class Data
{
    [DataMember]
    public SubData SelectedSubData
    {
        get;
        set;
    }

    [DataMember]
    public IList<SubData> SubData
    {
        get;
        set;
    }
}

[DataContract]
public class SubData
{
    [DataMember]
    public string Content
    {
        get;
        set;
    }

    public SubData(string content)
    {
        Content = content;
    }
}

What we've got here is a simple situation where an object has a list of items and keeps track of the currently selected item. To test the behavior, you can use the following code:

// set up the data
Data data = new Data();
data.SubData = new List<SubData>()
{
    new SubData("First"),
    new SubData("Second")
};
data.SelectedSubData = data.SubData.First();

// serialize and deserialize...

bool equal = data.SelectedSubData.Content == data.SubData.First().Content; // true
equal = data.SelectedSubData == data.SubData.First(); // false!

The interesting result after running that code is that you end up with three SubData objects instead of two. The "SelectedSubData" property is serialized and deserialized as an independent object, and the last line that compares the references will fail. SelectedSubData, although it has the exact same content as the first item in the list, is not identical to that item. If you tried to get the index of the SelectedSubData item in the SubData list by calling IndexOf(), it would return -1. This is an example of a subtle error, because everything succeeds at first, but may fail later on. Or even worse, it may lead to data inconsistency or corruption that is not noticed until it is too late.
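If you want to reproduce this without a phone, the following sketch fills in the serialization step with an in-memory round trip. The RoundTrip helper is my own addition for this illustration; the two types are the ones from above:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.Serialization;

[DataContract]
public class Data
{
    [DataMember]
    public SubData SelectedSubData { get; set; }

    [DataMember]
    public IList<SubData> SubData { get; set; }
}

[DataContract]
public class SubData
{
    [DataMember]
    public string Content { get; set; }

    public SubData(string content)
    {
        Content = content;
    }
}

public static class Program
{
    // Hypothetical helper: serializes to memory and reads the object back.
    public static Data RoundTrip(Data source)
    {
        var serializer = new DataContractSerializer(typeof(Data));
        using (var stream = new MemoryStream())
        {
            serializer.WriteObject(stream, source);
            stream.Position = 0;
            return (Data)serializer.ReadObject(stream);
        }
    }

    public static void Main()
    {
        var data = new Data();
        data.SubData = new List<SubData> { new SubData("First"), new SubData("Second") };
        data.SelectedSubData = data.SubData[0];

        Data copy = RoundTrip(data);

        // Same content, but a different object that is not in the list.
        Console.WriteLine(copy.SelectedSubData.Content == copy.SubData[0].Content); // True
        Console.WriteLine(ReferenceEquals(copy.SelectedSubData, copy.SubData[0]));  // False
        Console.WriteLine(copy.SubData.IndexOf(copy.SelectedSubData));              // -1
    }
}
```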

In other cases, when you're working with a lot of references, you may come across situations where the serialization process seems to stall or even abort with an exception. For example, when you have a lot of cross-linked objects with references to neighbors, parents and children, you may end up with thousands of serialized objects (that actually all are the same), or the process might never come to an end due to circular references.

In the above example, you could work around the problem by storing the selected index instead, and compute the selected item at runtime with that information. Since the selected index is a value type, the state of the object would be correct after deserialization. However, there is a simpler solution to this.

[DataContract]
public class Data
{
    [DataMember]
    public SubData SelectedSubData
    {
        get;
        set;
    }

    [DataMember]
    public IList<SubData> SubData
    {
        get;
        set;
    }
}

[DataContract(IsReference = true)]
public class SubData
{
    [DataMember]
    public string Content
    {
        get;
        set;
    }

    public SubData(string content)
    {
        Content = content;
    }
}

Please note the additional "IsReference = true" argument for the data contract attribute of the SubData class. This setting tells the serializer that object references for this type should be preserved. When you run the sample code from above with this changed definition, you'll see that the items are restored correctly this time. This is achieved by the use of additional reference information in the XML output.
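To look at that reference information yourself, you can serialize to a string and print it. The following is a sketch under the assumption that you run it on the desktop .NET runtime; the ToXml and FromXml helpers are names I made up. There I would expect the first occurrence of an object to carry an Id attribute and later occurrences to point back to it with a Ref attribute, but please check the actual output on your platform.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.Serialization;
using System.Text;

[DataContract]
public class Data
{
    [DataMember]
    public SubData SelectedSubData { get; set; }

    [DataMember]
    public IList<SubData> SubData { get; set; }
}

[DataContract(IsReference = true)]
public class SubData
{
    [DataMember]
    public string Content { get; set; }

    public SubData(string content)
    {
        Content = content;
    }
}

public static class Program
{
    // Hypothetical helper: returns the serialized XML as a string.
    public static string ToXml(Data source)
    {
        var serializer = new DataContractSerializer(typeof(Data));
        using (var stream = new MemoryStream())
        {
            serializer.WriteObject(stream, source);
            byte[] bytes = stream.ToArray();
            return Encoding.UTF8.GetString(bytes, 0, bytes.Length);
        }
    }

    // Hypothetical helper: restores a Data object from the XML string.
    public static Data FromXml(string xml)
    {
        var serializer = new DataContractSerializer(typeof(Data));
        using (var stream = new MemoryStream(Encoding.UTF8.GetBytes(xml)))
        {
            return (Data)serializer.ReadObject(stream);
        }
    }

    public static void Main()
    {
        var data = new Data();
        data.SubData = new List<SubData> { new SubData("First"), new SubData("Second") };
        data.SelectedSubData = data.SubData[0];

        string xml = ToXml(data);
        Console.WriteLine(xml); // inspect the reference attributes here

        Data copy = FromXml(xml);

        // With IsReference = true the selected item is the list item again.
        Console.WriteLine(ReferenceEquals(copy.SelectedSubData, copy.SubData[0])); // True
    }
}
```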

You may ask yourself why "IsReference" doesn't have a default value of true, because a contract most likely consists of reference types where you want to avoid those subtle errors. I believe this may have historic reasons. Formerly, the data contract serializer used non-standards-compliant XML to preserve object references. For maximum compatibility, that was disabled by default. Now the mechanism relies on the standard ID and IDREF attributes, which makes the serialized XML consumable by a broader range of targets, but it seems likely to me that the default behavior has not been switched to avoid breaking changes for existing systems.

Conclusion

I hope this will answer some of the most frequent questions that arise with serialization using data contracts on the Windows Phone. If you have additional issues you want to have added to the list, feel free to put them in the comments or email me.

Tags: Serialization · Silverlight · Windows Phone 7