views:

269

answers:

6

I have a class library containing several structures each consisting of several value and reference types. Most of the value types are mandatory, a few value types and all reference types are optional. All structures are XmlSerializable (which is mandatory).

As far as the class library is targeted to mobile devices I want to reduce the memory footprint. My first idea was to use Nullable<T> for the value types, but this increases the memory size by 4 bytes per Nullable<T>. My second idea is to pack all optional value types into a separate structure that is only instantiated when any of its members is needed. But this would force me to implement IXmlSerializable on the "main" structure.

Are there any other approaches to "shrink" the structures?

[EDIT]

Beg your pardon for this bad question. I think I have to clarify some things and get more specific:

The class library is designed to serialize data info GPX (GPS Exchange Format). The structures are e.g. Waypoint or Track. They have mandatory fields as latitude, longitude etc. Optional fields are Vertical/Horizontal/Position Dilution of Precision, a description, a link.

The library is mainly targeted to Mobile devices such as PDAs. RAM is short, but plenty of non-volatile Memory is available.

Code examples cannot be shown as far as there are none. I want to think about several pitfalls before starting the implementation.

+3  A: 

For some serious fun: apply Flyweight and store all instances in a bitmap? With a small memory device you don't need 4 byte pointers.

[Edit] With Flyweight, you can have a separate storage strategy for each field. I do not suggest to directly store the string value in the bitmap, but you could store an index.

The type is not stored in the bitmap, but in the unique object factory.

Stephan Eggermont
How in your bitmap model are you indicating the type of the stored data? Are you parsing the block as you go along?. It is feasible to use the mode you describe but is tricky. Storing strings for example.
ShuggyCoUk
I merged your idea on the WithOne/WithTwo etc into mine and made community wiki. If you feel it can be improved further feel free.
ShuggyCoUk
A: 

Some sort of binary serialization will often do much better than XML serialization. You'll have to try it out for your specific data structures to see if you gain much.

Check out MSDN an example using BinaryFormatter.

Larsenal
XmlSerialization is the only purpose of this class library-
PVitt
Thanks for clarifying.
Larsenal
A: 

Build your own serializing in order to minimize your structure. And serialize to binary and not xml.

Something along the lines of:

internal void Save(BinaryWriter w)
{
    w.Write(this.id);
    w.Write(this.name);
    byte[] bytes = Encoding.UTF8.GetBytes(this.MyString);
    w.Write(bytes.Length);
    w.Write(bytes);

    w.Write(this.tags.Count); // nested struct/class        
    foreach (Tag tag in this.tags)
    {
        tag.Save(w);
    }
}

and have a constructor which builds it back up

public MyClass(BinaryReader reader)
{
   this.id = reader.ReadUInt32();
   etc.

}
Mikael Svenson
+2  A: 

It is probably good to know that the XmlSerializer doesn't care about your internal object layout, it only cares about your public fields and properties. You can hide the internal memory optimizations behind your property accessors, and the XmlSerializer wouldn't even know.

For instance, if you know that you usually have only 2 references set, but on occasion more, you can store the two frequent ones as part of your main object, and hide the infrequent ones inside an object[] or ListDictionary or a specialized private class of your own making. However, be careful that each indirect container object also contains overhead, as it needs to be a reference type. Or when you have 8 nullable integers as part of your public contract, internally you could use 8 regular integers and a single byte containing the is-this-int-null status as its bits.

If you want to specialize even further, perhaps create specialized subclasses depending on the available data, you would have to go the route of IXmlSerializable, but usually that's not really needed.

Ruben
+8  A: 

Here is a technique to aggressively reduce in memory overhead whilst allowing Xml Serialization.
Update: the orignal inline linked list idea is more efficient for 1 and 2 entries than a standard list with count construct but the use of fixed size optionals for zero, one and two cases is even more efficient.

Proviso:

This is predicated on you knowing that you really do need to shave the memory, as such (since you haven't done any coding yet) this may well be a massively premature optimization.

Also this design is predicated on the optional fields being very rare.

I use double as a 'placeholder' whatever format best allows you to represent the precision/units involved should be used.

public class WayPoint
{
  // consumes IntPtr.Size fixed cost 
  private IOptional optional = OptionalNone.Default; 

  public double Latitude  { get; set; }
  public double Longitude { get; set; }

  public double Vertical 
  { 
    get { return optional.Get<double>("Vertical") ?? 0.0; }
    set { optional = optional.Set<double>("Vertical", value); } 
  }

  [XmlIgnore] // need this pair for every value type  
  public bool VerticalSpecified 
  { 
    get { return optional.Get<double>("Vertical").HasValue;  } 
  }         
  public void ClearVertical()
  {
    optional = optional.Clear<double>("Vertical");   
  }

  public string Description // setting to null clears it
  { 
    get { return optional.GetRef<string>("Description"); }
    set { optional = optional.SetRef<string>("Description", value); } 
  }

  // Horizontal, Position, DilutionOfPrecision etc.
}

The real heavy lifting is done here:

internal interface IOptional
{
  T? Get<T>(string id) where T : struct;
  T GetRef<T>(string id) where T : class;

  IOptional Set<T>(string id, T value);
  IOptional Clear(string id);
}

internal sealed class OptionalNone : IOptional
{
  public static readonly OptionalNone Default = new OptionalNone();

  public T? Get<T>(string id)  where T : struct
  { 
    return null;
  }

  public T GetRef<T>(string id)  where T : class
  {
    return null;
  }

  public IOptional Set<T>(string id, T value)
  {
    if (value == null)
      return Clear(id);
    return new OptionalWithOne<T>(id, value);
  }

  public IOptional Clear(string id)
  {
    return this; // no effect
  }
}

The fixed size ones become more interesting to write, there is no point writing these as structs as they would be boxed to be placed within the IOptional field within the WayPoint class.

internal sealed class OptionalWithOne<X> : IOptional
{
  private string id1;
  private X value1;

  public OptionalWithOne(string id, X value)
  {
    this.id1 = id;
    this.value1 = value;
  }

  public T? Get<T>(string id)  where T : struct
  { 
    if (string.Equals(id, this.id1))
      return (T)(object)this.value1;
    return null;
  }

  public T GetRef<T>(string id)  where T : class        
  {
    if (string.Equals(id, this.id1))
      return (T)(object)this.value1;
    return null;
  }

  public IOptional Set<T>(string id, T value)
  {
    if (string.Equals(id, this.id1))
    {
      if (value == null)
        return OptionalNone.Default;
      this.value1 = (X)(object)value;
      return this;
    }
    else
    {
      if (value == null)
        return this;
      return new OptionalWithTwo<X,T>(this.id1, this.value1, id, value);
    }
  }

  public IOptional Clear(string id)
  {
    if (string.Equals(id, this.id1))
      return OptionalNone.Default;
    return this; // no effect
  }
}

Then for two (you can extend this idea as far as you want but as you can see the code gets unpleasant quickly.

internal sealed class OptionalWithTwo<X,Y> : IOptional
{
  private string id1;
  private X value1;
  private string id2;
  private Y value2;

  public OptionalWithTwo(
    string id1, X value1,
    string id2, Y value2)
  {
    this.id1 = id1;
    this.value1 = value1;
    this.id2 = id2;
    this.value2 = value2;
  }

  public T? Get<T>(string id)  where T : struct
  { 
    if (string.Equals(id, this.id1))
      return (T)(object)this.value1;
    if (string.Equals(id, this.id2))
      return (T)(object)this.value2;
    return null;
  }

  public T GetRef<T>(string id)  where T : class        
  {
    if (string.Equals(id, this.id1))
      return (T)(object)this.value1;
    if (string.Equals(id, this.id2))
      return (T)(object)this.value2;
    return null;
  }

  public IOptional Set<T>(string id, T value)
  {
    if (string.Equals(id, this.id1))
    {
      if (value == null)
        return Clear(id);
      this.value1 = (X)(object)value;
      return this;
    }
    else if (string.Equals(id, this.id2))
    {
      if (value == null)
        return Clear(id);
      this.value2 = (Y)(object)value;
      return this;
    }
    else
    {
      if (value == null)
        return this;
      return new OptionalWithMany(
        this.id1, this.value1,
        this.id2, this.value2,
        id, value);
    }
  }

  public IOptional Clear(string id)
  {
    if (string.Equals(id, this.id1))
      return new OptionalWithOne<Y>(this.id2, this.value2);
    if (string.Equals(id, this.id2))
      return new OptionalWithOne<X>(this.id1, this.value1);
    return this; // no effect
  }
}

Before finally ending with the relatively inefficient

internal sealed class OptionalWithMany : IOptional
{
  private List<string> ids = new List<string>();
  // this boxes, if you had a restricted set of data types 
  // you could do a per type list and map between them
  // it is assumed that this is sufficiently uncommon that you don't care
  private List<object> values = new List<object>();

  public OptionalWithMany(
    string id1, object value1,
    string id2, object value2,
    string id3, object value3)
  {
    this.ids.Add(id1);
    this.values.Add(value1);
    this.ids.Add(id2);
    this.values.Add(value2);
    this.ids.Add(id3);
    this.values.Add(value3);
  }

  public T? Get<T>(string id)  where T : struct
  { 
    for (int i= 0; i < this.values.Count;i++)
    { 
      if (string.Equals(id, this.ids[i]))
        return (T)this.values[i];
    }
    return null;
  }

  public T GetRef<T>(string id)  where T : class        
  {
    for (int i= 0; i < this.values.Count;i++)
    { 
      if (string.Equals(id, this.ids[i]))
        return (T)this.values[i];
    }
    return null;
  }

  public IOptional Set<T>(string id, T value)
  {
    for (int i= 0; i < this.values.Count;i++)
    { 
      if (string.Equals(id, this.ids[i]))
      {
        if (value == null)
          return Clear(id);           
        this.values[i] = value;
        return this;
      }
    }
    if (value != null)
    {
      this.ids.Add(id);
      this.values.Add(value);
    }  
    return this;  
  }

  public IOptional Clear(string id)
  {
    for (int i= 0; i < this.values.Count;i++)
    { 
      if (string.Equals(id, this.ids[i]))
      {
        this.ids.RemoveAt(i);
        this.values.RemoveAt(i);
        return ShrinkIfNeeded();
      }
    }
    return this; // no effect
  }

  private IOptional ShrinkIfNeeded()
  {
    if (this.ids.Count == 2)
    {
      //return new OptionalWithTwo<X,Y>(
      // this.ids[0], this.values[0],
      //  this.ids[1], this.values[1]);
      return (IOptional)
        typeof(OptionalWithTwo<,>).MakeGenericType(
          // this is a bit risky. 
          // your value types may not use inhertence
          this.values[0].GetType(),
          this.values[1].GetType())
        .GetConstructors().First().Invoke(
          new object[] 
          {
            this.ids[0], this.values[0],
            this.ids[1], this.values[1]
          });
    }
    return this;
  }
}

OptionalWithMany could be written rather better than this but it gives you the idea. With restricted type support you could do a global Key -> value map per type 'heap' like so:

internal struct Key
{ 
    public readonly OptionalWithMany;
    public readonly string Id;
    // define equality and hashcode as per usual
}

Then simply store the list of Id's currently in use within OptionalToMany. Shrinking would be slightly more complex (but better from a type point of view since you would scan each global 'heap' till you found the matching entry and use the type of the heap to construct the OptionalWithTwo. This would allow polymorphism in the property values.

Regardless of the internals the primary benefit of this is that the public surface of the WayPoint class hides all this entirely.

You can then set up the class however you want for serialization though attributes, IXmlSerializable (which would remove the need for the annoying xxxSpecified properties).

I used strings as Id's for simplicity in my example.
If you really care about size and speed you should change the Id's to be enumerations. Given packing behaviour this won't save you much even if you can fit all needed values into a byte but it would give you compile time sanity checking. The strings are all compile time constants so occupy next to no space (but are slower for checking equality).

I urge you to only do something like this after you check that it is needed. The plus side is that this does not limit your xml serialization so you can mould it to whatever format you desire. Also the public face of the 'data packet' can be kept clean (except for the xxxSpecified junk).

If you want to avoid the xxxSpecified hassles and you know you have some 'out of band' values you can use the following trick:

[DefaultValue(double.MaxValue)]
public double Vertical 
{ 
    get { return optional.Get<double>("Vertical") ?? double.MaxValue; }
    set { optional = optional.Set<double>("Vertical", value); } 
}

public void ClearVertical()
{
    optional = optional.ClearValue<double>("Vertical");   
}

However the rest of you API must be capable of detecting these special values. In general I would say that the specified route is better.

If a particular set of properties become 'always available' on certain devices, or in certain modes you should switch to alternate classes where the properties for those are simple ones. Since the xml form will be identical this means they can interoperate simply and easily but memory usage in those cases will be much less.

If the number of these groups becomes large you may even consider a code-gen scenario (at runtime even, though this increases your support burden considerably)

ShuggyCoUk
would the -1 like to say why?
ShuggyCoUk
-1 You didn't try to measure this, did you? It's allocating too many objects for those optional attributes. It takes too much memory.
Stephan Eggermont
In the waypoint, you know how many optionals you have. A better design would be to allocate one block large enough to hold all optional values. Just reallocate when you need a larger or smaller block.
Stephan Eggermont
Or delegate to class side
Stephan Eggermont
I made the point right up at the top that this was assuming the additional bits were very rare, thus consuming only one intptr for no additional comments. Storing with count (and doing the equivalent of List<> but with key value pairs is also possible. but this has slightly higher cost in the case where you have only 1 additional value. Going for a self managed 'heap' for the additional ones means you could go to ushort based adressing (at cost of only 64K additional attributes globally).
ShuggyCoUk
As to measuring It's pointless without more data on how many additional attributes will exist and how they will be distributed.I like your solution in general but note that it totally fails for the indicated need which is to *Xml serialize* the data with minimal memory.
ShuggyCoUk
You don't store with count for a small count, you use inheritance.OptionalWithOne, OptionalWithTwo, etc.
Stephan Eggermont
Now that I like, integrating and making CW
ShuggyCoUk
This radically changes the internals, though not the core idea of keeping the 'surface' like it was a normal class. Hopefully any previous up votes received do not feel that this is no longer such a good idea...
ShuggyCoUk
-1 -> +1 Much better implementation
Stephan Eggermont
I'd send half the points your way if I could, I felt cw was the best option
ShuggyCoUk
+1  A: 

You can do a couple of things:

  • Make sure to use the smallest type possible for a particular value. For example, if you look at the schema, dgpsStationType has a min value of 0, and a max of 1023. This can be stored as a ushort. Reduce the size of these items when possible.

  • Make sure that your fields are 4-byte aligned. The end resulting size of your structure will be some multiple of 4-bytes in size (assuming 32-bit). The default layout for a class has the items stored sequentially. If the fields are not packed correctly, the compiler will have wasted space making sure that your fields are 4-byte aligned. You can specify the layout explicitly using StructLayoutAttribute.

Bad Example: these fields in a class take up 12 bytes. The int must take up 4 contiguous bytes, and the other members must be 4-byte aligned.

public class Bad {
   byte a;
   byte b;
   int c;
   ushort u; 
}

Better Example: these fields in a class take up 8 bytes. These fields are packed efficiently.

public class Better {
   byte a;
   byte b;
   ushort u;
   int c;
}
  • Reduce the size of your object graph. Each reference type takes up 8 bytes of overhead. If you've got a deep graph, that's a lot of overhead. Pull everything you can into functions that operate on data in you main class. Think more 'C' like, and less OOD.

  • Its still a good idea to lazy-load some optional parameters, but you should draw your line clearly. Create 1 or maybe 2 sets of 'optional' values that can be loaded or null. Each set will mandate a reference type, and its overhead.

  • Use structs where you can. Be careful of value-type semantics though, they can be tricky.

  • Consider not implementing ISerializable. Interface methods are by definition virtual. Any class with virtual methods contains a reference to a vtable (another 4 bytes). Instead implement xml serialization manually in an external class.

Matt Brunell