ansaurus

Question

Answer 1

+3 A:

For some serious fun: apply Flyweight and store all instances in a bitmap? With a small memory device you don't need 4 byte pointers.

[Edit] With Flyweight, you can have a separate storage strategy for each field. I do not suggest to directly store the string value in the bitmap, but you could store an index.

The type is not stored in the bitmap, but in the unique object factory.

Stephan Eggermont 2009-09-04 21:15:38

How in your bitmap model are you indicating the type of the stored data? Are you parsing the block as you go along?. It is feasible to use the mode you describe but is tricky. Storing strings for example.

ShuggyCoUk 2009-09-06 13:15:37

I merged your idea on the WithOne/WithTwo etc into mine and made community wiki. If you feel it can be improved further feel free.

ShuggyCoUk 2009-09-07 09:03:47

Answer 2

A:

Some sort of binary serialization will often do much better than XML serialization. You'll have to try it out for your specific data structures to see if you gain much.

Check out MSDN an example using BinaryFormatter.

Larsenal 2009-09-04 21:21:45

XmlSerialization is the only purpose of this class library-

PVitt 2009-09-04 21:31:42

Thanks for clarifying.

Larsenal 2009-09-04 21:35:07

Answer 3

A:

Build your own serializing in order to minimize your structure. And serialize to binary and not xml.

Something along the lines of:

internal void Save(BinaryWriter w)
{
    w.Write(this.id);
    w.Write(this.name);
    byte[] bytes = Encoding.UTF8.GetBytes(this.MyString);
    w.Write(bytes.Length);
    w.Write(bytes);

    w.Write(this.tags.Count); // nested struct/class        
    foreach (Tag tag in this.tags)
    {
        tag.Save(w);
    }
}

and have a constructor which builds it back up

public MyClass(BinaryReader reader)
{
   this.id = reader.ReadUInt32();
   etc.

}

Mikael Svenson 2009-09-04 21:27:07

Answer 4

+2 A:

It is probably good to know that the XmlSerializer doesn't care about your internal object layout, it only cares about your public fields and properties. You can hide the internal memory optimizations behind your property accessors, and the XmlSerializer wouldn't even know.

For instance, if you know that you usually have only 2 references set, but on occasion more, you can store the two frequent ones as part of your main object, and hide the infrequent ones inside an object[] or ListDictionary or a specialized private class of your own making. However, be careful that each indirect container object also contains overhead, as it needs to be a reference type. Or when you have 8 nullable integers as part of your public contract, internally you could use 8 regular integers and a single byte containing the is-this-int-null status as its bits.

If you want to specialize even further, perhaps create specialized subclasses depending on the available data, you would have to go the route of IXmlSerializable, but usually that's not really needed.

Ruben 2009-09-04 21:40:32

Answer 5

+8 A:

Here is a technique to aggressively reduce in memory overhead whilst allowing Xml Serialization.
Update: the orignal inline linked list idea is more efficient for 1 and 2 entries than a standard list with count construct but the use of fixed size optionals for zero, one and two cases is even more efficient.

Proviso:

This is predicated on you knowing that you really do need to shave the memory, as such (since you haven't done any coding yet) this may well be a massively premature optimization.

Also this design is predicated on the optional fields being very rare.

I use double as a 'placeholder' whatever format best allows you to represent the precision/units involved should be used.

public class WayPoint
{
  // consumes IntPtr.Size fixed cost 
  private IOptional optional = OptionalNone.Default; 

  public double Latitude  { get; set; }
  public double Longitude { get; set; }

  public double Vertical 
  { 
    get { return optional.Get<double>("Vertical") ?? 0.0; }
    set { optional = optional.Set<double>("Vertical", value); } 
  }

  [XmlIgnore] // need this pair for every value type  
  public bool VerticalSpecified 
  { 
    get { return optional.Get<double>("Vertical").HasValue;  } 
  }         
  public void ClearVertical()
  {
    optional = optional.Clear<double>("Vertical");   
  }

  public string Description // setting to null clears it
  { 
    get { return optional.GetRef<string>("Description"); }
    set { optional = optional.SetRef<string>("Description", value); } 
  }

  // Horizontal, Position, DilutionOfPrecision etc.
}

The real heavy lifting is done here:

internal interface IOptional
{
  T? Get<T>(string id) where T : struct;
  T GetRef<T>(string id) where T : class;

  IOptional Set<T>(string id, T value);
  IOptional Clear(string id);
}

internal sealed class OptionalNone : IOptional
{
  public static readonly OptionalNone Default = new OptionalNone();

  public T? Get<T>(string id)  where T : struct
  { 
    return null;
  }

  public T GetRef<T>(string id)  where T : class
  {
    return null;
  }

  public IOptional Set<T>(string id, T value)
  {
    if (value == null)
      return Clear(id);
    return new OptionalWithOne<T>(id, value);
  }

  public IOptional Clear(string id)
  {
    return this; // no effect
  }
}

The fixed size ones become more interesting to write, there is no point writing these as structs as they would be boxed to be placed within the IOptional field within the WayPoint class.

internal sealed class OptionalWithOne<X> : IOptional
{
  private string id1;
  private X value1;

  public OptionalWithOne(string id, X value)
  {
    this.id1 = id;
    this.value1 = value;
  }

  public T? Get<T>(string id)  where T : struct
  { 
    if (string.Equals(id, this.id1))
      return (T)(object)this.value1;
    return null;
  }

  public T GetRef<T>(string id)  where T : class        
  {
    if (string.Equals(id, this.id1))
      return (T)(object)this.value1;
    return null;
  }

  public IOptional Set<T>(string id, T value)
  {
    if (string.Equals(id, this.id1))
    {
      if (value == null)
        return OptionalNone.Default;
      this.value1 = (X)(object)value;
      return this;
    }
    else
    {
      if (value == null)
        return this;
      return new OptionalWithTwo<X,T>(this.id1, this.value1, id, value);
    }
  }

  public IOptional Clear(string id)
  {
    if (string.Equals(id, this.id1))
      return OptionalNone.Default;
    return this; // no effect
  }
}

Then for two (you can extend this idea as far as you want but as you can see the code gets unpleasant quickly.

internal sealed class OptionalWithTwo<X,Y> : IOptional
{
  private string id1;
  private X value1;
  private string id2;
  private Y value2;

  public OptionalWithTwo(
    string id1, X value1,
    string id2, Y value2)
  {
    this.id1 = id1;
    this.value1 = value1;
    this.id2 = id2;
    this.value2 = value2;
  }

  public T? Get<T>(string id)  where T : struct
  { 
    if (string.Equals(id, this.id1))
      return (T)(object)this.value1;
    if (string.Equals(id, this.id2))
      return (T)(object)this.value2;
    return null;
  }

  public T GetRef<T>(string id)  where T : class        
  {
    if (string.Equals(id, this.id1))
      return (T)(object)this.value1;
    if (string.Equals(id, this.id2))
      return (T)(object)this.value2;
    return null;
  }

  public IOptional Set<T>(string id, T value)
  {
    if (string.Equals(id, this.id1))
    {
      if (value == null)
        return Clear(id);
      this.value1 = (X)(object)value;
      return this;
    }
    else if (string.Equals(id, this.id2))
    {
      if (value == null)
        return Clear(id);
      this.value2 = (Y)(object)value;
      return this;
    }
    else
    {
      if (value == null)
        return this;
      return new OptionalWithMany(
        this.id1, this.value1,
        this.id2, this.value2,
        id, value);
    }
  }

  public IOptional Clear(string id)
  {
    if (string.Equals(id, this.id1))
      return new OptionalWithOne<Y>(this.id2, this.value2);
    if (string.Equals(id, this.id2))
      return new OptionalWithOne<X>(this.id1, this.value1);
    return this; // no effect
  }
}

Before finally ending with the relatively inefficient

internal sealed class OptionalWithMany : IOptional
{
  private List<string> ids = new List<string>();
  // this boxes, if you had a restricted set of data types 
  // you could do a per type list and map between them
  // it is assumed that this is sufficiently uncommon that you don't care
  private List<object> values = new List<object>();

  public OptionalWithMany(
    string id1, object value1,
    string id2, object value2,
    string id3, object value3)
  {
    this.ids.Add(id1);
    this.values.Add(value1);
    this.ids.Add(id2);
    this.values.Add(value2);
    this.ids.Add(id3);
    this.values.Add(value3);
  }

  public T? Get<T>(string id)  where T : struct
  { 
    for (int i= 0; i < this.values.Count;i++)
    { 
      if (string.Equals(id, this.ids[i]))
        return (T)this.values[i];
    }
    return null;
  }

  public T GetRef<T>(string id)  where T : class        
  {
    for (int i= 0; i < this.values.Count;i++)
    { 
      if (string.Equals(id, this.ids[i]))
        return (T)this.values[i];
    }
    return null;
  }

  public IOptional Set<T>(string id, T value)
  {
    for (int i= 0; i < this.values.Count;i++)
    { 
      if (string.Equals(id, this.ids[i]))
      {
        if (value == null)
          return Clear(id);           
        this.values[i] = value;
        return this;
      }
    }
    if (value != null)
    {
      this.ids.Add(id);
      this.values.Add(value);
    }  
    return this;  
  }

  public IOptional Clear(string id)
  {
    for (int i= 0; i < this.values.Count;i++)
    { 
      if (string.Equals(id, this.ids[i]))
      {
        this.ids.RemoveAt(i);
        this.values.RemoveAt(i);
        return ShrinkIfNeeded();
      }
    }
    return this; // no effect
  }

  private IOptional ShrinkIfNeeded()
  {
    if (this.ids.Count == 2)
    {
      //return new OptionalWithTwo<X,Y>(
      // this.ids[0], this.values[0],
      //  this.ids[1], this.values[1]);
      return (IOptional)
        typeof(OptionalWithTwo<,>).MakeGenericType(
          // this is a bit risky. 
          // your value types may not use inhertence
          this.values[0].GetType(),
          this.values[1].GetType())
        .GetConstructors().First().Invoke(
          new object[] 
          {
            this.ids[0], this.values[0],
            this.ids[1], this.values[1]
          });
    }
    return this;
  }
}

OptionalWithMany could be written rather better than this but it gives you the idea. With restricted type support you could do a global Key -> value map per type 'heap' like so:

internal struct Key
{ 
    public readonly OptionalWithMany;
    public readonly string Id;
    // define equality and hashcode as per usual
}

Then simply store the list of Id's currently in use within OptionalToMany. Shrinking would be slightly more complex (but better from a type point of view since you would scan each global 'heap' till you found the matching entry and use the type of the heap to construct the OptionalWithTwo. This would allow polymorphism in the property values.

Regardless of the internals the primary benefit of this is that the public surface of the WayPoint class hides all this entirely.

You can then set up the class however you want for serialization though attributes, IXmlSerializable (which would remove the need for the annoying xxxSpecified properties).

I used strings as Id's for simplicity in my example.
If you really care about size and speed you should change the Id's to be enumerations. Given packing behaviour this won't save you much even if you can fit all needed values into a byte but it would give you compile time sanity checking. The strings are all compile time constants so occupy next to no space (but are slower for checking equality).

I urge you to only do something like this after you check that it is needed. The plus side is that this does not limit your xml serialization so you can mould it to whatever format you desire. Also the public face of the 'data packet' can be kept clean (except for the xxxSpecified junk).

If you want to avoid the xxxSpecified hassles and you know you have some 'out of band' values you can use the following trick:

[DefaultValue(double.MaxValue)]
public double Vertical 
{ 
    get { return optional.Get<double>("Vertical") ?? double.MaxValue; }
    set { optional = optional.Set<double>("Vertical", value); } 
}

public void ClearVertical()
{
    optional = optional.ClearValue<double>("Vertical");   
}

However the rest of you API must be capable of detecting these special values. In general I would say that the specified route is better.

If a particular set of properties become 'always available' on certain devices, or in certain modes you should switch to alternate classes where the properties for those are simple ones. Since the xml form will be identical this means they can interoperate simply and easily but memory usage in those cases will be much less.

If the number of these groups becomes large you may even consider a code-gen scenario (at runtime even, though this increases your support burden considerably)

ShuggyCoUk 2009-09-04 23:54:05

would the -1 like to say why?

ShuggyCoUk 2009-09-06 10:12:51

-1 You didn't try to measure this, did you? It's allocating too many objects for those optional attributes. It takes too much memory.

Stephan Eggermont 2009-09-06 10:15:48

In the waypoint, you know how many optionals you have. A better design would be to allocate one block large enough to hold all optional values. Just reallocate when you need a larger or smaller block.

Stephan Eggermont 2009-09-06 10:31:02

Or delegate to class side

Stephan Eggermont 2009-09-06 10:32:04

I made the point right up at the top that this was assuming the additional bits were very rare, thus consuming only one intptr for no additional comments. Storing with count (and doing the equivalent of List<> but with key value pairs is also possible. but this has slightly higher cost in the case where you have only 1 additional value. Going for a self managed 'heap' for the additional ones means you could go to ushort based adressing (at cost of only 64K additional attributes globally).

ShuggyCoUk 2009-09-06 12:40:41

As to measuring It's pointless without more data on how many additional attributes will exist and how they will be distributed.I like your solution in general but note that it totally fails for the indicated need which is to *Xml serialize* the data with minimal memory.

ShuggyCoUk 2009-09-06 12:42:18

You don't store with count for a small count, you use inheritance.OptionalWithOne, OptionalWithTwo, etc.

Stephan Eggermont 2009-09-07 07:24:56

Now that I like, integrating and making CW

ShuggyCoUk 2009-09-07 07:46:59

This radically changes the internals, though not the core idea of keeping the 'surface' like it was a normal class. Hopefully any previous up votes received do not feel that this is no longer such a good idea...

ShuggyCoUk 2009-09-07 09:02:38

-1 -> +1 Much better implementation

Stephan Eggermont 2009-09-07 11:25:47

I'd send half the points your way if I could, I felt cw was the best option

ShuggyCoUk 2009-09-07 14:14:36

Answer 6

+1 A:

You can do a couple of things:

Make sure to use the smallest type possible for a particular value. For example, if you look at the schema, dgpsStationType has a min value of 0, and a max of 1023. This can be stored as a ushort. Reduce the size of these items when possible.
Make sure that your fields are 4-byte aligned. The end resulting size of your structure will be some multiple of 4-bytes in size (assuming 32-bit). The default layout for a class has the items stored sequentially. If the fields are not packed correctly, the compiler will have wasted space making sure that your fields are 4-byte aligned. You can specify the layout explicitly using StructLayoutAttribute.

Bad Example: these fields in a class take up 12 bytes. The int must take up 4 contiguous bytes, and the other members must be 4-byte aligned.

public class Bad {
   byte a;
   byte b;
   int c;
   ushort u; 
}

Better Example: these fields in a class take up 8 bytes. These fields are packed efficiently.

public class Better {
   byte a;
   byte b;
   ushort u;
   int c;
}

Reduce the size of your object graph. Each reference type takes up 8 bytes of overhead. If you've got a deep graph, that's a lot of overhead. Pull everything you can into functions that operate on data in you main class. Think more 'C' like, and less OOD.
Its still a good idea to lazy-load some optional parameters, but you should draw your line clearly. Create 1 or maybe 2 sets of 'optional' values that can be loaded or null. Each set will mandate a reference type, and its overhead.
Use structs where you can. Be careful of value-type semantics though, they can be tricky.
Consider not implementing ISerializable. Interface methods are by definition virtual. Any class with virtual methods contains a reference to a vtable (another 4 bytes). Instead implement xml serialization manually in an external class.

Matt Brunell 2009-09-07 08:25:36

ansaurus

tags:

views:

answers:

How to reduce structure size

related questions