I am trying to reproduce something that System.Xml.Serialization already does, but for a different source of data. For now the task is limited to deserialization only: given a defined data source that I know how to read, write a library that takes an arbitrary type, learns about its fields/properties via reflection, then generates and compiles a "reader" class that can take the data source and an instance of that type and write from the data source into the object's fields/properties.

Here is a simplified extract from my ReflectionHelper class:

public class ReflectionHelper
{
    public abstract class FieldReader<T> 
    {
        public abstract void Fill(T entity, XDataReader reader);
    }

    public static FieldReader<T> GetFieldReader<T>()
    {
        Type t = typeof(T);
        string className = GetCSharpName(t);
        string readerClassName = Regex.Replace(className, @"\W+", "_") + "_FieldReader";
        string source = GetFieldReaderCode(t.Namespace, className, readerClassName, fields);

        CompilerParameters prms = new CompilerParameters();
        prms.GenerateInMemory = true;
        prms.ReferencedAssemblies.Add("System.Data.dll");
        prms.ReferencedAssemblies.Add(Assembly.GetExecutingAssembly().GetModules(false)[0].FullyQualifiedName);
        prms.ReferencedAssemblies.Add(t.Module.FullyQualifiedName);

        CompilerResults compiled = new CSharpCodeProvider().CompileAssemblyFromSource(prms, new string[] {source});

        if (compiled.Errors.Count > 0)
        {
            StringWriter w = new StringWriter();
            w.WriteLine("Error(s) compiling {0}:", readerClassName);
            foreach (CompilerError e in compiled.Errors)
                w.WriteLine("{0}: {1}", e.Line, e.ErrorText);
            w.WriteLine();
            w.WriteLine("Generated code:");
            w.WriteLine(source);
            throw new Exception(w.GetStringBuilder().ToString());
        }

        return (FieldReader<T>)compiled.CompiledAssembly.CreateInstance(readerClassName);
    }

    private static string GetFieldReaderCode(string ns, string className, string readerClassName, IEnumerable<EntityField> fields)
    {
        StringWriter w = new StringWriter();

        // write out field setters here

        return @"
using System;
using System.Data;

namespace " + ns + @".Generated
{
    public class " + readerClassName + @" : ReflectionHelper.FieldReader<" + className + @">
    {
        public void Fill(" + className + @" e, XDataReader reader)
        {
" + w.GetStringBuilder().ToString() + @"
        }
    }
}
";
    }
}

And the calling code:

class Program
{
    static void Main(string[] args)
    {
        ReflectionHelper.GetFieldReader<Foo>();
        Console.ReadKey(true);
    }

    private class Foo
    {
        public string Field1 = null;
        public int? Field2 = null;
    }
}

The dynamic compilation of course fails because the Foo class is not visible outside of the Program class. But! The .NET XML deserializer somehow works around that, and the question is: how? After an hour of digging through System.Xml.Serialization via Reflector I came to accept that I lack some kind of basic knowledge here and am not really sure what I am looking for...

Also, it is entirely possible that I am reinventing the wheel and/or digging in the wrong direction, in which case please do speak up!

A: 

Hi

I've been working a bit on this. I'm not sure if it will help, but I think it could be the way. Recently I worked with serialization and deserialization of a class I had to send over the network. As there were two different programs (the client and the server), at first I implemented the class in both sources and then used serialization. It failed because .NET told me the class did not have the same ID (I'm not sure, but it was some sort of assembly ID).

Well, after googling a bit I found that it was because the serialized class was in different assemblies, so the solution was to put that class in an independent library and then compile both client and server against that library. I've used the same idea with your code, so I put both the Foo class and the FieldReader class in an independent library, let's say:

namespace FooLibrary
{    
    public class Foo
    {
        public string Field1 = null;
        public int? Field2 = null;
    }

    public abstract class FieldReader<T>
    {
        public abstract void Fill(T entity, IDataReader reader);
    }    
}

Compile it and reference it from the other source (using FooLibrary;).

This is the code I've used. It's not exactly the same as yours: I don't have the code for GetCSharpName (I used t.Name instead) or for XDataReader (I used IDataReader, just so the compiler accepts the code), and I changed EntityField to object.

public class ReflectionHelper
{
    public static FieldReader<T> GetFieldReader<T>()
    {
        Type t = typeof(T);
        string className = t.Name;
        string readerClassName = Regex.Replace(className, @"\W+", "_") + "_FieldReader";
        object[] fields = new object[10];
        string source = GetFieldReaderCode(t.Namespace, className, readerClassName, fields);

        CompilerParameters prms = new CompilerParameters();
        prms.GenerateInMemory = true;
        prms.ReferencedAssemblies.Add("System.Data.dll");
        prms.ReferencedAssemblies.Add(Assembly.GetExecutingAssembly().GetModules(false)[0].FullyQualifiedName);
        prms.ReferencedAssemblies.Add(t.Module.FullyQualifiedName);
        prms.ReferencedAssemblies.Add("FooLibrary1.dll");

        CompilerResults compiled = new CSharpCodeProvider().CompileAssemblyFromSource(prms, new string[] { source });

        if (compiled.Errors.Count > 0)
        {
            StringWriter w = new StringWriter();
            w.WriteLine("Error(s) compiling {0}:", readerClassName);
            foreach (CompilerError e in compiled.Errors)
                w.WriteLine("{0}: {1}", e.Line, e.ErrorText);
            w.WriteLine();
            w.WriteLine("Generated code:");
            w.WriteLine(source);
            throw new Exception(w.GetStringBuilder().ToString());
        }

        return (FieldReader<T>)compiled.CompiledAssembly.CreateInstance(readerClassName);
    }

    private static string GetFieldReaderCode(string ns, string className, string readerClassName, IEnumerable<object> fields)
    {
        StringWriter w = new StringWriter();

        // write out field setters here

        return @"
using System;
using System.Data;

namespace " + ns + @".Generated
{
    public class " + readerClassName + @" : FieldReader<" + className + @">
    {
        public override void Fill(" + className + @" e, IDataReader reader)
        {
" + w.GetStringBuilder().ToString() + @"
        }
    }
}";
    }
}

By the way, I found a tiny mistake: you should mark the Fill method with override, as the base method is abstract.

Well, I must admit that GetFieldReader returns null, but at least the compiler compiles it.

Hope this helps, or at least guides you toward a good answer. Regards

Sunrisas
The Foo class does not need to be in a separate assembly, it just needs to be public for my original code to work. The same issue applies to your code: change Foo to private and it will break (it will have to be nested in some other class). The layout of the solution in my sample is exactly how it needs to be, otherwise there is no point in dynamic compilation in the first place. Yes, I am aware that `GetFieldReader<T>()` in my sample returns `null`; that's because it passes `readerClassName` to `CreateInstance()` without the namespace... I fixed it since then :)
liho1eye
A: 

If I try to use sgen.exe (the standalone XML serialization assembly compiler), I get the following error message:

Warning: Ignoring 'TestApp.Program'.
  - TestApp.Program is inaccessible due to its protection level. Only public types can be processed.
Warning: Ignoring 'TestApp.Program+Foo'.
  - TestApp.Program+Foo is inaccessible due to its protection level. Only public types can be processed.
Assembly 'c:\...\TestApp\bin\debug\TestApp.exe' does not contain any types that can be serialized using XmlSerializer.

Calling new XmlSerializer(typeof(Foo)) in your example code results in:

System.InvalidOperationException: TestApp.Program+Foo is inaccessible due to its protection level. Only public types can be processed.

So what gave you the idea that XmlSerializer can handle this?

However, remember that at runtime, there are no such restrictions. Trusted code using reflection is free to ignore access modifiers. This is what .NET binary serialization is doing.

For example, if you generate IL code at runtime using DynamicMethod, then you can pass skipVisibility = true to avoid any checks for visibility of fields/classes.
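A minimal sketch of that idea follows. It assumes the field is declared on a class (not a struct); SkipVisibilityDemo, BuildSetter and the nested Foo are illustrative names, not part of the question's code. It emits a small setter with DynamicMethod and passes skipVisibility = true so the generated IL may touch a private field of a private nested type.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

public static class SkipVisibilityDemo
{
    // Private nested type with a private field, similar in spirit to Foo in the question.
    private class Foo
    {
        private string Field1 = null;
        public string Peek() { return Field1; }
    }

    // Builds a delegate roughly equivalent to: ((DeclaringType)target).field = (FieldType)value;
    // Assumption: field.DeclaringType is a reference type.
    public static Action<object, object> BuildSetter(FieldInfo field)
    {
        var method = new DynamicMethod(
            "Set_" + field.Name,
            typeof(void),
            new[] { typeof(object), typeof(object) },
            field.DeclaringType.Module,
            true); // skipVisibility: JIT visibility checks are skipped for this method

        ILGenerator il = method.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0);
        il.Emit(OpCodes.Castclass, field.DeclaringType);
        il.Emit(OpCodes.Ldarg_1);
        il.Emit(OpCodes.Unbox_Any, field.FieldType); // behaves like castclass for reference types
        il.Emit(OpCodes.Stfld, field);
        il.Emit(OpCodes.Ret);

        return (Action<object, object>)method.CreateDelegate(typeof(Action<object, object>));
    }

    // Sets the private field through the generated delegate and reads it back.
    public static string Run()
    {
        var foo = new Foo();
        FieldInfo field = typeof(Foo).GetField("Field1", BindingFlags.Instance | BindingFlags.NonPublic);
        Action<object, object> setter = BuildSetter(field);
        setter(foo, "hello");
        return foo.Peek();
    }

    public static void Main()
    {
        Console.WriteLine(Run());
    }
}
```

The delegate can be cached per field, so the cost of emitting IL is paid once and later assignments avoid the overhead of FieldInfo.SetValue.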

Daniel
bollox, you're right... I can't believe I thought it did that... I guess I can live with this restriction then. Here, have some rep :) Edit: hmm, it won't let me award the bounty, it tells me to wait 6 hours
liho1eye
Please consider looking at the other answers before deciding who to award the bounty to.
Timwi
there are no other answers simply because the question is made with a wrong assumption. Daniel is 110% correct. He did exactly what a good developer should before trying to fix a problem (and which I failed to do properly): verify the evidence of the issue. In my opinion he absolutely deserves it.
liho1eye
A: 

You don’t need to create a dynamic assembly and dynamically compile code in order to deserialise an object. XmlSerializer does not do that either — it uses the Reflection API, in particular it uses the following simple concepts:

Retrieving the set of fields from any type

Reflection provides the GetFields() method for this purpose:

foreach (var field in myType.GetFields(BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic))
    // ...

I’m including the BindingFlags parameter here to ensure that it will include non-public fields, because otherwise it will return only public ones by default.

Setting the value of a field in any type

Reflection provides the function SetValue() for this purpose. You call this on a FieldInfo instance (which is returned from GetFields() above) and give it the instance in which you want to change the value of that field, and the value to set it to:

field.SetValue(myObject, myValue);

This is basically equivalent to myObject.Field = myValue;, except of course that the field is identified at runtime instead of compile-time.
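As a rough, self-contained illustration of the two calls above, the following sketch fills the fields of a private nested class from a dictionary keyed by field name. ReflectionSetDemo, Fill and the dictionary input are hypothetical names chosen for the example; only GetFields() and SetValue() come from the text.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

public static class ReflectionSetDemo
{
    // Private nested type, mirroring Foo from the question, with one non-public field.
    private class Foo
    {
        public string Field1 = null;
        private int? Field2 = null;
        public int? GetField2() { return Field2; }
    }

    // Assigns every instance field (public or not) whose name appears in 'values'.
    public static void Fill(object target, IDictionary<string, object> values)
    {
        const BindingFlags flags =
            BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic;

        foreach (FieldInfo field in target.GetType().GetFields(flags))
        {
            object value;
            if (values.TryGetValue(field.Name, out value))
                field.SetValue(target, value); // a boxed int is accepted for an int? field
        }
    }

    // Fills a Foo instance and returns its two field values joined by '/'.
    public static string Run()
    {
        var foo = new Foo();
        Fill(foo, new Dictionary<string, object> { { "Field1", "abc" }, { "Field2", 42 } });
        return foo.Field1 + "/" + foo.GetField2();
    }

    public static void Main()
    {
        Console.WriteLine(Run());
    }
}
```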

Putting it all together

Here is a simple example. Note that you would need to extend this further to work with more complex types such as arrays.

public static T Deserialize<T>(XDataReader dataReader) where T : new()
{
    return (T) deserialize(typeof(T), dataReader);
}
private static object deserialize(Type t, XDataReader dataReader)
{
    // Handle the basic, built-in types
    if (t == typeof(string))
        return dataReader.ReadString();
    // etc. for int and all the basic types

    // Looks like the type t is not built-in, so assume it’s a class.
    // Create an instance of the class
    object result = Activator.CreateInstance(t);

    // Iterate through the fields and recursively deserialize each
    foreach (var field in t.GetFields(BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic))
        field.SetValue(result, deserialize(field.FieldType, dataReader));

    return result;
}

Notice I had to make some assumptions about XDataReader, most notably that it can just read a string like that. I’m sure you’ll be able to change it so that it works with your particular reader class.

Once you’ve extended this to support all the types you need (including int? in your example class), you can deserialize an object by calling:

Foo myFoo = Deserialize<Foo>(myDataReader);

and you can do this even when Foo is a private type as it is in your example.
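One possible way to extend the sketch to int? and other nullable value types is shown below. Since XDataReader is not available here, a hypothetical TokenReader stands in for it, yielding one string token per field, with null meaning "no value". That class, the NullableDeserializeDemo name and the ordering by MetadataToken are assumptions made for the example, not something the original reader is known to do.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Reflection;

public static class NullableDeserializeDemo
{
    // Hypothetical stand-in for XDataReader: one token per field, null means "no value".
    public sealed class TokenReader
    {
        private readonly Queue<string> tokens;
        public TokenReader(IEnumerable<string> source) { tokens = new Queue<string>(source); }
        public string ReadString() { return tokens.Dequeue(); }
    }

    private class Foo
    {
        public string Field1 = null;
        public int? Field2 = null;
    }

    public static object Deserialize(Type t, TokenReader reader)
    {
        if (t == typeof(string))
            return reader.ReadString();

        // int? is Nullable<int>; GetUnderlyingType returns int for it, and null for other types.
        Type underlying = Nullable.GetUnderlyingType(t);
        if (underlying != null || t.IsPrimitive)
        {
            string raw = reader.ReadString();
            if (raw == null)
                return underlying != null ? null : Activator.CreateInstance(t);
            return Convert.ChangeType(raw, underlying ?? t, CultureInfo.InvariantCulture);
        }

        object result = Activator.CreateInstance(t);
        const BindingFlags flags =
            BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic;

        // GetFields() does not document an order; sorting by MetadataToken is a common
        // way to approximate declaration order (an assumption, not a guarantee).
        foreach (FieldInfo field in t.GetFields(flags).OrderBy(f => f.MetadataToken))
            field.SetValue(result, Deserialize(field.FieldType, reader));
        return result;
    }

    // Deserializes a Foo from two tokens and returns its field values joined by '/'.
    public static string Run()
    {
        var foo = (Foo)Deserialize(typeof(Foo), new TokenReader(new[] { "abc", "42" }));
        return foo.Field1 + "/" + foo.Field2;
    }

    public static void Main()
    {
        Console.WriteLine(Run());
    }
}
```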

Timwi
no, it uses reflection to obtain information about the type's members and generate a deserializer class. That is exactly how the last parameter for my `GetFieldReaderCode()` is obtained. The main reason why it does not use reflection to assign member values is that reflection is slow. Deserializing 1k objects this way would be a serious drag (especially for types with a lot of members)
liho1eye
Interesting. I didn’t know that. It is very slow in generating that dynamic assembly though. My own Reflection-based XML serializer is about 10 times faster for a single object; XmlSerializer is only worth it if you’re going to deserialise the same type more than 25 times. (Tested on an object graph with a total of 335 fields, XML output ~20 KB.)
Timwi
yep, that is the expectation. There is obviously a price to the initial investment of generating, compiling and loading the assembly, but that is acceptable. It is also expected that my sample code might be very suboptimal; it's just a proof of concept.
liho1eye