views:

35

answers:

1

Consider this code:

class MyClass {
    string PropertyA;
    int PropertyB;
    double PropertyC;
    object PropertyD;
    static ComparisonResult Compare(MyClass a, MyClass b){
        // returns a ComparisonResult with
        // _sampleElement = a
        // _commonProperties = flags that describe the common properties of a and b
    }
}

enum SimilarityFlags {
    SharedPropertyA = 1,
    SharedPropertyB = 2,
    SharedPropertyC = 4,
    SharedPropertyD = 8
}

class ComparisonResult {
    private MyClass _sampleElement;
    private SimilarityFlags _commonProperties;

    bool Equals(object obj){
        ComparisonResult other = obj as ComparisonResult;
        if(other==null) return false;
        if(this._commonProperties != other._commonProperties) return false;
        MyClass s1 = this._sampleElement;
        MyClass s2 = other._sampleElement;
        if(_commonProperties.HasFlag(SimilarityFlags.SharedPropertyA) && s1.PropertyA != s2.PropertyA) return false; 
        if(_commonProperties.HasFlag(SimilarityFlags.SharedPropertyB) && s1.PropertyB != s2.PropertyB) return false; 
        if(_commonProperties.HasFlag(SimilarityFlags.SharedPropertyC) && s1.PropertyC != s2.PropertyC) return false; 
        if(_commonProperties.HasFlag(SimilarityFlags.SharedPropertyD) && s1.PropertyD != s2.PropertyD) return false; 
        return true;
    }

    int GetHashCode(){
        return (int)_commonProperties;
    }
}



MyClass[] array;
HashSet<ComparisonResult> possibleValues = GetAllPossibleComparisonValues(array);

How can I get all the possible values that Compare returns when it takes any two elements in the array?

Note: Compare(a, b) == Compare(b, a) and a != b

Example (pseudocode, 3 properties instead of 4):

GetAllPossibleComparisonValues( [  {"foo", 5, 0x00}, {"foo", 77, 0x00}, {"BAR", 5, 0x00}, {"foo", 5, 0x00}, {"BAR", 5, 0x00}  ] )

should return this set: [ {any, any, 0x00}, {"foo", any, 0x00}, {"foo", 5, 0x00}, {"BAR", 5, 0x00}, {any, 5, 0x00} ]

GetAllPossibleComparisonValues( [ {"foobar", 1}, {"foobar", 2},  {"foobar", 3},  {"foobar", 4} ])

should return [ {"foobar", any} ]

Currently, I'm using this algorithm:

for(int i = 0; i < array.Length - 1; i++){
    for(int j = i + 1; i < array.Length; j++){
        possibleValues.Add(MyClass.Compare(array[i], array[j]));
    }
}

but it is very inefficient, especially with long arrays where any two elements have the same ComparisonResult. After the computation, possibleValues.Count is usually very small (1..3), even for long arrays (2000+ elements).

I think it is possible to greatly improve the efficiency of the computation. For example, if Compare(array[0], array[1]) == Compare(array[0], array[2]), there's no need to call Compare(array[1], array[2])

How can I do?

A: 

This problem seems to bee the boolean satisfiability problem. If you model each value of your array as an boolean input variable, it might be possible to use the algorithms listed in the wikipedia page to get all input vectors that satisfy a certain output variable. The effort is not low, so it depends if you realy need this speedup or if you can live with your allready working solution since this is very easy to understand.

Another thing could be that you cache your allready found solutions and only modify those found vectors if i.e. a new value in one array has been added. That way you only need to find the solution vectors first. If you can apply this depends weather the possible values change very often, or if they are not altered very much.

schoetbi