.NET-style hashcodes aren't an option unless the possible range of different values for your objects is less than 2^32, as otherwise you will get false positives (and given the birthday paradox, this can happen more often than you might think, even with a great hash function). A hashcode gives a quick link to a bucket of zero-or-more items, which are then checked for equality. Hence a hashcode-based solution would require you to store a reference to each object anyway, and so can't use less memory than storing just the references.
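To put a rough number on the birthday-paradox point: for an ideal 32-bit hash, the standard approximation says you hit a 50% chance of at least one collision at around sqrt(2 · 2^32 · ln 2) items. A quick sketch of that estimate (the formula is the usual birthday-bound approximation, not anything specific to .NET's hashers):

```csharp
using System;

public static class BirthdayBound
{
    public static void Main()
    {
        // Approximate item count at which an ideal 32-bit hash reaches
        // a 50% chance of at least one collision: sqrt(2 * 2^32 * ln 2).
        double n = Math.Sqrt(2.0 * Math.Pow(2, 32) * Math.Log(2));
        Console.WriteLine(n); // roughly 77,000
    }
}
```

So well before you get anywhere near 2^32 distinct objects, relying on the hashcode alone is likely to produce a false positive.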
If the objects couldn't be garbage collected (i.e. they are still "alive" to another part of the application), then the cost of storing a reference is 4 or 8 bytes depending on architecture. If they could be GC'd, then the cost depends on the size of that object's graph.
Now, if you can create your own lossless hash object of the objects that is smaller than that, you can get a memory saving. E.g.:
public class ObjectOfInterest
{
    // all fields public for sake of simplicity in example
    public int ID;            // important: different ID means different object.
    public int ParID;         // unimportant, as it's the same for all objects processed here.
    public ParentType Parent; // just memoised based on ParID.
    public decimal Val;       // important.
    public string Name;       // unimportant for our purposes.
    public RelatedType Stuff; // memoised based on ID.
}
Then we can produce a corresponding struct:
public struct HashObject : IEquatable<HashObject>
{
    private readonly int _id;
    private readonly decimal _val;

    public HashObject(ObjectOfInterest ooi)
    {
        _id = ooi.ID;
        _val = ooi.Val;
    }

    public bool Matches(ObjectOfInterest ooi)
    {
        return _id == ooi.ID && _val == ooi.Val;
    }

    // because one of the options as to how to store *this* is hashing
    public bool Equals(HashObject ho)
    {
        return _id == ho._id && _val == ho._val;
    }

    public override bool Equals(object obj)
    {
        // "as" doesn't work on a non-nullable struct; test-and-cast instead.
        return obj is HashObject && Equals((HashObject)obj);
    }

    public override int GetHashCode()
    {
        unchecked
        {
            return _val.GetHashCode() ^ (_id << 16) ^ (_id >> 16);
        }
    }
}
Now, we store the HashObjects and use these to record what we've already dealt with. In this case we're going to take up at least 20 bytes storing this struct (16 for the decimal, 4 for the int), plus the overhead of whatever means we use to store it. That's a saving if the ObjectOfInterest instances can now be GC'd, but pointless if they're still in memory anyway.
There's a hash-and-equality approach (knowledge of the likely values can improve how good the hash is) if you decide to store these in a HashSet themselves. A HashSet won't be the most memory-efficient collection, but given the extra strain all of these comparisons put on it, you may well need the faster look-up. This is an area for experimentation over theory (especially since the details change according to your objects). If you can take the look-up time complexity of constantly scanning arrays, then that's your best bet.
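To make the HashSet variant concrete, here's a trimmed, stand-alone sketch (the types are cut down to just the identity-relevant fields from the example above; `MarkProcessed` is a hypothetical name). `HashSet<T>.Add` returns `false` when an equal element is already present, which gives you the "have we seen this?" check for free:

```csharp
using System;
using System.Collections.Generic;

public class ObjectOfInterest
{
    public int ID;       // important: different ID means different object.
    public decimal Val;  // important.
}

public readonly struct HashObject : IEquatable<HashObject>
{
    private readonly int _id;
    private readonly decimal _val;

    public HashObject(ObjectOfInterest ooi)
    {
        _id = ooi.ID;
        _val = ooi.Val;
    }

    public bool Equals(HashObject other) => _id == other._id && _val == other._val;
    public override bool Equals(object obj) => obj is HashObject ho && Equals(ho);

    public override int GetHashCode()
    {
        unchecked
        {
            return _val.GetHashCode() ^ (_id << 16) ^ (_id >> 16);
        }
    }
}

public static class Program
{
    private static readonly HashSet<HashObject> Seen = new HashSet<HashObject>();

    // Returns true the first time an equivalent object is seen, false after.
    public static bool MarkProcessed(ObjectOfInterest ooi)
    {
        return Seen.Add(new HashObject(ooi));
    }

    public static void Main()
    {
        var a = new ObjectOfInterest { ID = 1, Val = 9.5m };
        var b = new ObjectOfInterest { ID = 1, Val = 9.5m }; // same relevant fields

        Console.WriteLine(MarkProcessed(a)); // True  - first time seen
        Console.WriteLine(MarkProcessed(b)); // False - already recorded
    }
}
```

Note that only the 20-byte structs live in the set; the full ObjectOfInterest instances are free to be collected once you're done with them.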
If there isn't a possible object smaller than your original type that allows for a full relevantly-equal comparison, then this approach can't work.