Hi,

Based on this answer http://stackoverflow.com/questions/288810/get-the-subdomain-from-a-url, I am trying to use code.google.com/p/domainname-parser/, which uses the Public Suffix List from publicsuffix.org to extract the subdomain and domain, so that I can manage my cookie collection.

At the moment, Domainname-parser is the only .NET code I have found on the internet that implements the list from publicsuffix.org.

To use Domainname-parser, I want to change its source code so that it can:

  1. Run on .NET 2.0.
  2. Accept a Uri object and parse its Host into subdomain, domain and TLD.
  3. Automatically download the latest list from publicsuffix.org using WebRequest and WebResponse when LastModified has changed.

That would make it more usable and always up to date. (2) and (3) should not be a problem; (1) is my focus now.

The current Domainname-parser is v1.0, built for .NET 3.5, and it uses LINQ in the code. To make it compatible with .NET 2.0, I need to convert the LINQ code to non-LINQ code, which also requires me to understand the Public Suffix List rules: Normal, Wildcard and Exception. However, I don't know LINQ or how to convert it back to ordinary code.

A converter tool might be useful, but I would rather review and modify the code line by line.

So my question is: how can I convert it? For example, this code from the FindMatchingTLDRule method in the DomainNames class:

//  Try to match an wildcard rule:
var wildcardresults = from test in TLDRulesCache.Instance.TLDRuleList
                      where
                        test.Name.Equals(checkAgainst, StringComparison.InvariantCultureIgnoreCase)
                        &&
                        test.Type == TLDRule.RuleType.Wildcard
                      select
                        test;

and also this:

        var results = from match in ruleMatches
                      orderby match.Name.Length descending
                      select match;

Is there a simple guideline to follow, or any free tool that converts statements like the ones above to ordinary C# code for .NET 2.0?

I believe no database is involved; it is just the way they deal with collections.

I am also trying to contact the domainname-parser owner to improve the code and help me solve this.

Thanks

CallMeLaNN

+1  A: 

Okay, in response to comments, here's a version returning a list.

public List<TLDRule> MatchWildcards(IEnumerable<TLDRule> rules,
                                    string checkAgainst)
{
    List<TLDRule> ret = new List<TLDRule>();
    foreach (TLDRule rule in rules)
    {
        if (rule.Name.Equals(checkAgainst, 
                             StringComparison.InvariantCultureIgnoreCase)
            && rule.Type == TLDRule.RuleType.Wildcard)
        {
            ret.Add(rule);
        }
    }
    return ret;
}

Then:

List<TLDRule> wildcardresults = MatchWildcards(
    TLDRulesCache.Instance.TLDRuleList, checkAgainst);

However, if you're converting a lot of code (and if you really have to convert it - see below) you should really learn more about LINQ. You're pretty much bound to use it eventually, and if you understand how it works you'll be in a much better position to work out how to do the conversion. Most recent C# books cover LINQ; if you have my own book (C# in Depth) then chapters 8-11 will cover everything you need to know for LINQ to Objects.

Another alternative if you're able to use VS2008 but just target .NET 2.0 is to use LINQBridge which is a reimplementation of LINQ to Objects for .NET 2.0... and it's now open source :)

Jon Skeet
Thanks for the fast response. It seems to work. Now I want to change IEnumerable<TLDRule> to List<TLDRule> so that I can use things like wildcardresults.Count. So the new statement becomes: List<TLDRule> exceptionresults = (List<TLDRule>)MatchExceptions(TLDRulesCache.Instance.TLDRuleList, checkAgainst); because I can't change the method from IEnumerable<TLDRule> MatchExceptions() to List<TLDRule> MatchExceptions() while it uses the "yield" keyword. I think that is what you meant about the type. I don't see any code that changes the checkAgainst value.
CallMeLaNN
Okay, I'll edit the method.
Jon Skeet
Oh, I see, that's what yield means. Thank you very much. There is one more piece of code I am editing: the "orderby" confuses me. It is a different case. I don't know how to write the comparison in an if statement to test match.Name.Length against the previous match object. Almost done.
CallMeLaNN
What does this mean? TLDRule primaryMatch = results.Take(1).SingleOrDefault() I think Take(1) takes the first element, like results[0]. What is SingleOrDefault()?
CallMeLaNN
I think it would be fruitless to try to explain the whole of LINQ this way, clause by clause. I *strongly* suggest you learn about LINQ from a book or tutorial.
Jon Skeet
I got this: ruleMatches.Sort(delegate(TLDRule match1, TLDRule match2) { return match2.Name.Length.CompareTo(match1.Name.Length); }); Thanks, guys. Really helpful.
CallMeLaNN
A: 

Hi,

Thanks to Jon Skeet, whose answer really helped me. It works very well and all the unit tests pass.

Here I want to share the answer for anybody who wants to use Domainname-parser in .NET 2.0.

1 Change this code (DomainName.cs)

            //  Try to match an exception rule:
            var exceptionresults = from test in TLDRulesCache.Instance.TLDRuleList
                                   where
                                     test.Name.Equals(checkAgainst, StringComparison.InvariantCultureIgnoreCase)
                                     &&
                                     test.Type == TLDRule.RuleType.Exception
                                   select
                                     test;

            //  Try to match an wildcard rule:
            var wildcardresults = from test in TLDRulesCache.Instance.TLDRuleList
                                  where
                                    test.Name.Equals(checkAgainst, StringComparison.InvariantCultureIgnoreCase)
                                    &&
                                    test.Type == TLDRule.RuleType.Wildcard
                                  select
                                    test;

            //  Try to match a normal rule:
            var normalresults = from test in TLDRulesCache.Instance.TLDRuleList
                                where
                                  test.Name.Equals(checkAgainst, StringComparison.InvariantCultureIgnoreCase)
                                  &&
                                  test.Type == TLDRule.RuleType.Normal
                                select
                                  test;

into this:

List<TLDRule> exceptionresults = MatchRule(TLDRulesCache.Instance.TLDRuleList, checkAgainst, TLDRule.RuleType.Exception);
List<TLDRule> wildcardresults = MatchRule(TLDRulesCache.Instance.TLDRuleList, checkAgainst, TLDRule.RuleType.Wildcard);
List<TLDRule> normalresults = MatchRule(TLDRulesCache.Instance.TLDRuleList, checkAgainst, TLDRule.RuleType.Normal);

    private static List<TLDRule> MatchRule(List<TLDRule> rules, string checkAgainst, TLDRule.RuleType ruleType)
    {
        List<TLDRule> matchedResult = new List<TLDRule>();
        foreach (TLDRule rule in rules)
        {
            if (rule.Name.Equals(checkAgainst, StringComparison.InvariantCultureIgnoreCase)
                && rule.Type == ruleType)
            {
                matchedResult.Add(rule);
            }
        }
        return matchedResult;
    }
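As a side note, the same filter can also be written with List<T>.FindAll and a C# 2.0 anonymous method, both of which are available in .NET 2.0. The sketch below is only an illustration: the TLDRule class in it is a hypothetical stand-in with just the members the filter needs (the real class is built from a rule string), and RuleMatcher is a made-up container name.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the library's TLDRule, reduced to what the filter uses.
public class TLDRule
{
    public enum RuleType { Normal, Wildcard, Exception }

    public readonly string Name;
    public readonly RuleType Type;

    public TLDRule(string name, RuleType type)
    {
        Name = name;
        Type = type;
    }
}

public static class RuleMatcher
{
    // Same filter as the foreach loop: keep rules of the requested type
    // whose name equals checkAgainst, ignoring case.
    public static List<TLDRule> MatchRule(List<TLDRule> rules, string checkAgainst, TLDRule.RuleType ruleType)
    {
        return rules.FindAll(delegate(TLDRule rule)
        {
            return rule.Type == ruleType
                && rule.Name.Equals(checkAgainst, StringComparison.InvariantCultureIgnoreCase);
        });
    }
}
```

FindAll always returns a new List, so the .Count property still works on the result, just as with the foreach version.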

2 Change this:

        //  Sort our matches list (longest rule wins, according to :
        var results = from match in ruleMatches
                      orderby match.Name.Length descending
                      select match;

        //  Take the top result (our primary match):
        TLDRule primaryMatch = results.Take(1).SingleOrDefault();

into this

        TLDRule primaryMatch = null;
        if (ruleMatches.Count > 0)
        {
            // match2 CompareTo match1 (reverse order) to make the descending
            ruleMatches.Sort(delegate(TLDRule match1, TLDRule match2) { return match2.Name.Length.CompareTo(match1.Name.Length); });
            primaryMatch = ruleMatches[0];
        }
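One difference worth knowing about: List<T>.Sort is documented as an unstable sort and it reorders ruleMatches in place, while the LINQ orderby is stable and leaves the source untouched. If that ever matters, a single pass over the list gives the same primary match without sorting. This is a minimal sketch, assuming only that a rule exposes a Name string; the TLDRule stand-in and the LongestRule helper are hypothetical names for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the library's TLDRule; only Name is needed here.
public class TLDRule
{
    public readonly string Name;

    public TLDRule(string name)
    {
        Name = name;
    }
}

public static class LongestRule
{
    // Returns the rule with the longest Name, or null when the list is empty.
    // On a tie the earliest rule in the list wins, which is what a stable
    // descending sort followed by taking the first element would return.
    public static TLDRule FindLongest(IList<TLDRule> ruleMatches)
    {
        TLDRule best = null;
        foreach (TLDRule match in ruleMatches)
        {
            if (best == null || match.Name.Length > best.Name.Length)
            {
                best = match;
            }
        }
        return best;
    }
}
```

With this helper the replacement would read TLDRule primaryMatch = LongestRule.FindLongest(ruleMatches); and the null result covers the empty case that SingleOrDefault() handled in the LINQ version.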

3 Change this (TLDRulesCache.cs)

            IEnumerable<TLDRule> lstTLDRules = from ruleString in lstTLDRuleStrings
                                               where
                                               !ruleString.StartsWith("//", StringComparison.InvariantCultureIgnoreCase)
                                               &&
                                               !(ruleString.Trim().Length == 0)
                                               select new TLDRule(ruleString);

into this

List<TLDRule> lstTLDRules = ListTLDRule(lstTLDRuleStrings);

    private static List<TLDRule> ListTLDRule(List<string> lstTLDRuleStrings)
    {
        List<TLDRule> lstTLDRule = new List<TLDRule>();
        foreach (string ruleString in lstTLDRuleStrings)
        {
            if (!ruleString.StartsWith("//", StringComparison.InvariantCultureIgnoreCase)
                &&
                !(ruleString.Trim().Length == 0))
            {
                lstTLDRule.Add(new TLDRule(ruleString));
            }
        }
        return lstTLDRule;
    }

There are also some other small changes, like:

List<string> lstDomainParts = domainString.Split('.').ToList<string>();

change to:

List<string> lstDomainParts = new List<string>(domainString.Split('.'));

and removing the .ToList() calls. For example, the code that followed "var exceptionresults" used exceptionresults.ToList() to get a List. Since "var exceptionresults" has changed to "List<TLDRule> exceptionresults", the .ToList() call should be removed.

CallMeLaNN

A: 

Hi, I need to get the list from publicsuffix.org via an HTTP GET request rather than keeping it locally and pointing to it in app.config, because I need to distribute an app as an .exe that uses the domainname-parser DLL. How would you implement this? How do you request a file and then store it at a specific location? Any help would be greatly appreciated. Thanks!

Mike Mitchell
I would like to edit it to support automatic download, and I am contacting the owner so I can contribute to the Google Code project. It just uses HttpWebRequest/HttpWebResponse.
CallMeLaNN
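A rough sketch of that idea, using only classes that exist in .NET 2.0, could look like the following. It is not part of Domainname-parser: SuffixListUpdater and its method names are hypothetical, and the list URL is passed in by the caller (https://publicsuffix.org/list/public_suffix_list.dat is the assumed location). The request sets IfModifiedSince from the local file's timestamp, so a server that supports conditional requests answers 304 Not Modified, which HttpWebRequest reports as a WebException.

```csharp
using System;
using System.IO;
using System.Net;

// Hypothetical helper, not part of Domainname-parser.
public static class SuffixListUpdater
{
    // Last-write time of the local copy, or DateTime.MinValue when there is no copy yet.
    public static DateTime GetLocalTimestamp(string localPath)
    {
        return File.Exists(localPath) ? File.GetLastWriteTime(localPath) : DateTime.MinValue;
    }

    // Downloads listUrl to localPath unless the server answers 304 Not Modified.
    // Returns true when a new copy was written, false when the local copy is current.
    public static bool DownloadIfModified(string listUrl, string localPath)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(listUrl);
        DateTime localStamp = GetLocalTimestamp(localPath);
        if (localStamp != DateTime.MinValue)
        {
            request.IfModifiedSince = localStamp;
        }

        // Write to a temporary file first so a failed download keeps the old copy.
        string tempPath = localPath + ".tmp";
        try
        {
            DateTime serverStamp;
            using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
            using (Stream body = response.GetResponseStream())
            using (FileStream file = File.Create(tempPath))
            {
                serverStamp = response.LastModified;
                byte[] buffer = new byte[8192];
                int read;
                while ((read = body.Read(buffer, 0, buffer.Length)) > 0)
                {
                    file.Write(buffer, 0, read);
                }
            }
            File.Copy(tempPath, localPath, true);
            File.Delete(tempPath);
            // Remember the server's Last-Modified value for the next conditional request.
            File.SetLastWriteTime(localPath, serverStamp);
            return true;
        }
        catch (WebException ex)
        {
            HttpWebResponse error = ex.Response as HttpWebResponse;
            if (error != null && error.StatusCode == HttpStatusCode.NotModified)
            {
                error.Close();
                return false;
            }
            throw;
        }
    }
}
```

A caller would run SuffixListUpdater.DownloadIfModified(url, path) before loading the rules and reload the cache only when it returns true. Whether a .NET 2.0 runtime can negotiate the HTTPS connection the server requires is a separate question that this sketch does not address.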