I have an XML feed (which I don't control) and I am trying to figure out how to count how often certain attribute values occur within the document.

I am also parsing the XML and separating the attributes into arrays (for other functionality).

Here is a sample of my XML:

<items>
<item att1="ABC123" att2="uID" />
<item att1="ABC345" att2="uID" />
<item att1="ABC123" att2="uID" />
<item att1="ABC678" att2="uID" />
<item att1="ABC123" att2="uID" />
<item att1="XYZ123" att2="uID" />
<item att1="XYZ345" att2="uID" />
<item att1="XYZ678" att2="uID" />
</items>

I want to count the nodes for each att1 value. The att1 values will change. Once I know the frequency of each att1 value, I need to pull the att2 value of that node.

I need to find the TOP 4 items and pull the values of their attributes.

All of this needs to be done in C# code behind.

If I were using JavaScript, I would create an associative array with att1 as the key and the frequency as the value. But since I'm new to C#, I don't know how to duplicate this in C#.

So I believe I first need to find all unique att1 values in the XML. I can do this using:

IEnumerable<string> uItems = uItemsArray.Distinct();
// Where uItemsArray is a collection of all the att1 values in an array

Then I get stuck on how to compare each unique att1 value against the whole document so that the count ends up stored in a variable, an array, or some other data structure.
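For reference, the closest C# counterpart to the JavaScript associative array mentioned above is probably a Dictionary<string, int>. The following is only a minimal sketch of that idea, not the code from the question: the class name FrequencySketch and the method CountValues are hypothetical, and the array in Main is a stand-in for uItemsArray filled with the att1 values from the sample XML.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class FrequencySketch
{
    // Hypothetical helper: counts how often each value occurs,
    // using the value as the key and the frequency as the value.
    public static Dictionary<string, int> CountValues(IEnumerable<string> values)
    {
        var counts = new Dictionary<string, int>();
        foreach (string value in values)
        {
            int current;
            // TryGetValue leaves current at 0 when the key is not present yet.
            counts.TryGetValue(value, out current);
            counts[value] = current + 1;
        }
        return counts;
    }

    static void Main()
    {
        // Stand-in for uItemsArray (assumption: the att1 values from the sample XML).
        string[] uItemsArray =
        {
            "ABC123", "ABC345", "ABC123", "ABC678",
            "ABC123", "XYZ123", "XYZ345", "XYZ678"
        };

        // Sort by frequency, highest first, and keep the top four entries.
        foreach (var pair in CountValues(uItemsArray).OrderByDescending(p => p.Value).Take(4))
        {
            Console.WriteLine("{0} = {1}", pair.Key, pair.Value);
        }
    }
}
```

With this approach the Distinct() call is not needed, because the dictionary keys are already the unique att1 values. The order of entries that share the same count is not guaranteed.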

Here is the snippet I ended up using:

        XDocument doc = XDocument.Load(@"temp/salesData.xml");
        var topItems = from item in doc.Descendants("item")
                    select new
                    {
                        name = (string)item.Attribute("name"),
                        sku = (string)item.Attribute("sku"),
                        iCat = (string)item.Attribute("iCat"),
                        sTime = (string)item.Attribute("sTime"),
                        price = (string)item.Attribute("price"),
                        desc = (string)item.Attribute("desc")

                    } into node
                    group node by node.sku into grp
                    select new { 
                        sku = grp.Key,
                        name = grp.ElementAt(0).name,
                        iCat = grp.ElementAt(0).iCat,
                        sTime = grp.ElementAt(0).sTime,
                        price = grp.ElementAt(0).price,
                        desc = grp.ElementAt(0).desc,
                        Count = grp.Count() 
                    };

        _topSellers = new SalesDataObject[4];
        int topSellerIndex = 0;
        foreach (var item in topItems.OrderByDescending(x => x.Count).Take(4))
        {
            SalesDataObject topSeller = new SalesDataObject();
            topSeller.iCat = item.iCat;
            topSeller.iName = item.name;
            topSeller.iSku = item.sku;
            topSeller.sTime = Convert.ToDateTime(item.sTime);
            topSeller.iDesc = item.desc;
            topSeller.iPrice = item.price;
            _topSellers.SetValue(topSeller, topSellerIndex);
            topSellerIndex++;
        }

Thanks for all your help!

+1  A: 

If you have the values, you should be able to use LINQ's GroupBy...

        XDocument doc = XDocument.Parse(xml);
        var query = from item in doc.Descendants("item")
                    select new
                    {
                        att1 = (string)item.Attribute("att1"),
                        att2 = (string)item.Attribute("att2") // if needed
                    } into node
                    group node by node.att1 into grp
                    select new { att1 = grp.Key, Count = grp.Count() };

        foreach (var item in query.OrderByDescending(x=>x.Count).Take(4))
        {
            Console.WriteLine("{0} = {1}", item.att1, item.Count);
        }
Marc Gravell
+4  A: 

Are you using .NET 3.5? (It looks like it based on your code.) If so, I suspect this is pretty easy with LINQ to XML and LINQ to Objects. However, I'm afraid it's not clear from your example what you want. Do all the values with the same att1 also have the same att2? If so, it's something like:

var results = (from element in items.Elements("item")
              group element by element.Attribute("att1").Value into grouped
              orderby grouped.Count() descending
              select grouped.First().Attribute("att2").Value).Take(4);

I haven't tested it, but I think it should work...

  • We start off with all the item elements
  • We group them (still as elements) by their att1 value
  • We sort the groups by their size, descending so the biggest one is first
  • From each group we take the first element to find its att2 value
  • We take the top four of these results
Jon Skeet
Yes, every unique att1 value would contain some of the same att2 values, for the purposes of this question.
discorax
Also, LINQ is SO freakin' powerful... I NEED to spend more time with it! Thanks for the example. I'm trying it as well as some of the other samples.
discorax
Yup, LINQ is enormously powerful. It's wonderful. Be careful about the "some of the same att2 values" though - in my code it takes the att2 value of the *first* of each of the four most popular att1 values. That may not be what you want...
Jon Skeet
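If the first att2 value of each group is not enough, one possible variation is to collect every distinct att2 value per group instead. This is only a sketch under assumptions: the class TopGroupsSketch and the method TopAtt2Values are hypothetical names, and the att2 values in the sample data are made up so that the groups actually differ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

static class TopGroupsSketch
{
    // Hypothetical helper: for the `take` most frequent att1 values,
    // returns the distinct att2 values found in each group.
    public static List<KeyValuePair<string, List<string>>> TopAtt2Values(XElement items, int take)
    {
        return items.Elements("item")
            .GroupBy(e => (string)e.Attribute("att1"))
            .OrderByDescending(g => g.Count())
            .Take(take)
            .Select(g => new KeyValuePair<string, List<string>>(
                g.Key,
                g.Select(e => (string)e.Attribute("att2")).Distinct().ToList()))
            .ToList();
    }

    static void Main()
    {
        // Assumption: illustrative att2 values, not taken from the real feed.
        XElement items = XElement.Parse(
            @"<items>
                <item att1=""ABC123"" att2=""u1"" />
                <item att1=""ABC345"" att2=""u2"" />
                <item att1=""ABC123"" att2=""u3"" />
                <item att1=""ABC123"" att2=""u1"" />
              </items>");

        foreach (var pair in TopAtt2Values(items, 4))
        {
            Console.WriteLine("{0}: {1}", pair.Key, string.Join(", ", pair.Value.ToArray()));
        }
    }
}
```

The (string) cast on an attribute returns null when the attribute is missing, so this version does not throw on items that lack att1 or att2, unlike the .Value accessor.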
+1  A: 

You can use LINQ/XLINQ to accomplish this. Below is a sample console application I just wrote, so the code might not be optimized but it works.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;
using System.Text;

namespace FrequencyThingy
{
    class Program
    {
        static void Main(string[] args)
        {
            string data = @"<items>
                            <item att1=""ABC123"" att2=""uID"" />
                            <item att1=""ABC345"" att2=""uID"" />
                            <item att1=""ABC123"" att2=""uID"" />
                            <item att1=""ABC678"" att2=""uID"" />
                            <item att1=""ABC123"" att2=""uID"" />
                            <item att1=""XYZ123"" att2=""uID"" />
                            <item att1=""XYZ345"" att2=""uID"" />
                            <item att1=""XYZ678"" att2=""uID"" />
                            </items>";
            XDocument doc = XDocument.Parse(data);
            var grouping = doc.Root.Elements().GroupBy(item => item.Attribute("att1").Value);

            foreach (var group in grouping)
            {
                var groupArray = group.ToArray();
                Console.WriteLine("Group {0} has {1} element(s).", groupArray[0].Attribute("att1").Value, groupArray.Length);
            }

            Console.ReadKey();
        }
    }
}
Jason Jackson