Hi scene!

What I have is a textbox into which the user types a string. The string will look something like this: G32:04:20:40

Then the user hits the search button. The program must then open a text file, search for the "nearest" five strings to the one they entered, and display them in a listbox.

I'll define "nearest string" as well as I can (most likely using a very long, complicated example).

The data contained in the text file looks like this:

G32:63:58:11 JG01
G32:86:98:30 JG01
G33:50:05:11 JG06
G33:03:84:12 JG05
G34:45:58:11 JG07
G35:45:20:41 JG01
G35:58:20:21 JG03

So if the user types in the string:

G33:89:03:20

The five closest results should display in the list box like this:

G33:50:05:11 JG06
G33:03:84:12 JG05
G32:86:98:30 JG01
G32:63:58:11 JG01
G34:45:58:11 JG07

I should probably point out at this point that the strings are coordinates and the value after "JG" represents the value of something at that coordinate.

The way I got to those five is by going through the string piece by piece. The user typed in "G33", so I find all those with G33 at the beginning; if there are none, I find the closest to G33. The next part was "89", so I find all those where the next part is "89"; if there are none, the closer to 89 the better, and so on.
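The piece-by-piece rule described above can be sketched as a sort on the absolute difference of each part, most significant part first. This is only a sketch under assumptions: the class and method names (`CoordinateSearch`, `ParseParts`, `Nearest`) are hypothetical, every line is assumed to start with `G` followed by four colon-separated two-digit numbers, and there is no error handling for malformed lines.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of the original post.
public static class CoordinateSearch
{
    // Splits "G33:89:03:20" (optionally followed by " JG06") into its four numeric parts.
    public static int[] ParseParts(string line)
    {
        string coord = line.Split(' ')[0];      // drop the trailing "JGxx" value, if present
        return coord.TrimStart('G')
                    .Split(':')
                    .Select(s => int.Parse(s))
                    .ToArray();
    }

    // Orders the lines by how close each part is to the query, first part first,
    // and returns at most 'count' of them.
    public static List<string> Nearest(IEnumerable<string> lines, string query, int count)
    {
        int[] q = ParseParts(query);
        return lines
            .Select(line => new { line, parts = ParseParts(line) })
            .OrderBy(x => Math.Abs(x.parts[0] - q[0]))
            .ThenBy(x => Math.Abs(x.parts[1] - q[1]))
            .ThenBy(x => Math.Abs(x.parts[2] - q[2]))
            .ThenBy(x => Math.Abs(x.parts[3] - q[3]))
            .Select(x => x.line)
            .Take(count)
            .ToList();
    }
}
```

Calling `CoordinateSearch.Nearest(lines, "G33:89:03:20", 5)` on the seven sample lines above returns the five lines in the order listed in the expected output.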

What I need to know is: how do I go about doing this? I've built the visual components and I also have code in place that deals with similar kinds of things, but when it comes to this I'm truly stumped. As you can probably tell by now, I'm rather new to C#, but I'm learning :)

EDIT: Search Code

private void btnSearch_Click(object sender, EventArgs e)
        {
            lstResult.Items.Clear();

            if (txtSearch.Text == String.Empty)
            {
                MessageBox.Show("The textbox is empty, there is nothing to search.",
                    "Textbox empty", MessageBoxButtons.OK, MessageBoxIcon.Information);
            }
            else
            {
                this.CheckFormatting();
            }

        }

        private long GetIndexForCoord(string coord)
        {
            // gets out a numerical value for each coordinate to make it easier to compare
            Regex m_regex = new Regex("\\d\\d:\\d\\d:\\d\\d:\\d\\d");
            string cleaned = m_regex.Match(coord).Value;
            cleaned = cleaned.Replace(':', '0');
            return Convert.ToInt64(cleaned);
        }

        private List<string> GetResults(string coord)
        {
            // gets out the 5 closest coordinates
            long index = GetIndexForCoord(coord);

            // First find the 5 closest indexes to the one we're looking for
            List<long> found = new List<long>();
            while (found.Count < 5)
            {
                long closest = long.MaxValue;
                long closestAbs = long.MaxValue;
                foreach (long i in m_indexes)
                {
                    if (!found.Contains(i))
                    {
                        long absIndex = Math.Abs(index - i);
                        if (absIndex < closestAbs)
                        {
                            closest = i;
                            closestAbs = absIndex;
                        }
                    }
                }
                if (closest != long.MaxValue)
                {
                    found.Add(closest);
                }
            }

            // Then use those indexes to get the coordinates from the dictionary
            List<string> s = new List<string>();
            foreach (long i in found)
            {
                s.Add(m_dic[i]);
            }
            return s;
        }

        private void CheckFormatting()
        {
            StringReader objReader = new StringReader(txtSearch.Text);

            bool FlagCheck = true;

                if (!Regex.IsMatch(txtSearch.Text,
                    "G3[0-9]{1}:[0-9]{2}:[0-9]{2}:[0-9]{2}"))
                {
                    FlagCheck = false;
                }

            if (FlagCheck == true)
            {
                this.CheckAndPopulate();
            }
            else
            {
                MessageBox.Show("Your search coordinates are not formatted correctly.",
                       "Error", MessageBoxButtons.OK, MessageBoxIcon.Error);
            }
        }

        private void CheckAndPopulate()
        {
            StreamReader objReader = new StreamReader("Jumpgate List.JG");
            List<String> v = new List<String>();
            do
            {
                v.Add(objReader.ReadLine());
            }
            while (objReader.Peek() != -1);

            objReader.Close();

            foreach (string c in v)
            {
                long index = GetIndexForCoord(c);
                m_dic.Add(index, c);
                m_indexes.Add(index);
            }

            List<string> results = GetResults(txtSearch.Text);
            foreach (string c in results)
            {
                lstResult.Items.Add(c);
            }
        }
+3  A: 

Edit: Added suggested code to read in the file. And also an explanation at the end.
Edit2: Added if check around the adding to the dictionary/list to handle duplicates.

Please note, this code is almost certainly quite inefficient (except possibly by accident) and would need to be cleaned up, with error handling added, but it might at least give you a starting point for writing better code. You probably want to take a look at the k-nearest neighbor algorithm to find out the proper way of doing this.

I've assumed that the letter at the beginning is always G and that all 4 parts of the coordinates are always 2 digits each.

Add the following two fields to your class/form:

    Dictionary<long, string> m_dic = new Dictionary<long, string>();
    List<long> m_indexes = new List<long>();

Then initialize them with the following code (I've assumed you've already read in all the coordinates into a string array called v with one coordinate per item):

foreach (string c in v)
{
    long index = GetIndexForCoord(c);
    if(!m_dic.ContainsKey(index))
    {
        m_dic.Add(index, c);
        m_indexes.Add(index);
    }
}
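One possible way to fill `v` before running the loop above is to read the file inside a `using` block, so the reader is closed even if an exception is thrown. This is a sketch only: the class and method names (`CoordinateFile`, `ReadCoordinates`) are hypothetical, and skipping blank lines is an assumption about what the file may contain.

```csharp
using System.Collections.Generic;
using System.IO;

// Hypothetical helper, not part of the original answer.
public static class CoordinateFile
{
    // Reads every non-blank line of the file into a list, one coordinate per item.
    public static List<string> ReadCoordinates(string path)
    {
        var lines = new List<string>();
        using (var reader = new StreamReader(path))   // disposed (and closed) automatically
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                if (line.Trim().Length > 0)           // assumption: blank lines carry no data
                {
                    lines.Add(line);
                }
            }
        }
        return lines;
    }
}
```

With the file name from the question, the call would look like `List<string> v = CoordinateFile.ReadCoordinates("Jumpgate List.JG");`.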

Then add the following 2 methods:

// gets out a numerical value for each coordinate to make it easier to compare
private long GetIndexForCoord(string coord)
{
    Regex m_regex = new Regex("\\d\\d:\\d\\d:\\d\\d:\\d\\d");
    string cleaned = m_regex.Match(coord).Value;
    cleaned = cleaned.Replace(':', '0');
    return Convert.ToInt64(cleaned);
}
// gets out the 5 closest coordinates
private List<string> GetResults(string coord)
{
    long index = GetIndexForCoord(coord);

    // First find the 5 closest indexes to the one we're looking for
    List<long> found = new List<long>();
    while (found.Count < 5)
    {
        long closest = long.MaxValue;
        long closestAbs = long.MaxValue;
        foreach (long i in m_indexes)
        {
            if (!found.Contains(i))
            {
                long absIndex = Math.Abs(index - i);
                if (absIndex < closestAbs)
                {
                    closest = i;
                    closestAbs = absIndex;
                }
            }
        }
        if (closest == long.MaxValue)
        {
            // Fewer than 5 coordinates are stored; stop instead of looping forever
            break;
        }
        found.Add(closest);
    }

    // Then use those indexes to get the coordinates from the dictionary
    List<string> s = new List<string>();
    foreach (long i in found)
    {
        s.Add(m_dic[i]);
    }
    return s;
}

And finally, when the user enters the data, you pass that data to the method like this:

List<string> results  = GetResults(lookingFor);

You can then use the results to populate your listbox.

The code works by converting each coordinate to a numerical value called index (since it's easier to work with), and then adds all the coordinates to a dictionary with the index as the key.
When it looks up the closest coordinates, it compares the difference in value between the index you're looking for and each of the previously stored indexes to find the 5 closest ones (it uses the Math.Abs method so that it gets the difference without having to worry about negative numbers). It's quite inefficient, since it loops through every stored value once for each coordinate you want to find: if your list contains 1000 coordinates and you want the 5 closest, it goes through the inner loop 5000 times. I'd assume that could be cut down to just 1000 times by improving the code; I suggest looking at the wiki link close to the top of this answer for a better algorithm.
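As one possible simplification of the lookup (not the single-pass version mentioned above, and not a k-nearest neighbor implementation), the repeated scans can be replaced by sorting the indexes once by their distance to the target and taking the first few. The names `NearestIndexes` and `FindClosest` are hypothetical. Sorting costs on the order of n log n comparisons, which is usually fine for a list of this size, and it also copes with lists that hold fewer entries than requested.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of the original answer.
public static class NearestIndexes
{
    // Returns up to 'count' indexes, ordered by absolute distance to 'target'.
    public static List<long> FindClosest(IEnumerable<long> indexes, long target, int count)
    {
        return indexes
            .OrderBy(i => Math.Abs(target - i))   // smallest difference first
            .Take(count)                          // returns fewer items if the list is short
            .ToList();
    }
}
```

The returned indexes can then be looked up in `m_dic` exactly as in the last loop of `GetResults`.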

ho1
points for the usefulness. I will see if i can turn this into C# (it looks like VB.NET to me). I'll also take a look at the link you posted. Thanks
Arcadian
@Arcadian: Sorry about that, not sure why I thought you wanted it in VB, should be better now though.
ho1
lol np, i have used VB before, thank you for changing it though.
Arcadian
See my edit. I think that would be what i need to read the test file but i'm not sure. Also, could you please explain the last part of the code to me?
Arcadian
@Arcadian: Your file reading code looks ok, but I'd suggest using a `using` statement for the `StreamReader`; that way you don't have to worry about closing it (even if an exception is thrown). I added some explanations to the end of the answer; probably still not that clear, but hopefully they help a little bit.
ho1
ah yes. sorry i should have explained a little better. I understood roughly how that part of the code worked. The part i did not quite understand is the last method used to populate the listbox.
Arcadian
@Arcadian: Ok, that bit just gets the results out in a list, and then you can just do a `foreach(string c in results)` loop on that list and add them into the listbox `listBox1.Items.Add(c)`. Just remember to call `listBox1.Items.Clear()` to clear out old ones.
ho1
Ok I understand that. What does the GetResults(LookingFor) part mean though? i know that it turns the results into a list but "lookingfor" doesn't exist in this context.
Arcadian
@Arcadian: That's where you can send in the string the user has entered (the one he's looking for :))
ho1
you mean `txtSearch.Text`? Assuming thats right, i get another error on this line: `m_dic.Add(index, c);`. The error msg is: "an item with the same key has already been added".
Arcadian
@Arcadian: Sounds ok with `txtSearch.Text`. Regarding the error, my first guess would be that you have duplicate lines in the file with coordinates? If so, just put an if around it. I'll update my answer to show that.
ho1
Thanks for your help but other parts of the code prevent duplicates being written to (and therefore read from) the text file. Also, i can find no change in your code from what i posted in my edit above. Are you sure you edited your answer?
Arcadian
@Arcadian: Yes, look for the line `if(!m_dic.ContainsKey(index))`. If that's not the issue it might be that one of my assumptions on what your data looks like is wrong.
ho1
Ah yes. thank you. That got it.
Arcadian