views:

380

answers:

4

Hi guys! I have a text file where each line is a path of word strings word1/word2/.../wordn and I want to query the file. I need to build a tree which stores the words and each line of the file as a path so that anytime I search for a word, I get the word-node and all the paths this word belongs to. I was wondering if there is a build-in tree/graph related library in java or if there is a suitable particular tree structure I could use for the current problem. Actually, my basic idea is to construct a tree so that I read the file by line and add the nodes and line-path to that tree. Any ideas-suggestions?

+1  A: 

I'd investigate storing the file in an XML Document and using XPath to search it. Xerces is a good start. Each part of the file (word1/) would be a node with subsequent words (word2) as a child.

Greg Smith
+1  A: 

I would build a class that holds a word and the set of lines that contain that word.

When traversing the file's lines, keep a map (java.util.HashMap or java.util.TreeMap, depending on how you need to use it later) with words (Strings) as keys, and the class above as values. For each word on a line, look it up in the dictionary, and add the line to its entry (or add a new entry if it isn't already there).

Looking up which lines the word occur in is a simple map lookup after you have scanned the file.

Liedman
A: 

My first though is similar to Liedman's, but slightly different: Rather than creating a new class for the lines, just use a Set<String> (HashSet<String>) or List<String> (ArrayList<String>).

R. Bemrose
+1  A: 

What you have is not really a tree at all. I would use a Map<String, List<String>> to store the list of lines that contains each word. This uses O(n) memory and has fast lookup. Example code:

import java.util.*;
import java.io.*;

public class WordNodes
{
    Map<String, List<String>> map = new HashMap<String, List<String>>();

    void readInputFile(String filename) throws IOException, FileNotFoundException
    {
        FileReader fileReader = new FileReader(filename);
        BufferedReader bufferedReader = new BufferedReader(fileReader);
        try
        {
            List<String> lines = new ArrayList<String>();
            String line = null;
            while ((line = bufferedReader.readLine()) != null)
            {
                for (String word: line.split("/"))
                {
                    List<String> list = map.get(word);
                    if (list == null)
                    {
                        list = new ArrayList<String>();
                        map.put(word, list);
                    }
                    list.add(line);
                }
            }
        } finally {
            bufferedReader.close();
        }
    }

    void run() throws IOException, FileNotFoundException
    {
        readInputFile("file.txt");
        InputStreamReader inputStreamReader = new InputStreamReader(System.in);
        BufferedReader bufferedReader = new BufferedReader(inputStreamReader);

        try
        {
            while (true)
            {
                String word = bufferedReader.readLine();
                List<String> lines = map.get(word);
                if (lines == null)
                {
                    System.out.println("Word not found.");
                }
                else
                {
                    for (String line: lines)
                    {
                        System.out.println(line);
                    }
                }
            }
        } finally {
            bufferedReader.close();
        }
    }

    public static void main(String[] args) throws Exception
    {
        new WordNodes().run();
    }
}
Mark Byers