Hi guys! I have a text file where each line is a path of word strings word1/word2/.../wordn and I want to query the file. I need to build a tree which stores the words and each line of the file as a path so that anytime I search for a word, I get the word-node and all the paths this word belongs to. I was wondering if there is a build-in tree/graph related library in java or if there is a suitable particular tree structure I could use for the current problem. Actually, my basic idea is to construct a tree so that I read the file by line and add the nodes and line-path to that tree. Any ideas-suggestions?
I'd investigate storing the file in an XML Document and using XPath to search it. Xerces is a good start. Each part of the file (word1/) would be a node with subsequent words (word2) as a child.
I would build a class that holds a word and the set of lines that contain that word.
When traversing the file's lines, keep a map (java.util.HashMap or java.util.TreeMap, depending on how you need to use it later) with words (Strings) as keys, and the class above as values. For each word on a line, look it up in the dictionary, and add the line to its entry (or add a new entry if it isn't already there).
Looking up which lines the word occur in is a simple map lookup after you have scanned the file.
My first though is similar to Liedman's, but slightly different: Rather than creating a new class for the lines, just use a Set<String>
(HashSet<String>
) or List<String>
(ArrayList<String>
).
What you have is not really a tree at all. I would use a Map<String, List<String>>
to store the list of lines that contains each word. This uses O(n) memory and has fast lookup. Example code:
import java.util.*;
import java.io.*;
public class WordNodes
{
Map<String, List<String>> map = new HashMap<String, List<String>>();
void readInputFile(String filename) throws IOException, FileNotFoundException
{
FileReader fileReader = new FileReader(filename);
BufferedReader bufferedReader = new BufferedReader(fileReader);
try
{
List<String> lines = new ArrayList<String>();
String line = null;
while ((line = bufferedReader.readLine()) != null)
{
for (String word: line.split("/"))
{
List<String> list = map.get(word);
if (list == null)
{
list = new ArrayList<String>();
map.put(word, list);
}
list.add(line);
}
}
} finally {
bufferedReader.close();
}
}
void run() throws IOException, FileNotFoundException
{
readInputFile("file.txt");
InputStreamReader inputStreamReader = new InputStreamReader(System.in);
BufferedReader bufferedReader = new BufferedReader(inputStreamReader);
try
{
while (true)
{
String word = bufferedReader.readLine();
List<String> lines = map.get(word);
if (lines == null)
{
System.out.println("Word not found.");
}
else
{
for (String line: lines)
{
System.out.println(line);
}
}
}
} finally {
bufferedReader.close();
}
}
public static void main(String[] args) throws Exception
{
new WordNodes().run();
}
}