I am using the MS-DOS prompt to pipe a file into the program. I am trying to write a program that counts how many times each word pair appears in a text file. A word pair consists of two consecutive words (i.e. a word and the word that directly follows it). In the second sentence of this paragraph, the words “counts” and “how” are a word pair.
What I want the program to do is take this input:
abc def abc ghi abc def ghi jkl abc xyz abc abc abc ---
It should produce this output:
abc:
abc, 2
def, 2
ghi, 1
xyz, 1
def:
abc, 1
ghi, 1
ghi:
abc, 1
jkl, 1
jkl:
abc, 1
xyz:
abc, 1
BTW: I am excluding the words "a", "the" and "and", so they should not be part of any word pair.
What is the best way to do this? Please be nice, I am new to Java. This is what I have so far:
    import java.util.HashSet;
    import java.util.Scanner;

    public class Project1
    {
        public static void main(String[] args)
        {
            Scanner sc = new Scanner(System.in);
            String word;
            // Distinct words seen so far (excluded words are left out).
            HashSet<String> uniqueWords = new HashSet<String>();

            System.out.println("project 1\n");

            while (sc.hasNext())
            {
                word = sc.next().toLowerCase();

                // "---" marks the end of the input, so stop before storing it.
                if (word.equals("---"))
                {
                    break;
                }

                // Skip the excluded words "a", "and" and "the".
                if (word.equals("a") || word.equals("and") || word.equals("the"))
                {
                    continue;
                }

                uniqueWords.add(word);
            }

            // So far this only counts distinct words, not word pairs.
            System.out.println("unique words");
            System.out.println(uniqueWords.size());
            System.out.println("\nbye...");
        }
    }
Sorry about the formatting. It's hard to get it right in here...
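For illustration, here is a minimal sketch of one possible way to get from the sample input to the sample output. It is not the only approach and makes a few assumptions: the excluded words are dropped before pairs are formed (so in "cat and dog" the pair would be cat/dog), "---" ends the input, and a nested TreeMap is used so that both levels print in alphabetical order. The class name WordPairCounter and its helper methods are made up for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Scanner;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical class; not part of the Project1 code in the question.
class WordPairCounter {

    // Words to leave out. Assumption: they are removed BEFORE pairs are formed.
    static final Set<String> SKIP = new HashSet<>(Arrays.asList("a", "and", "the"));

    // Lowercases each token, stops at the "---" marker, and drops skipped words.
    static List<String> clean(Iterator<String> tokens) {
        List<String> words = new ArrayList<>();
        while (tokens.hasNext()) {
            String word = tokens.next().toLowerCase();
            if (word.equals("---")) {
                break;
            }
            if (!SKIP.contains(word)) {
                words.add(word);
            }
        }
        return words;
    }

    // Maps each word to a sorted map of (following word -> number of occurrences).
    static TreeMap<String, TreeMap<String, Integer>> countPairs(List<String> words) {
        TreeMap<String, TreeMap<String, Integer>> pairs = new TreeMap<>();
        for (int i = 0; i + 1 < words.size(); i++) {
            pairs.computeIfAbsent(words.get(i), k -> new TreeMap<>())
                 .merge(words.get(i + 1), 1, Integer::sum);
        }
        return pairs;
    }

    // Builds the "first:" / "second, count" layout shown in the question.
    static String format(Map<String, TreeMap<String, Integer>> pairs) {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, TreeMap<String, Integer>> first : pairs.entrySet()) {
            out.append(first.getKey()).append(":\n");
            for (Map.Entry<String, Integer> second : first.getValue().entrySet()) {
                out.append(second.getKey()).append(", ")
                   .append(second.getValue()).append('\n');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Scanner implements Iterator<String>, so it can be passed in directly.
        Scanner sc = new Scanner(System.in);
        System.out.print(format(countPairs(clean(sc))));
    }
}
```

With the file piped in (for example `java WordPairCounter < input.txt`, where input.txt is a placeholder name), the outer TreeMap keeps the first words sorted and each inner TreeMap keeps the following words sorted, which is why no separate sorting step is needed.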