tags:

views:

76

answers:

1

Hi friends! first excuse me for my bad english! I want to search in pdf document for a word like "Hello" . So I must read each page in pdf by PdfTextExtractor. I did it well. I can read all words in each page separately an save it in string buffer. but when i push this code in For loop ,(for example from page 1 to 7 for search in it) earlier page's words will remain in string buffer.I hop you understand my problem. Tanx all. this is my code :

        PdfReader reader2 = new PdfReader(openFileDialog1.FileName);
        int pagen = reader2.NumberOfPages;
        reader2.Close();
        ITextExtractionStrategy its = new iTextSharp.text.pdf.parser.SimpleTextExtractionStrategy();
        for (int i = 1; i < pagen; i++)
        {
            textBox1.Text = "";
            PdfReader reader = new PdfReader(openFileDialog1.FileName);

            String  s = PdfTextExtractor.GetTextFromPage(reader, i, its);
            //MessageBox.Show(s.Length.ToString());
            //PdfTextArray h = new PdfTextArray(s);

            //
            // s = "";
            s = Encoding.UTF8.GetString(ASCIIEncoding.Convert(Encoding.Default, Encoding.UTF8, Encoding.Default.GetBytes(s)));
            textBox1.Text = s;
            reader.Close();

}

A: 

SimpleTextExtractionStrategy doesn't let you reset it unfortunately, so you must move your "new SimpleTextExtractionStrategy()" inside the loop instead of reusing the same object.

Mark Storer