Hi all,

How can I upload a Word document and find duplicate values in it using PHP?

Thanks in advance,

Fero

A: 

The fastest way I can think of is to extract the text from the Word .doc file using Antiword, as demonstrated here: http://davidwalsh.name/read-pdf-doc-file-php. Once you have the file contents, you can check for repeated values in several ways. For example, you could use PHP's strtok function to tokenize the text, index the words in an array, and then look for entries that occur more than once. Once you have the content of the document, finding duplicates shouldn't be a problem.
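As a rough illustration of the strtok idea, here is a minimal sketch that assumes the document text is already available as a plain string. The function name findDuplicateWords, the delimiter list, and the case-insensitive comparison are my own assumptions for the example, not part of Antiword or the linked article.

```php
<?php
// Hypothetical helper: returns every word that occurs more than once in $text,
// mapped to its number of occurrences. Words are compared case-insensitively.
function findDuplicateWords(string $text): array
{
    $counts = [];
    $delimiters = " \t\r\n.,;:!?()\"";

    // strtok() takes the string on the first call and only the delimiters afterwards.
    $token = strtok($text, $delimiters);
    while ($token !== false) {
        $word = strtolower($token);
        $counts[$word] = ($counts[$word] ?? 0) + 1;
        $token = strtok($delimiters);
    }

    // Keep only the words seen at least twice.
    return array_filter($counts, function ($n) {
        return $n > 1;
    });
}

$text = "The quick fox and the lazy dog saw the other fox.";
print_r(findDuplicateWords($text));
```

Counting occurrences in an associative array avoids comparing every word against every other word, which matters once the document gets long.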

asnyder
I'm not on Linux; I need this on Windows.
Fero
A: 
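// Read the whole file; file_get_contents() also copes with an empty file.
$myFile = "testFile.txt";
$theData = file_get_contents($myFile);
if ($theData === false) {
    die("Cannot read $myFile");
}

// Split on commas and trim surrounding whitespace from each value.
$array_val = array_map('trim', explode(',', $theData));
$len = count($array_val);

// Collect every value that occurs again later in the list.
$new_arr = array();
for ($t = 0; $t < $len; $t++) {
    for ($i = $t + 1; $i < $len; $i++) {
        if ($array_val[$t] === $array_val[$i]) {
            $new_arr[] = $array_val[$t];
            break;
        }
    }
}

// A value seen three or more times is collected more than once, so deduplicate.
print_r(array_unique($new_arr));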
Fero
This works with a text file. How can I do the same with a .doc file?
Fero
A: 

Try using COM. There's a class written for it at http://drewd.com/2007/01/25/reading-from-a-word-document-with-com-in-php
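For reference, a minimal sketch of what reading a document through COM might look like. It is not the class from the link above. It assumes Windows, an installed copy of Microsoft Word, and PHP's com_dotnet extension enabled; the function name readWordText is hypothetical.

```php
<?php
// Hypothetical helper: returns the plain text of a Word document via COM automation.
// Assumes Windows, Microsoft Word installed, and the com_dotnet extension enabled.
function readWordText(string $path): string
{
    if (!class_exists('COM')) {
        throw new RuntimeException('The COM extension is not available.');
    }

    // Word expects an absolute path.
    $fullPath = realpath($path);
    if ($fullPath === false) {
        throw new RuntimeException("File not found: $path");
    }

    $word = new COM('Word.Application');
    try {
        $word->Visible = false;
        $doc = $word->Documents->Open($fullPath);
        $text = (string) $doc->Content->Text;
        $doc->Close(false); // close without saving changes
    } finally {
        $word->Quit();      // always shut down the Word process
    }

    return $text;
}
```

The returned string can then be tokenized or split like any other text.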

Jamie