Hi all

I am developing a script that takes a CSV file as input, reads it, and inserts its contents into a MySQL database. The problem occurs while inserting the data: umlauts are converted into random characters.

FYI: my database is in latin_german_ci. (I have also tried changing it to UTF8.)

I am able to display umlaut characters in web browsers, but when I try to insert them into the database through an SQL query, random characters are inserted.

<?php

function uploadCsv($filename){

    echo "filename - ".$filename."<br/>";

    if(!isset($filename) || $filename == ""){
        // no file name given: return with an error msg.
        return "no file name given";
    }else{
        $pos = stripos($filename, ".csv");
        if($pos == 0 || $pos != strlen($filename)-4){
            //echo "invalid format";
            return ;
        }
    }

    set_time_limit(0);
    $error = "";
    $row = 0;
    $handle = fopen($filename, "r");
    //echo "<br/>".$filename."<br/>";
    //echo "<br/>".$handle."<br/>";
    if($handle == null){

        return "unable to process";
    }

    while (($data = fgetcsv($handle, 0, ",")) !== FALSE) {

        if ($row == 0) {
            // this is the first line of the csv file
            // it usually contains titles of columns, so validate it and skip it.
            $num = count($data);
            //echo "num - ".$num."<br/>";
            if($num != 2){
                // echo "returning back";
                return "invalid CSV format.";
            }
            $row++;
            continue;
        }
        // this handles the rest of the lines of the csv file
        $num = count($data);
        $id = $data[0];
        $inserQuery = "";
        $inserQuery =  "INSERT INTO `table` (
                            `ID` ,
                            `Productname`
                            )
                            VALUES (";

        for ($c=0; $c < $num; $c++) {
            if($c==0){
                $inserQuery .= " '". utf8_encode($data[$c])."'" ;
            }else{
                $inserQuery .= ", '". utf8_encode($data[$c]) ."'" ;
            }
        }
        $inserQuery .= ");";
        echo $inserQuery."<br/>";
        mysql_query($inserQuery);
        if(mysql_affected_rows() == -1 || mysql_affected_rows() <1){
            echo "error<br/>";
        }else{
            echo "row inserted - ".$row." with ID = ".$id." <br/>";
        }
        $row++;
    }
    fclose($handle);
    return "1";
}
?>

Please help....

thanks

A: 

Same procedure as in these earlier answers. Please see

http://stackoverflow.com/questions/1650591/whether-to-use-set-names/1650834#1650834

and

http://stackoverflow.com/questions/1566602/is-set-character-set-utf8-necessary/1566908#1566908

Essentially the thing is that you have to tell MySQL which character set it should expect from the client (your PHP script).
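
As a minimal sketch of that idea, the connection character set can be declared right after connecting. This uses the mysqli extension rather than the `mysql_*` functions from the question; the helper name `declareClientCharset()` and the `utf8mb4` default are assumptions for illustration (older servers may only know `utf8`, and the old extension offers `mysql_set_charset()` for the same purpose).

```php
<?php
// Hypothetical helper: tells MySQL which character set the client sends.
// Assumes $db is an already opened mysqli connection.
function declareClientCharset(mysqli $db, string $charset = 'utf8mb4'): void
{
    // set_charset() also updates the charset used by real_escape_string().
    if (!$db->set_charset($charset)) {
        throw new RuntimeException('Could not set charset: ' . $db->error);
    }
}

// Usage (host, credentials and database name are placeholders):
// $db = new mysqli('localhost', 'user', 'password', 'shop');
// declareClientCharset($db);
```

Once the connection is declared as UTF-8, MySQL should convert the incoming text to the column's character set itself, so a latin1 table can still store the umlauts.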

Stefan Gehrig
And you should check the encoding of the CSV file and your locale settings as well, because the PHP manual for fgetcsv states: *Note: Locale setting is taken into account by this function. If LANG is e.g. en_US.UTF-8, files in one-byte encoding are read wrong by this function.*
wimvds
How do I change the locale settings of a CSV file on Windows?
mudit
Depends on the program you created the CSV with, e.g. with Excel you're stuck with the default charset (at least as far as I know).
Stefan Gehrig
In general, I find it easier to deal with the CSV file's encoding on the parsing side, which is your PHP script. Find out which encoding the file uses and convert the data to UTF-8 using `iconv()` or `mb_convert_encoding()`.
Stefan Gehrig
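
A minimal sketch of that conversion step, assuming the CSV comes from Excel on Windows and is therefore encoded as Windows-1252. That source encoding and the helper name `toUtf8()` are assumptions, and the mbstring extension must be available.

```php
<?php
// Hypothetical helper: returns $value as UTF-8.
// Assumes any input that is not valid UTF-8 is Windows-1252.
function toUtf8(string $value, string $sourceEncoding = 'Windows-1252'): string
{
    if (mb_check_encoding($value, 'UTF-8')) {
        return $value; // already valid UTF-8, leave untouched
    }
    return mb_convert_encoding($value, 'UTF-8', $sourceEncoding);
}

// Convert every field of one parsed CSV row.
$row = array_map('toUtf8', ["1", "M\xFCller"]); // \xFC is "ü" in Windows-1252
echo bin2hex($row[1]), "\n"; // 4dc3bc6c6c6572 ("ü" is now the UTF-8 bytes c3 bc)
```

In the question's loop this would take the place of the unconditional `utf8_encode()` call, which assumes ISO-8859-1 input and double-encodes fields that are already UTF-8.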
A: 

Why do you use utf8_encode() for a latin database? What's the encoding of your webpage (HTTP Content-Type and HTML meta tags)? What's the encoding of the source code (you can configure this in your editor)? Which program do you use to view the database, and which encoding does it use?

rami