I am using Delphi 7 and the ICS components to communicate with a PHP script that inserts data into a MySQL database.

How do I post Unicode data using HTTP POST?

I encode the text with Utf8Encode (the memo is a TNT control) and then post it to this PHP script:

<?php
echo "Note = " . $_POST['note'];

if ($_POST['action'] == 'i')
{
    /*
     * This code will add new notes to the database
     */
    $sql = "INSERT INTO app_notes VALUES ('', '" . mysql_real_escape_string($_POST['username']) . "', '" . mysql_real_escape_string($_POST['note']) . "', NOW(), '')";
    $result = mysql_query($sql, $link) or die('0 - Ins');
    echo '1 - ' . mysql_insert_id($link);
}
?>

Delphi code:

  data := Format('date=%s&username=%s&password=%s&hash=%s&note=%s&action=%s',
                   [UrlEncode(FormatDateTime('yyyymmddhh:nn',now)),
                    UrlEncode(edtUserName.Text),
                    UrlEncode(getMd51(edtPassword.Text)),
                    UrlEncode(getMd51(dataHash)),UrlEncode(Utf8Encode(memoNote.Text)),'i'
                    ]);

//  try  function StrHtmlEncode (const AStr: String): String; from IdStrings

    HttpCli1.SendStream := TMemoryStream.Create;
    HttpCli1.SendStream.Write(Data[1], Length(Data));
    HttpCli1.SendStream.Seek(0, 0);
    HttpCli1.RcvdStream := TMemoryStream.Create;
    HttpCli1.URL := Trim(ActionURLEdit.Text);
    HttpCli1.PostAsync;

But the Unicode value that arrives is completely different from the original one I see in the TNT memo.

Is there something I am missing?

Also, does anybody know how to do this with Indy?

Thanks.

A: 

I would expect (without knowing for sure) that you'd have to output them as &#nnnnn entities (with the number in decimal rather than hex ... I think)

boost
+2  A: 

Encode the UTF-8 data as application/x-www-form-urlencoded. This ensures that the server can read the data over the HTTP connection.

Martin OConnor
+3  A: 

Your example code shows your data coming from a TNT Unicode control. That value will have type WideString, so to get UTF-8 data, you should call Utf8Encode, which will return an AnsiString value. Then call UrlEncode on that value. Make sure UrlEncode's input type is AnsiString. So, something like this:

var
  data, date, username, passhash, datahash, note: AnsiString;

date := FormatDateTime('yyyymmddhh:nn',now);
username := Utf8Encode(edtUserName.Text);
passhash := getMd51(edtPassword.Text);
datahash := getMd51(data);
note := Utf8Encode(memoNote.Text);
data := Format('date=%s&username=%s&password=%s&hash=%s&note=%s&action=%s',
               [UrlEncode(date),
                UrlEncode(username),
                UrlEncode(passhash),
                UrlEncode(datahash),
                UrlEncode(note),
                'i'
               ]);

There should be no need to UTF-8-encode the MD5 values since MD5 string values are just hexadecimal characters. However, you should double-check that your getMd51 function accepts WideString. Otherwise, you may be losing data before you ever send it anywhere.
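To see what the server should receive if the Delphi side is working, it can help to decode a sample request body by hand. The following is only a sketch: the body string and the username value are made up for illustration, and parse_str is used here because it applies the same percent-decoding that PHP applies when it fills $_POST from a form-encoded body.

```php
<?php
// Hypothetical request body, as the Delphi client might send it once the
// note has been UTF-8-encoded and then URL-encoded.
$body = 'username=irfan&note=%D8%B4&action=i';

// parse_str percent-decodes each field; the result is raw bytes.
parse_str($body, $fields);

// The note field now holds the two UTF-8 bytes of U+0634.
echo bin2hex($fields['note']), "\n";                    // d8b4
var_dump(mb_check_encoding($fields['note'], 'UTF-8'));  // bool(true)
```

If the hex dump of $_POST['note'] on the real server differs from what Utf8Encode produced in Delphi, the data was changed before it left the client.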

Next, you have the issue of receiving UTF-8 data in PHP. I expect there's nothing special you need to do there or in MySQL. Whatever you store, you should get back identically later. Send that back to your Delphi program, and decode the UTF-8 data back into a WideString.

In other words, your Unicode data will look different in your database because you're storing it as UTF-8. In your database, you're seeing UTF-8-encoded data, but in your TNT controls, you're seeing the regular Unicode characters.

So, for instance, if you type the character "ش" into your edit box, that's Unicode character U+0634, Arabic letter sheen. As UTF-8, that's the two-byte sequence 0xD8 0xB4. If you store those bytes in your database, and then view the raw contents of the field, you may see characters interpreted as though those bytes are in some ANSI encoding. One possible interpretation of those bytes is as the two-character sequence "Ø´", which is the Latin capital letter o with stroke followed by an acute accent.
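The byte values in this example can be reproduced with a few lines of PHP. This is a sketch that assumes the mbstring extension is available; the variable names are only illustrative.

```php
<?php
// U+0634 (Arabic letter sheen), written out as its two UTF-8 bytes.
$sheen = "\xD8\xB4";

echo bin2hex($sheen), "\n";    // d8b4
echo urlencode($sheen), "\n";  // %D8%B4

// Simulate a viewer that wrongly treats those bytes as ISO-8859-1:
// each byte becomes its own character, re-encoded here as UTF-8 for display.
$misread = mb_convert_encoding($sheen, 'UTF-8', 'ISO-8859-1');

echo $misread, "\n";           // Ø´
echo bin2hex($misread), "\n";  // c398c2b4
```

The stored bytes are unchanged in both cases; only the interpretation differs.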

When you load that string back out of your database, it's still encoded as UTF-8, just as it was when you stored it, so you will need to decode it. As far as I can tell, neither PHP nor MySQL does any massaging of your data, so whatever UTF-8 character you give them will be returned to you as-is. If you are using the data in Delphi, then call Utf8Decode, which is the complement to the Utf8Encode function that you called previously. If you are using the data in PHP, then you might be interested in PHP's utf8_decode function, although that converts to ISO-8859-1, which doesn't include our example Arabic character. Stack Overflow already has a few questions related to using UTF-8 in PHP, so I won't attempt to add to them here.
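The limitation of converting to ISO-8859-1 can be checked with a round trip. This is a sketch that assumes the mbstring extension; mb_convert_encoding stands in for utf8_decode here, since both target ISO-8859-1.

```php
<?php
// The UTF-8 bytes of U+0634 as they would come back from the database.
$stored = "\xD8\xB4";

// ISO-8859-1 has no Arabic letters, so this conversion cannot keep the character.
$latin1 = mb_convert_encoding($stored, 'ISO-8859-1', 'UTF-8');

// Converting back does not restore the original bytes.
$roundTrip = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');

var_dump($roundTrip === $stored);  // bool(false)
```

For text outside Latin-1, it is usually safer to keep the string as UTF-8 from end to end in PHP rather than decode it.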

Rob Kennedy
I did this, but when I post ش I get Ø´1 in the MySQL database and in the response. My additional question is: how does PHP accept UTF-8 over HTTP?
Irfan Mulic