Hello, I am trying to use PHP to scrape some Japanese text off a site and insert it into a SQLite or MySQL database.

The first problem I am running into is actually getting the regex to pull the line of text I need from the site:

<h3><b>タイトル</b> : I_want_this_text</h3>

The text I want (I_want_this_text) is in Japanese, of course.

The regex I use is this:

/<b>タイトル<\/b> : (.*?)<\/h3>/
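
A minimal sketch of how this pattern could be applied with preg_match, assuming the PHP file itself is saved as UTF-8 and the downloaded page is also UTF-8. The $html string is a hypothetical stand-in for the fetched page, and the /u modifier is an addition that tells PCRE to treat the pattern and the subject as UTF-8:

```php
<?php
// Assumption: this file is saved as UTF-8 (without BOM), so the Japanese
// literal in the pattern has the same bytes as the text on the page.
// $html is a hypothetical stand-in for the downloaded page source.
$html = '<h3><b>タイトル</b> : I_want_this_text</h3>';

// Same pattern as above, plus the /u modifier (UTF-8 mode).
$pattern = '/<b>タイトル<\/b> : (.*?)<\/h3>/u';

$title = null;
if (preg_match($pattern, $html, $matches) === 1) {
    // $matches[1] holds the text captured by (.*?)
    $title = $matches[1];
}

echo $title, PHP_EOL;
```

If the site serves a different encoding such as Shift_JIS or EUC-JP, the page would presumably need converting to UTF-8 first (for example with mb_convert_encoding from the mbstring extension) before the pattern can match.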

Writing a regex to capture "I_want_this_text" is not difficult, but I am coding this in Notepad++, and when I paste the line into Notepad++ I only get question marks (??) in place of the Japanese characters:

<h3><b>????</b> : ?????????? ?????????????????????????</h3>

First question: How can I get PHP (or Notepad++) to recognize the Japanese text?

Second question: How can I make sure that when I insert this Japanese text into my database it stays in Japanese, so that I can later pull it out and display it correctly?
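
To make the second question concrete, here is a rough sketch of the round trip in question, assuming the pdo_sqlite extension is enabled and the script is saved as UTF-8. The in-memory database, the table name titles, and the sample string are all hypothetical placeholders, not the real schema or data:

```php
<?php
// Assumption: pdo_sqlite is available; 'sqlite::memory:' is a throwaway
// in-memory database used only for this illustration.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Hypothetical table; SQLite stores TEXT as UTF-8 by default.
$db->exec('CREATE TABLE titles (id INTEGER PRIMARY KEY, title TEXT)');

// Placeholder Japanese string standing in for the scraped text.
$title = 'タイトル';

// A prepared statement passes the UTF-8 bytes through unchanged.
$stmt = $db->prepare('INSERT INTO titles (title) VALUES (:title)');
$stmt->execute([':title' => $title]);

// Read the row back to check that the text survived the round trip.
$stored = $db->query('SELECT title FROM titles WHERE id = 1')->fetchColumn();

echo $stored, PHP_EOL;
```

For MySQL, the equivalent would presumably involve a utf8mb4 table plus charset=utf8mb4 in the PDO DSN, and the page that displays the text would also need to declare UTF-8.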

Thanks, Rafael