I am working on a web application for machine translation: it takes source text and translates it into a user-specified language.

It is currently in the unit testing phase.

I want to check whether the machine translation feature works correctly for all special characters. While writing the test cases I got stuck: I need a list of all special characters, grouped by class.

e.g.

1st:

Class name: Punctuation

Characters: !?,"| etc.

Test cases: segment1? segment2! segment3.

2nd:

Class name: HTML entities

Characters: all the characters that belong to this class

Test cases: respective test cases

3rd:

Class name: Extended ASCII

Characters: all the characters that belong to this class

Test cases: respective test cases

If anyone has ideas or links, please share them so that I can make the product as robust as possible.

Thanks a lot

A: 

Hi and welcome to SO!

Your question is a bit vague, but in general, in the Unicode world, characters are classified by "properties" assigned to them. See this PHP manual page for the basic list of properties.
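To make the idea of classification by property concrete, here is a minimal sketch in Python rather than PHP (an assumption on my part, since the question does not say which language the application uses). It relies on the standard `unicodedata` module, which reports each character's Unicode general category as a two-letter code such as `Po` (other punctuation) or `Sm` (math symbol). The helper name `group_by_category` and the sample range (printable ASCII plus Latin-1) are only illustrative.

```python
import unicodedata
from collections import defaultdict


def group_by_category(chars):
    """Group characters by their Unicode general category (e.g. 'Po', 'Sm')."""
    groups = defaultdict(list)
    for ch in chars:
        groups[unicodedata.category(ch)].append(ch)
    return dict(groups)


# Illustrative sample: printable ASCII plus the Latin-1 range that is
# often loosely called "extended ASCII" (U+0020 through U+00FF).
sample = [chr(cp) for cp in range(0x20, 0x100)]
groups = group_by_category(sample)

# Print one line per category so each class can be turned into test input.
for category in sorted(groups):
    print(category, "".join(groups[category]))
```

Each resulting group could then feed test segments like the ones in the question, for example one segment per character in the `Po` group. Note that `|` is reported as `Sm`, not as punctuation, so the Unicode classes will not match an informal "punctuation" list exactly.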

stereofrog