I want to make an array of Unicode characters, but I don't know how to convert integers into a Unicode representation. Here's the code I have so far:

NSMutableArray *uniArray = [[NSMutableArray alloc] initWithCapacity:0];
int i;

for (i = 32; i < 300; i++) {
 NSString *uniString = [NSString stringWithFormat:@"\u%04X", i];
 [uniArray addObject:uniString];
}

This gives me the error "incomplete universal character name \u".

Is there a better way to build an array of Unicode symbols? Thanks.

+1  A: 

You should use %C to insert a Unicode character:

NSMutableArray *uniArray = [[NSMutableArray alloc] initWithCapacity:0];
int i;

for (i = 32; i < 300; i++) {
   NSString *uniString = [NSString stringWithFormat:@"%C", i];
   [uniArray addObject:uniString];
}

Another (better?) way is using stringWithCharacters:

NSMutableArray *uniArray = [[NSMutableArray alloc] initWithCapacity:0];
int i;

for (i = 32; i < 300; i++) {
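   // Copy into a real unichar first: casting &i to (unichar *) reads the wrong bytes on big-endian machines.
   unichar c = (unichar)i;
   NSString *uniString = [NSString stringWithCharacters:&c length:1];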
   [uniArray addObject:uniString];
}
Philippe Leybaert
Great, thanks for these. I used the first one, it was just what I needed.
nevan
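One thing to watch: in `printf`, `%C` takes a wide character (`wchar_t`, currently 32-bit on Mac OS X), but in NSString format strings `%C` is documented as taking a `unichar`, a 16-bit UTF-16 code unit. `stringWithCharacters:` takes UTF-16 as well, so neither approach accepts a raw UTF-32 value above U+FFFF.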
Peter Hosey
A: 

If you want a single UTF-16 character, use [NSString stringWithCharacters:&character length:1]. If it’s UTF-32, you’d have to convert to surrogate pairs, or use -initWithData:encoding:, or try what Philippe said (I’m not sure offhand whether that handles UTF-32 properly, but it should).
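
To make the surrogate-pair step concrete, here is a minimal sketch in plain C (which also compiles inside an Objective-C file). The helper name `utf16_encode` is hypothetical and not part of Foundation; it only illustrates the standard UTF-16 encoding arithmetic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a Foundation API): encode one Unicode code
   point as UTF-16. Writes 1 or 2 code units to out and returns how many
   were written, or 0 if cp is not a valid Unicode scalar value. */
static size_t utf16_encode(uint32_t cp, uint16_t out[2]) {
    assert(out != NULL);
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
        return 0;                                /* out of range, or a lone surrogate */
    }
    if (cp < 0x10000) {
        out[0] = (uint16_t)cp;                   /* BMP: a single code unit */
        return 1;
    }
    cp -= 0x10000;                               /* 20 bits remain */
    out[0] = (uint16_t)(0xD800 + (cp >> 10));    /* high surrogate: top 10 bits */
    out[1] = (uint16_t)(0xDC00 + (cp & 0x3FF));  /* low surrogate: bottom 10 bits */
    return 2;
}
```

Assuming `unichar` is a 16-bit unsigned type (as it is on Apple platforms), the buffer and the returned count could then be handed to stringWithCharacters:length:.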

Ahruman
A: 

The reason for the error is that \u must be followed by four hexadecimal digits at compile time. You've followed it with “%04X”, apparently intending to insert those four hexadecimal digits at run time, which is too late: the compiler finished its work long before then, and the compiler is what is giving you this error.
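
A small sketch in plain C of the compile-time versus run-time distinction. The names `kCompileTime` and `format_escape_text` are made up for illustration, and the two-byte length assumes the compiler uses a UTF-8 execution character set (the default for gcc and clang).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Resolved by the compiler: the literal already holds the encoded
   character U+00E9 (two bytes, assuming a UTF-8 execution charset). */
static const char *const kCompileTime = "\u00E9";

/* Hypothetical run-time counterpart: with the backslash escaped, the
   format string compiles, but the result is just the six characters
   of text "\u00E9", not the character itself. */
static int format_escape_text(unsigned cp, char *buf, size_t size) {
    assert(buf != NULL);
    return snprintf(buf, size, "\\u%04X", cp);
}
```

Escaping the backslash silences the compiler error but does not give the intended result, which is why building the string from a character value (as in the first answer) is the way to go.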

Peter Hosey
A: 

Yet Another Egregious Example of Regex Usage:

Requires RegexKitLite. Uses the regex (?s). to split a string of Unicode characters into an NSArray. By default the . regex operator matches everything except new-line characters; the sequence (?s) turns on the Dot All regex option, which allows . to match new-line characters as well. That matters here, since the example below obviously passes over at least \n.

#import <Foundation/Foundation.h>
#import "RegexKitLite.h"

// Compile with: gcc -std=gnu99 -o unicodeArray unicodeArray.m RegexKitLite.m -framework Foundation -licucore

int main(int argc, char *argv[]) {
  NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];

  unichar uc[1024];
  for(NSUInteger idx = 0UL; idx < 1024UL; idx++) { uc[idx] = (unichar)idx; }
  NSArray *unicharArray = [[NSString stringWithCharacters:uc length:1024UL] componentsMatchedByRegex:@"(?s)."];

  NSLog(@"array: %@", [unicharArray subarrayWithRange:NSMakeRange(32UL, (1024UL - 32UL))]);

  [pool release];
  return(0);
}
johne