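import java.nio.charset.StandardCharsets;
import java.util.Arrays;

byte[] a = {1,2,3,0,1,2,3,0,0,0,0,4};
// Decode as latin1 so each byte becomes exactly one char with the same value.
String s0 = new String(a, StandardCharsets.ISO_8859_1);
// Remove every run of four or more zero bytes.
String s1 = s0.replaceAll("\\x00{4,}", "");
byte[] r = s1.getBytes(StandardCharsets.ISO_8859_1);
System.out.println(Arrays.toString(r)); // [1, 2, 3, 0, 1, 2, 3, 4]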
I used ISO-8859-1 (latin1) because, unlike any other encoding, every byte in the range 0x00..0xFF maps to a valid character, and each of those characters has the same numeric value as its latin1 encoding. That means the string is the same length as the original byte array, you can match any byte by its numeric value with the \xHH construct (for example, \xFF), and you can convert the resulting string back to a byte array without losing information.
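To make the round-trip claim concrete, here is a minimal sketch that pushes all 256 byte values through latin1 and back, then does the same with UTF-8 for comparison. The class and method names (Latin1RoundTrip, allBytes, roundTrips) are just names I made up for the illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class Latin1RoundTrip {
    // Builds an array holding every possible byte value, 0x00 through 0xFF.
    static byte[] allBytes() {
        byte[] all = new byte[256];
        for (int i = 0; i < 256; i++) {
            all[i] = (byte) i;
        }
        return all;
    }

    // True if latin1 decoding yields one char per byte with the same numeric
    // value, and re-encoding reproduces the input exactly.
    static boolean roundTrips(byte[] data) {
        String s = new String(data, StandardCharsets.ISO_8859_1);
        if (s.length() != data.length) {
            return false;
        }
        for (int i = 0; i < data.length; i++) {
            if (s.charAt(i) != (data[i] & 0xFF)) {
                return false;
            }
        }
        return Arrays.equals(data, s.getBytes(StandardCharsets.ISO_8859_1));
    }

    public static void main(String[] args) {
        byte[] all = allBytes();
        System.out.println(roundTrips(all)); // true

        // UTF-8 for comparison: bytes that do not form valid sequences are
        // replaced during decoding, so the original bytes cannot be recovered.
        String utf8 = new String(all, StandardCharsets.UTF_8);
        System.out.println(Arrays.equals(all, utf8.getBytes(StandardCharsets.UTF_8))); // false
    }
}
```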
I wouldn't try to display the data while it's in string form: although all the characters are valid, many of them are not printable. Also, avoid manipulating the data while it's in string form; you might accidentally do some escape-sequence substitutions or another encoding conversion without realizing it. In fact, I wouldn't recommend doing this kind of thing at all, but that isn't what you asked. :)
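If you would rather stay out of string land altogether, a plain loop over the bytes does the same job. This is a different technique from the regex above, not a variation of it, and only a rough sketch: ZeroRunStripper and stripZeroRuns are hypothetical names, and it handles zero bytes only, not arbitrary patterns.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

public class ZeroRunStripper {
    // Removes every run of at least minRun consecutive zero bytes.
    // Shorter runs of zeros are copied through unchanged.
    static byte[] stripZeroRuns(byte[] data, int minRun) {
        ByteArrayOutputStream out = new ByteArrayOutputStream(data.length);
        int i = 0;
        while (i < data.length) {
            if (data[i] != 0) {
                out.write(data[i]);
                i++;
                continue;
            }
            // Measure the run of zeros that starts at position i.
            int start = i;
            while (i < data.length && data[i] == 0) {
                i++;
            }
            int runLength = i - start;
            if (runLength < minRun) {
                out.write(data, start, runLength);
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] a = {1,2,3,0,1,2,3,0,0,0,0,4};
        System.out.println(Arrays.toString(stripZeroRuns(a, 4))); // [1, 2, 3, 0, 1, 2, 3, 4]
    }
}
```

With minRun set to 4 this should give the same result as the \x00{4,} regex on the sample array, and there is no encoding step to get wrong.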
Also, be aware that this technique won't necessarily work in other programming languages or regex flavors. You would have to test each one individually.