I have to parse something like the following: "some text <40 byte hash>". Can I read this whole thing into a string without corrupting the 40-byte hash part?

The thing is, the hash is not going to be there, so I don't want to process it while reading.

EDIT: I forgot to mention that the 40-byte hash is two 20-byte hashes with no encoding, just raw bytes.

A: 

SHA-1 hashes are 20 bytes (160 bits) in length. If you are dealing with 40 character hashes, then they are probably an ASCII representation of the hash, and therefore only contain the characters 0-9 and a-f. If this is the case, then you should be able to read and manipulate the strings in Java without any trouble.

Greg Hewgill
Edited to mention that the 40-byte hash is two 20-byte SHA-1 hashes with no encoding.
Hamza Yerlikaya
A: 

Some more details could be useful, but I think the answer is that you should be okay.

You didn't say how the SHA-1 hash was encoded (common possibilities include "none" (the raw bytes), Base64 and hex). Since SHA-1 produces a 20 byte (160 bit) hash, I am guessing that it will be encoded using hex, since that doubles the space needed to the 40 bytes you mentioned. With that encoding, 2 characters will be used to encode each byte from the hash, using the symbols 0 through 9 and A through F. Those are all ASCII characters so you are safe.

Base64 encoding would also work, since every character used in Base64 is also ASCII. It is probably not what you asked about, though: it increases the size by only about 1/3, so a 20-byte hash becomes 28 characters, well short of 40.
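To make the size arithmetic above concrete, here is a minimal sketch that computes the encoded length of a 20-byte value under both schemes. The class and method names are made up for illustration, and it assumes Java 8 or later for `java.util.Base64`.

```java
import java.util.Base64;

// Hypothetical helper, only for checking the sizes discussed above.
class EncodedSizes {
    /** Hex uses two characters per byte. */
    static int hexLength(byte[] raw) {
        return raw.length * 2;
    }

    /** Standard Base64 (with padding) uses four characters per group of three bytes. */
    static int base64Length(byte[] raw) {
        return Base64.getEncoder().encodeToString(raw).length();
    }

    public static void main(String[] args) {
        byte[] hash = new byte[20]; // stand-in for a 20-byte SHA-1 digest
        System.out.println("hex:    " + hexLength(hash));
        System.out.println("base64: " + base64Length(hash));
    }
}
```

A 40-character field therefore matches one hex-encoded SHA-1 hash, while Base64 would give a shorter field.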

If the raw bytes were used directly, you would have a problem, as some of the values are not valid characters.
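A small sketch of what goes wrong with raw bytes: decoding arbitrary bytes as UTF-8 replaces invalid sequences with U+FFFD, so the original bytes cannot be recovered. ISO-8859-1 happens to map every byte value to exactly one char, so that particular round trip is lossless, but it relies on every later step using the same charset. The class name and sample bytes are assumptions chosen for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; not part of any answer in this thread.
class RawBytesRoundTrip {
    /** Decode as UTF-8 and encode back; malformed input is replaced, so bytes may change. */
    static byte[] roundTripUtf8(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8).getBytes(StandardCharsets.UTF_8);
    }

    /** Decode as ISO-8859-1 and encode back; each byte 0x00-0xFF maps to one char. */
    static byte[] roundTripLatin1(byte[] raw) {
        return new String(raw, StandardCharsets.ISO_8859_1).getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Sample bytes that are not valid UTF-8 (they are the first four bytes of SHA-1 of "").
        byte[] raw = {(byte) 0xDA, 0x39, (byte) 0xA3, (byte) 0xEE};
        System.out.println("UTF-8 intact:   " + Arrays.equals(raw, roundTripUtf8(raw)));
        System.out.println("Latin-1 intact: " + Arrays.equals(raw, roundTripLatin1(raw)));
    }
}
```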

Adam Batkin
A: 

OK, now that you've clarified that these are raw bytes:

No, you cannot safely read this into Java as a String; you will need to read it as raw bytes.
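As a rough sketch of reading raw bytes, the helper below drains an `InputStream` into a `byte[]` without any character decoding. The class and method names are hypothetical. On Java 9 and later, `InputStream.readAllBytes()` does the same job.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper; the source of the InputStream (file, socket, ...) is up to the caller.
class RawByteReader {
    /** Reads the stream to its end and returns exactly the bytes that were read. */
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        int n;
        // read() returns -1 at end of stream
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }
}
```

Avoid wrapping the stream in a `Reader` (for example `InputStreamReader`) for this data, because a `Reader` decodes bytes into characters as it reads.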

tialaramex
A: 

WORKING CODE: Converts a byte array into hex characters, which should be safe in almost all string encodings. Use the code I posted in your other question to decode the hex chars back to raw bytes.

    /** Lookup table: hex character for each half-byte (nibble) value 0-15 */
    static final char[] CHAR_FOR_BYTE = {'0','1','2','3','4','5','6','7','8','9','A','B','C','D','E','F'};

    /** Encode byte data as a hex string... hex chars are UPPERCASE. Returns null for null or empty input. */
    public static String encode(byte[] data) {
        if (data == null || data.length == 0) {
            return null;
        }
        char[] store = new char[data.length * 2];
        for (int i = 0; i < data.length; i++) {
            final int val = (data[i] & 0xFF);   // treat the byte as unsigned (0-255)
            final int charLoc = i << 1;         // each byte yields two output chars
            store[charLoc] = CHAR_FOR_BYTE[val >>> 4];       // high nibble
            store[charLoc + 1] = CHAR_FOR_BYTE[val & 0x0F];  // low nibble
        }
        return new String(store);
    }
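The decoder mentioned above lives in the other question and is not reproduced here. For completeness, a minimal sketch of the reverse direction might look like the following. It is not the author's code, and the class and method names are hypothetical.

```java
// Hypothetical counterpart to encode(); accepts upper- or lowercase hex digits.
class HexDecodeSketch {
    /** Turns a hex string such as "DA39" back into raw bytes. */
    static byte[] decode(String hex) {
        if (hex == null || hex.length() % 2 != 0) {
            throw new IllegalArgumentException("hex string must be non-null with even length");
        }
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            // Character.digit returns -1 for characters that are not digits in radix 16
            int hi = Character.digit(hex.charAt(2 * i), 16);
            int lo = Character.digit(hex.charAt(2 * i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("invalid hex digit in pair " + i);
            }
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }
}
```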
BobMcGee
+1  A: 

Read it from your input stream as raw bytes, and then split the text off the front like this (this uses the platform default charset; pass an explicit Charset if you know the text encoding):

String s = new String(Arrays.copyOfRange(bytes, 0, bytes.length - 40));

Then take the last 40 bytes as the hash:

byte[] hash = Arrays.copyOfRange(bytes, bytes.length - 40, bytes.length);
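Putting the split together, a sketch along these lines separates the text from the two 20-byte hashes described in the question's edit. The class name, the constants, and the choice to pass the charset explicitly are assumptions; the thread does not say how the text part is encoded.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper: record layout assumed to be <text bytes><hash 1: 20 bytes><hash 2: 20 bytes>.
class TrailingHashSplit {
    static final int HASH_LEN = 20;
    static final int TRAILER_LEN = 2 * HASH_LEN;

    private static void check(byte[] record) {
        if (record.length < TRAILER_LEN) {
            throw new IllegalArgumentException("record shorter than " + TRAILER_LEN + " bytes");
        }
    }

    /** Decodes everything before the trailing 40 bytes using the given charset. */
    static String text(byte[] record, Charset cs) {
        check(record);
        return new String(record, 0, record.length - TRAILER_LEN, cs);
    }

    /** First 20-byte hash, copied out untouched. */
    static byte[] firstHash(byte[] record) {
        check(record);
        int start = record.length - TRAILER_LEN;
        return Arrays.copyOfRange(record, start, start + HASH_LEN);
    }

    /** Second 20-byte hash: the final 20 bytes of the record. */
    static byte[] secondHash(byte[] record) {
        check(record);
        return Arrays.copyOfRange(record, record.length - HASH_LEN, record.length);
    }

    public static void main(String[] args) {
        byte[] text = "some text ".getBytes(StandardCharsets.US_ASCII);
        byte[] record = Arrays.copyOf(text, text.length + TRAILER_LEN); // hashes left as zeros here
        System.out.println(text(record, StandardCharsets.US_ASCII));
        System.out.println(firstHash(record).length + " + " + secondHash(record).length);
    }
}
```

Only the text portion is ever decoded into characters, so the hash bytes never pass through a charset.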
hasalottajava