Hi,

While working with an old application whose existing MS Access database contains some strangely encoded data, I found values such as 48001700030E0F465075465A56525E1100121D04121B565A58 stored as an email address.

What kind of encoding is this? I tried Base64, but it doesn't seem to be that. Could anybody with previous MS Access experience tell me what encoding this could be?

edit::

more samples

  1. 54001700030E0F46507546474550481C1D09090D04461B565A195E5F
  2. 40001700030E0F4650755F564E545F06025D100E0C
  3. 38001700030E0F4650754545564654155C101C0C
  4. 46001700030E0F4650755D565150591D1B0007124F565A58

The samples above are definitely email addresses. For a web URL it looks like this:

  1. 440505045D070D54585C5B50585D581C1701004F025A58
  2. 440505045D121147544C5B584D4B5D17015D100E4F5C5B

This is a VB + MS Access program, if that is any help, and I think it is some standard encoding.

edit (2) ::

From looking at the web URL encoding, it seems 0505045D could stand for http://

edit(3) ::

One encoded/decoded pair found:

52021301161209755354595A5E5F561D170B030E1341461B56585A == [email protected]

+1  A: 

It appears to be bytes encoded as hexadecimal. But what those bytes mean, I don't know. Decoding it to ASCII doesn't reveal much:

H  \x00\x17\x00\x03\x0e\x0fFPu  FZVR^  \x11\x00\x12\x1d\x04\x12\x1bVZX
T  \x00\x17\x00\x03\x0e\x0fFPu  FGEPH  \x1c\x1d\t\t\r\x04F\x1bVZ\x19^_
@  \x00\x17\x00\x03\x0e\x0fFPu  _VNT_  \x06\x02]\x10\x0e\x0c
8  \x00\x17\x00\x03\x0e\x0fFPu  EEVFT  \x15\\\x10\x1c\x0c
F  \x00\x17\x00\x03\x0e\x0fFPu  ]VQPY  \x1d\x1b\x00\x07\x12OVZX

Things I've noticed that may help crack the code:

  • The 2nd to 10th bytes appear to be constant: \x00\x17\x00\x03\x0e\x0fFPu.
  • The first byte is a BCD length (spotted by Daniel Brückner!)
  • The 16th byte onwards appears to be some binary format that either encodes the data or is perhaps a pointer to the data.
  • Two of them end in: \x12?VZX.
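
The constant-prefix observation above can be checked mechanically. This is a minimal sketch, assuming the first two digits are a separate length field and the rest is plain hexadecimal; the names `SAMPLES`, `payload_bytes` and `common_prefix` are made up for illustration.

```python
# Hex strings copied from the question; all helper names are illustrative.
SAMPLES = [
    "48001700030E0F465075465A56525E1100121D04121B565A58",
    "54001700030E0F46507546474550481C1D09090D04461B565A195E5F",
    "40001700030E0F4650755F564E545F06025D100E0C",
    "38001700030E0F4650754545564654155C101C0C",
    "46001700030E0F4650755D565150591D1B0007124F565A58",
]


def payload_bytes(sample):
    """Drop the first two digits (assumed length field) and decode the rest as hex."""
    return bytes.fromhex(sample[2:])


def common_prefix(items):
    """Return the longest run of leading bytes shared by every item."""
    prefix = items[0]
    for item in items[1:]:
        n = 0
        while n < min(len(prefix), len(item)) and prefix[n] == item[n]:
            n += 1
        prefix = prefix[:n]
    return prefix


payloads = [payload_bytes(s) for s in SAMPLES]
shared = common_prefix(payloads)

print("shared prefix:", len(shared), "bytes,", shared.hex().upper())
for p in payloads:
    # Whatever follows the shared prefix is the part that varies per address.
    print("variable part:", p[len(shared):].hex().upper())
```

Separating the shared prefix from the variable tail makes it easier to compare the samples side by side.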
Mark Byers
I'm sure it is an email address, but how could I reverse engineer it?
Keyur Shah
@Keyur Shah: Could you post more examples? Say 5 to 10 examples? Do you have any encoded addresses where you happen to also know the decoded version, or can make a logical guess on part or all of the address based on for example the name of the person or the company they work for? Do you have other information that you can post? With enough information it may be possible to reverse engineer the format. For example the second and third bytes might be a length encoding. Although if someone can find the specs that would obviously be the easier approach.
Mark Byers
  1. 54001700030E0F46507546474550481C1D09090D04461B565A195E5F
  2. 40001700030E0F4650755F564E545F06025D100E0C
  3. 38001700030E0F4650754545564654155C101C0C
  4. 46001700030E0F4650755D565150591D1B0007124F565A58

This is a VB + MS Access program, if that is any help, and I think it is some standard encoding.
Keyur Shah
+1  A: 

The strings seem to be hexadecimal representations of some binary data.

The first two digits are the length of the rest of the string, written in decimal rather than hexadecimal, so the string is not hexadecimal throughout.

38 001700030E0F465075 4545 5646 5415 5C10 1C0C 
40 001700030E0F465075 5F56 4E54 5F06 025D 100E 0C 
46 001700030E0F465075 5D56 5150 591D 1B00 0712 4F56 5A58 
48 001700030E0F465075 465A 5652 5E11 0012 1D04 121B 565A 58
54 001700030E0F465075 4647 4550 481C 1D09 090D 0446 1B56 5A19 5E5F 
^  ^
|  |
|  +---- constant part, 9 bytes, maybe mailto: or same domain name of
|        reversed email addresses (com.example@foo)
|
+---- length of the rest in decimal, not hexadecimal
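
The length reading above is easy to test against every sample posted in the question. This is a rough sketch, assuming the first two digits count the hexadecimal digits that follow; `ALL_SAMPLES` and `length_check` are illustrative names, not anything from the original program.

```python
# Every encoded value posted in the question (emails, URLs, the known pair).
ALL_SAMPLES = [
    "48001700030E0F465075465A56525E1100121D04121B565A58",
    "54001700030E0F46507546474550481C1D09090D04461B565A195E5F",
    "40001700030E0F4650755F564E545F06025D100E0C",
    "38001700030E0F4650754545564654155C101C0C",
    "46001700030E0F4650755D565150591D1B0007124F565A58",
    "440505045D070D54585C5B50585D581C1701004F025A58",
    "440505045D121147544C5B584D4B5D17015D100E4F5C5B",
    "52021301161209755354595A5E5F561D170B030E1341461B56585A",
]


def length_check(sample):
    """Return (declared, actual): the prefix read as decimal vs. the digits that follow."""
    declared = int(sample[:2])   # read as decimal, not base 16
    actual = len(sample) - 2     # number of hex digits after the prefix
    return declared, actual


results = [length_check(s) for s in ALL_SAMPLES]
for sample, (declared, actual) in zip(ALL_SAMPLES, results):
    status = "ok" if declared == actual else "MISMATCH"
    print(sample[:2], declared, actual, status)

# Reading the same prefix as hexadecimal would not fit: 0x38 is 56, not 38.
print(int("38", 16))
```

If the prefix really is a digit count, the number of encoded bytes is half of it, which equals the number of characters in the one known plaintext.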

I can see no clear indication of where the at-sign and the dot before the top-level domain are located. This seems to argue against simple mono-alphabetic substitutions like ROT13.

 [email protected]

 Length

    26 characters

 Histogram

 1x

 h   @   f   l   i   n   g   x   t   .   c

 3x o
 2x p  2x a  2x m  2x r  2x e  2x s

 ASCII values in hexadecimal representation

    70 61 72 65 73 68 40 66 61 6C
    6D 69 6E 67 6F 65 78 70 6F 72
    74 73 2E 63 6F 6D

 The plaintext's 26 characters correspond to 52 hexadecimal symbols,
 which matches the length prefix of the encoded string.

 52 02 13 01 16 12 09 75 53 54 59
    5A 5E 5F 56 1D 17 0B 03 0E 13
    41 46 1B 56 58 5A

 Histogram

 1x

 01  02  03  09  0B  0E  12  16  17  1B
 1D  41  46  53  54  58  59  5E  5F  75

 2x 13   2x 56   2x 5A

The histograms don't match, so this rules out a mono-alphabetic substitution, even one followed by a permutation of the string.
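
The histogram argument can be reproduced with a few lines of Python. The sketch below uses the hexadecimal values listed above for the known pair. Its last step, XORing plaintext against ciphertext, is not part of this answer; it is only one possible follow-up test for a position-dependent scheme, and all names in the code are illustrative.

```python
from collections import Counter

# Byte values copied from the listing above (known plaintext and its encoding).
PLAIN = bytes.fromhex(
    "70 61 72 65 73 68 40 66 61 6C"
    "6D 69 6E 67 6F 65 78 70 6F 72"
    "74 73 2E 63 6F 6D"
)
CIPHER = bytes.fromhex(
    "02 13 01 16 12 09 75 53 54 59"
    "5A 5E 5F 56 1D 17 0B 03 0E 13"
    "41 46 1B 56 58 5A"
)


def count_profile(data):
    """Sorted symbol frequencies; unchanged by any 1:1 substitution or permutation."""
    return sorted(Counter(data).values(), reverse=True)


plain_profile = count_profile(PLAIN)
cipher_profile = count_profile(CIPHER)
print("plain :", plain_profile)
print("cipher:", cipher_profile)
print("profiles match:", plain_profile == cipher_profile)

# Assumed follow-up test (not from the answer): XOR each plaintext byte with
# the ciphertext byte at the same position and look for structure.
keystream = bytes(p ^ c for p, c in zip(PLAIN, CIPHER))
print("xor stream:", keystream)
```

On this one pair the profiles differ (the plaintext has a symbol occurring three times, the ciphertext has none), and the XOR stream comes out as printable ASCII that repeats after 14 bytes. That would be consistent with a repeating-key XOR, though a single pair is not enough to confirm it.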

Daniel Brückner
+1 Well spotted. Those bytes appear to be BCD. Probably only the first byte is the length though, looking at his more recent examples.
Mark Byers