Hello experts, I wonder whether the Intel x86 machine code / assembly code conversion is one-way or two-way?

That is: are both assembly code ---> machine code and machine code ---> assembly code possible?

Since x86 machine code varies in size (1-15 bytes) and the opcode varies too (1-3 bytes), how do I determine whether an opcode is 1, 2 or 3 bytes?

Also, I have never found an example of an x86 instruction prefix. If there is a 1-byte prefix, how do I determine whether it is a prefix or an opcode?

Certainly, for assembly code ---> machine code, the mnemonic plus the operands [w/b] determine the corresponding machine code through a lookup in a mapping table.

But when the process is reversed:

{
  bbbbbbbb,bbbbbbbb,bbbbbbbb, //instruction1
  bbbbbbbb,bbbbbbbb,bbbbbbbb,bbbbbbbb,bbbbbbbb,bbbbbbbb, //instruction2
  bbbbbbbb,bbbbbbbb //instruction3
}

----> {bbbbbbbb,bbbbbbbb,bbbbbbbb,bbbbbbbb,bbbbbbbb,bbbbbbbb,bbbbbbbb,bbbbbbbb,bbbbbbbb,bbbbbbbb,bbbbbbbb}

I don't know which bits or bytes are the significant ones that determine how long an instruction is.

Could anyone tell me how to determine that (the size of the opcode, and a prefix example)? Thanks for the help.

+3  A: 

Not sure what you want to accomplish, but since the instructions have variable length, the only way to be sure you get correctly disassembled code is to start from a known start address. Disassemblers usually start from the program's entry point and then recursively disassemble all the functions called from there.

However, this leaves some code chunks undisassembled, for example when they are only reached through a function table, so it usually takes a human to decide whether the remaining sections are code or data.
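To illustrate why the start address matters, here is a minimal sketch of a linear sweep over a handful of one-byte opcodes. The `TOY_LENGTHS` table and the `linear_sweep` name are made up for this example, the table covers only four opcodes, and 32-bit mode is assumed; a real disassembler needs the full opcode maps.

```python
# Toy table: one-byte opcode -> (mnemonic, total instruction length in bytes).
# Only a tiny subset of x86, assuming 32-bit mode and no prefixes.
TOY_LENGTHS = {
    0x90: ("nop", 1),
    0xC3: ("ret", 1),
    0xB8: ("mov eax, imm32", 5),  # opcode byte + 4 immediate bytes
    0xEB: ("jmp rel8", 2),        # opcode byte + 1 displacement byte
}


def linear_sweep(code: bytes, start: int):
    """Decode instructions one after another, beginning at offset `start`."""
    decoded = []
    pc = start
    while pc < len(code):
        # Raises KeyError for opcodes outside the toy subset.
        name, length = TOY_LENGTHS[code[pc]]
        decoded.append((pc, name))
        pc += length
    return decoded


code = bytes.fromhex("B890909090C3")
print(linear_sweep(code, 0))  # starts on the real instruction boundary
print(linear_sweep(code, 1))  # starts one byte too late
```

The same six bytes decode as `mov eax, imm32` followed by `ret` when started at offset 0, but as four `nop`s followed by `ret` when started at offset 1. Both results are valid x86, which is why the decoder cannot recover the boundaries without knowing where to begin.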

rslite
... does this depend on ELF or another object format?
Johnny
Yes, disassemble ---> I forgot to use that word. I think that if I can disassemble, then I can assemble.
Johnny
This can be applied to any object format, but of course the processing needs to take into account the actual format of the file.
rslite
+1  A: 

Since x86 machine code varies in size (1-15 bytes) and the opcode varies too (1-3 bytes), how do I determine whether an opcode is 1, 2 or 3 bytes?

The size of the instruction is implicitly defined by the instruction and its addressing mode; you have to go through the encoding one byte at a time and check against the ISA what can and should follow each byte.

Also, I have never found an example of an x86 instruction prefix. If there is a 1-byte prefix, how do I determine whether it is a prefix or an opcode?
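As a rough sketch of that byte-by-byte check, the opcode length can be read off the escape bytes: a first opcode byte other than 0Fh is a one-byte opcode, 0Fh starts a two-byte opcode, and 0F 38h or 0F 3Ah starts a three-byte opcode. The function name below is hypothetical, and the sketch assumes 32-bit mode with legacy prefixes only (no REX, VEX or EVEX).

```python
# Legacy prefix bytes; none of these is an opcode in the one-byte opcode map.
LEGACY_PREFIXES = {
    0xF0, 0xF2, 0xF3,                    # LOCK, REPNE/REPNZ, REP/REPE
    0x2E, 0x36, 0x3E, 0x26, 0x64, 0x65,  # segment overrides CS, SS, DS, ES, FS, GS
    0x66, 0x67,                          # operand-size and address-size overrides
}


def split_prefixes_and_opcode(code: bytes):
    """Return (prefix bytes, opcode bytes) for the instruction at the start of `code`."""
    i = 0
    while i < len(code) and code[i] in LEGACY_PREFIXES:
        i += 1
    prefixes = code[:i]

    if i >= len(code):
        raise ValueError("no opcode byte after the prefixes")
    if code[i] != 0x0F:
        size = 1                         # one-byte opcode map
    elif i + 1 < len(code) and code[i + 1] in (0x38, 0x3A):
        size = 3                         # three-byte maps: 0F 38 xx and 0F 3A xx
    else:
        size = 2                         # two-byte map: 0F xx
    opcode = code[i:i + size]
    if len(opcode) != size:
        raise ValueError("truncated opcode")
    return prefixes, opcode


print(split_prefixes_and_opcode(bytes.fromhex("90")))          # nop
print(split_prefixes_and_opcode(bytes.fromhex("0FA2")))        # cpuid
print(split_prefixes_and_opcode(bytes.fromhex("660F3800C1")))  # pshufb xmm0, xmm1
```

This only finds the opcode. Whether a ModR/M byte, SIB byte, displacement or immediate follows still has to be looked up per opcode in the tables.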

For example, the operand size override prefix (66h) is always a prefix.
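A small worked example of a prefix followed by an opcode and a ModR/M byte: in 32-bit mode, `89 D8` is `mov eax, ebx`, and `66 89 D8` is `mov ax, bx`. The sketch below handles only that one opcode in its register-to-register form; the function name and the register tables are illustrative, not taken from any real disassembler.

```python
REG16 = ["ax", "cx", "dx", "bx", "sp", "bp", "si", "di"]
REG32 = ["e" + r for r in REG16]


def decode_mov_rm_r(code: bytes) -> str:
    """Decode only 'MOV r/m, r' (opcode 89h) with register operands, 32-bit mode."""
    i = 0
    names = REG32              # default operand size in 32-bit mode
    if code[i] == 0x66:        # 66h before the opcode: operand-size override prefix
        names = REG16
        i += 1
    if code[i] != 0x89:
        raise ValueError("only opcode 89h is handled in this sketch")
    modrm = code[i + 1]
    # ModR/M layout: mod (bits 7-6), reg (bits 5-3), r/m (bits 2-0)
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 0b11:
        raise ValueError("only register-direct operands (mod = 11b) are handled")
    # For opcode 89h, r/m is the destination and reg is the source.
    return f"mov {names[rm]}, {names[reg]}"


print(decode_mov_rm_r(bytes.fromhex("89D8")))    # no prefix
print(decode_mov_rm_r(bytes.fromhex("6689D8")))  # with the 66h prefix
```

Position is what decides the meaning of a byte. In `89 66 10` (`mov [esi+0x10], esp`) the 66h byte sits after the opcode and is a ModR/M byte, and in `B8 66 00 00 00` (`mov eax, 0x66`) it is part of the immediate. The sketch rejects the first case because it only handles mod = 11b.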

Jens Björnhager
Thanks, could you write a full prefix + opcode + mod + r/m + operands example?
Johnny
How about the opcode size (not the instruction size)?
Johnny
The Intel Instruction Set manual has extensive tables of the encoding of opcodes and the mod-reg-r/m byte. Check Chapter 2 of this document, for example: http://www.intel.com/design/intarch/manuals/243191.htm
Jens Björnhager
But 66h can also be a ModR/M byte or an immediate byte.
Nathan Fellman
+2  A: 

The details you need are in the Intel® 64 and IA-32 Architectures Software Developer's Manual, Volume 2B: Instruction Set Reference, N-Z. Look at Appendix A; it includes everything you need.

torak
Thanks, I will look it up.
Johnny
Thanks, I have read: Table A-2. One-byte Opcode Map (00H - F7H), where F0 is LOCK (Prefix); Table A-3. Two-byte Opcode Map: 00H - 77H, 08H - 7FH, 80H - F7H and 88H - FFH (First Byte is 0FH).
Johnny
Table A-4. Three-byte Opcode Map: 00H - F7H and 08H - FFH (First Two Bytes are 0F 38H); Table A-5. Three-byte Opcode Map: 00H - F7H and 08H - FFH (First Two Bytes are 0F 3AH). It is very useful, thanks.
Johnny