Given a vector of bytes whose length is a multiple of 8, how can I use MMX instructions to convert, for example, all 2's to 5's?

.data
v1 BYTE 1, 2, 3, 4, 1, 2, 3, 4

Thanks.

edit: 2's and 5's are just an example. They are actually parameters of a procedure.

+3  A: 

I'm sure there are several ways to do this. For instance, the following should work:

1) Make (or load) a mask of 5's and a mask of 2's in two MMX registers (mm0-mm7).

2) Load the data into another MMX register, e.g. using MOVQ.

3) Compare the data register with the mask of 2's, e.g. using PCMPEQB. This produces a mask of FFh and 00h bytes according to whether each element was 2 or not.

4) Use MASKMOVQ with the register of 5's and the mask generated by the compare to selectively write 5's to the positions that previously held 2's. MASKMOVQ stores only the bytes whose mask byte has its most significant bit set, which here means the FFh positions. The destination address is taken implicitly from (E)DI, so point it at the current 8-byte block.

5) Repeat for each 8-byte block until finished.

6) At the end, issue EMMS to exit MMX state. Also issue an SFENCE or MFENCE at the end of the routine, because MASKMOVQ stores with a non-temporal hint.
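The steps above can be sketched with C compiler intrinsics, each of which corresponds to one of the instructions named (the comments say which). Treat this as a sketch under a few assumptions rather than a finished routine: `replace_bytes_maskmov` is a made-up name, the length is assumed to be a multiple of 8 as in the question, and an x86 compiler that provides `<xmmintrin.h>` is assumed, since MASKMOVQ arrived with SSE even though it operates on MMX registers.

```c
#include <stddef.h>
#include <string.h>
#include <xmmintrin.h>  /* includes <mmintrin.h>; needed for _mm_maskmove_si64 and _mm_sfence */

/* Hypothetical helper: replace every byte equal to `from` with `to`.
   Assumes len is a multiple of 8, as stated in the question. */
void replace_bytes_maskmov(unsigned char *buf, size_t len,
                           unsigned char from, unsigned char to)
{
    /* Step 1: registers filled with the two parameter bytes.
       The compiler decides how to build these broadcasts. */
    const __m64 vfrom = _mm_set1_pi8((char)from);
    const __m64 vto   = _mm_set1_pi8((char)to);

    for (size_t i = 0; i < len; i += 8) {
        __m64 data;
        memcpy(&data, buf + i, sizeof data);      /* step 2: 8-byte load (MOVQ in assembly) */

        __m64 hit = _mm_cmpeq_pi8(data, vfrom);   /* step 3: PCMPEQB, FFh where byte == from */

        /* Step 4: MASKMOVQ, writes bytes of vto only where the mask byte is FFh. */
        _mm_maskmove_si64(vto, hit, (char *)(buf + i));
    }

    _mm_sfence();  /* step 6: order the non-temporal stores */
    _mm_empty();   /* step 6: EMMS, leave MMX state */
}
```

Called as `replace_bytes_maskmov(v1, 8, 2, 5)` on the `v1` from the question, this should leave `1, 5, 3, 4, 1, 5, 3, 4` in the array.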

If you use MMX rather than XMM, you won't have to worry about alignment.

Edit: If you are having trouble with the details of the instructions, the Intel® 64 and IA-32 Architectures Software Developer's Manual, Instruction Set Reference (Volumes 2A and 2B), should contain everything you'll ever want to know. You can find them here.

PhiS
Thanks for your reply. I forgot to mention in the question that 2's and 5's are just an example. They are actually parameters of a procedure. How would I be able to generate the masks programmatically? Btw, I was looking for some approach other than MASKMOVQ. More like and, or, xor, etc...
nunos
(1) Of course, it doesn't really matter whether it's 2's and 5's. There are several ways to generate the masks of constant bytes, depending on your needs and the available instruction sets. E.g., you could either generate them using general-purpose instructions and then load them into an MMX register using MOVQ, or you could load from a general-purpose register with MOVD and generate the mask from 1 byte using the PUNPCK.. or PSHUF.. instructions. (2) Of course you can also use PAND, POR, etc. to accomplish the same thing; since MASKMOVQ already exists, I just thought it might be more straightforward.
PhiS
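Both points in the comment above can be sketched with C intrinsics as well: building the masks from a one-byte parameter with MOVD plus the PUNPCKL.. instructions, and doing the replacement with PCMPEQB, PAND, PANDN and POR instead of MASKMOVQ. The names `broadcast_byte` and `replace_bytes_blend` are hypothetical, the length is again assumed to be a multiple of 8, and this is one possible arrangement rather than the only one.

```c
#include <stddef.h>
#include <string.h>
#include <mmintrin.h>

/* Hypothetical helper: copy one byte into all 8 lanes of an MMX register. */
__m64 broadcast_byte(unsigned char b)
{
    __m64 v = _mm_cvtsi32_si64(b);  /* MOVD:      only the lowest byte holds b   */
    v = _mm_unpacklo_pi8(v, v);     /* PUNPCKLBW: lowest 2 bytes hold b          */
    v = _mm_unpacklo_pi16(v, v);    /* PUNPCKLWD: lowest 4 bytes hold b          */
    v = _mm_unpacklo_pi32(v, v);    /* PUNPCKLDQ: all 8 bytes hold b             */
    return v;
}

/* Hypothetical helper: replace every byte equal to `from` with `to`
   using only compare and bitwise instructions.
   Assumes len is a multiple of 8. */
void replace_bytes_blend(unsigned char *buf, size_t len,
                         unsigned char from, unsigned char to)
{
    const __m64 vfrom = broadcast_byte(from);
    const __m64 vto   = broadcast_byte(to);

    for (size_t i = 0; i < len; i += 8) {
        __m64 data;
        memcpy(&data, buf + i, sizeof data);           /* 8-byte load */

        __m64 hit  = _mm_cmpeq_pi8(data, vfrom);       /* PCMPEQB: FFh where byte == from */
        __m64 repl = _mm_and_si64(hit, vto);           /* PAND:  `to` in matching lanes   */
        __m64 keep = _mm_andnot_si64(hit, data);       /* PANDN: original bytes elsewhere */
        __m64 out  = _mm_or_si64(repl, keep);          /* POR:   merge the two halves     */

        memcpy(buf + i, &out, sizeof out);             /* 8-byte store */
    }

    _mm_empty();  /* EMMS: leave MMX state */
}
```

If you prefer XOR, the same result comes from XORing the data with `hit AND (from XOR to)`: in matching lanes the byte equals `from`, so the XOR turns it into `to`, and all other lanes are XORed with zero. Since this version uses ordinary stores, no SFENCE is needed.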