IMHO, the point of the test virus is to have something that is both known to be harmless, and accepted as a virus so that end users can verify that the AV software is turned on, and can see the effect of a virus identification. Think fire drill, for AV software.
I would imagine that most AV products have a signature for it, and directly recognize it as such.
I wouldn't be surprised if the bit pattern of the actual EICAR test happened to include bit patterns that smelled like opcodes for suspicious activity, but I don't know if that is the case. If it is, then it might be a valid test of a simple heuristic virus recognizer. However, since the EICAR test has been around for a long time, I would also imagine that any heuristic that catches it isn't good enough to catch anything now in the wild.
I wouldn't expect that recognizing EICAR is proof of any claim stronger than "the AV is installed and scanning what it was expected to scan", and if developing an AV system, I wouldn't attempt to make any stronger claim about it.
Update:
The actual EICAR test virus is the following string:
X5O!P%@AP[4\PZX54(P^)7CC)7}$EICAR-STANDARD-ANTIVIRUS-TEST-FILE!$H+H*
which was carefully crafted (according to the Wikipedia article) to have a couple of interesting properties.
First, it consists of only printable ASCII characters. It will often include whitespace and/or a newline at the end, but that has no effect on its recognition, or on its function.
Which raises the second property: it is in fact an executable program for an 8086 CPU. It can be saved (via Notepad, for example) in a file with the extension .COM, and it can be run on MSDOS, most clones of MSDOS, and even in the MSDOS compatibility mode of the Windows command prompt (including on Vista, but not on any 64-bit Windows, since Microsoft decided that compatibility with 16-bit real mode was no longer a priority).
When run, it produces as output the string "EICAR-STANDARD-ANTIVIRUS-TEST-FILE!" and then exits.
Why did they go to this effort? Apparently the researchers wanted a program that was known to be safe to run, in part so that live scanners could be tested without needing to capture a real virus and risk a real infection. They also wanted it to be easy to distribute by both conventional and unconventional means. Since it turns out that there is a useful subset of the x86 real-mode instruction set where every byte meets the restriction that it also be a printable ASCII character, they achieved both goals.
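Both properties are easy to check mechanically. Here is a minimal Python sketch of the "printable, no spaces" check. It is not part of the test definition; the helper name `is_printable_no_space` and the split of the string into three pieces are my own choices for illustration.

```python
# The test string quoted earlier, split into its three parts for readability.
# A raw string keeps the backslash in the first part literal.
CODE = r"X5O!P%@AP[4\PZX54(P^)7CC)7}$"             # machine code prologue
MESSAGE = "EICAR-STANDARD-ANTIVIRUS-TEST-FILE!$"    # text printed when run, plus its '$' terminator
TAIL = "H+H*"                                       # four bytes that get patched at run time


def is_printable_no_space(data: bytes) -> bool:
    """Return True if every byte is printable 7-bit ASCII other than space."""
    return all(0x21 <= b <= 0x7E for b in data)


image = (CODE + MESSAGE + TAIL).encode("ascii")

print(len(image))                     # 68
print(is_printable_no_space(image))   # True
```

The same helper rejects a directly encoded interrupt such as `b"\xcd\x21"`, which is why the program cannot simply contain one.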
The wiki article has a link to a blow-by-blow explanation of how the program actually works, which is also an interesting read. Adding to the complexity is the fact that the only way to either print to the console or exit a program in DOS real mode is to issue a software interrupt instruction, whose opcode (0xCD) is not a printable 7-bit ASCII character. Furthermore, the two interrupts each require a one-byte immediate parameter, and the one for INT 20 (0x20) is a space character. Since the self-imposed rule was to not allow spaces, all four of the last bytes of the program ("H+H*" in the string) are modified in place before the instruction pointer gets there to execute them.
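The arithmetic behind that in-place patch can be sketched in a few lines of Python. This is an illustration rather than an emulator: the subtrahend is the value the program's earlier XOR instructions leave in SI (0x214F ^ 0x2834, assuming AX starts at zero), and the variable names are mine.

```python
import struct

wanted = bytes([0xCD, 0x21, 0xCD, 0x20])   # INT 21h ; INT 20h

# Which of the wanted bytes break the "printable, no space" rule?
print([hex(b) for b in wanted if not 0x21 <= b <= 0x7E])   # ['0xcd', '0xcd', '0x20']

tail = bytearray(b"H+H*")   # what the file stores instead
si = 0x214F ^ 0x2834        # 0x097B, assuming AX was 0 on entry

# SUB [BX],SI applied to each of the two little-endian words, wrapping at 16 bits.
for offset in (0, 2):
    (word,) = struct.unpack_from("<H", tail, offset)
    struct.pack_into("<H", tail, offset, (word - si) & 0xFFFF)

print(tail.hex(" "))        # cd 21 cd 20
```

So "H+" (0x2B48 as a word) minus 0x097B gives 0x21CD, and "H*" (0x2A48) minus 0x097B gives 0x20CD, which are exactly the two interrupt instructions.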
Disassembling and dumping EICAR.COM with the DEBUG command at a command prompt on my XP box, I see:
0C32:0100 58 POP AX
0C32:0101 354F21 XOR AX,214F
0C32:0104 50 PUSH AX
0C32:0105 254041 AND AX,4140
0C32:0108 50 PUSH AX
0C32:0109 5B POP BX
0C32:010A 345C XOR AL,5C
0C32:010C 50 PUSH AX
0C32:010D 5A POP DX
0C32:010E 58 POP AX
0C32:010F 353428 XOR AX,2834
0C32:0112 50 PUSH AX
0C32:0113 5E POP SI
0C32:0114 2937 SUB [BX],SI
0C32:0116 43 INC BX
0C32:0117 43 INC BX
0C32:0118 2937 SUB [BX],SI
0C32:011A 7D24 JGE 0140
0C32:011C 45 49 43 41 EICA
0C32:0120 52 2D 53 54 41 4E 44 41-52 44 2D 41 4E 54 49 56 R-STANDARD-ANTIV
0C32:0130 49 52 55 53 2D 54 45 53-54 2D 46 49 4C 45 21 24 IRUS-TEST-FILE!$
0C32:0140 48 DEC AX
0C32:0141 2B482A SUB CX,[BX+SI+2A]
After executing instructions up to JGE 0140, the last two instructions have been modified to be:
0C32:0140 CD21 INT 21
0C32:0142 CD20 INT 20
Most DOS system calls were dispatched through INT 21 with the value of the AH or AX register specifying the function to execute. In this case, AH is 0x09, which is the print string function, which prints the string starting at offset 0x011C, terminated by the dollar sign. (You had to print a dollar sign with a different trick in pure DOS.) The INT 20 call terminates the process before any extra bytes past that point can be executed.
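To see where AH = 0x09 and the string offset come from, the register effects of the listing can be replayed by hand. The Python sketch below tracks only the registers involved and assumes the word popped by the first POP AX is zero (DOS places a zero word on the stack of a .COM program). DX is the register that function 0x09 uses as the string pointer. `dos_print_string` is a hypothetical stand-in for the behaviour described above, not a real API.

```python
stack = [0x0000]              # assumed initial top-of-stack word

ax = stack.pop()              # POP AX
ax ^= 0x214F                  # XOR AX,214F
stack.append(ax)              # PUSH AX
ax &= 0x4140                  # AND AX,4140
bx = ax                       # PUSH AX / POP BX
ax ^= 0x5C                    # XOR AL,5C (only the low byte changes)
dx = ax                       # PUSH AX / POP DX
ax = stack.pop()              # POP AX
ax ^= 0x2834                  # XOR AX,2834
si = ax                       # PUSH AX / POP SI
ah = ax >> 8                  # high byte of AX selects the DOS function

print(hex(bx), hex(dx), hex(si), hex(ah))   # 0x140 0x11c 0x97b 0x9


def dos_print_string(memory: bytes, offset: int) -> str:
    """Return the text from offset up to, but not including, the '$' terminator."""
    end = memory.index(b"$", offset)
    return memory[offset:end].decode("ascii")


# Toy memory image: only the message bytes matter, placed at the offset held in DX.
memory = bytes(dx) + b"EICAR-STANDARD-ANTIVIRUS-TEST-FILE!$"
print(dos_print_string(memory, dx))   # EICAR-STANDARD-ANTIVIRUS-TEST-FILE!
```

BX ends up pointing at the first word to patch (0x0140), DX at the message (0x011C), and SI holds the value subtracted from the last four bytes.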
Self-modifying code was an early virus trick, but here it is used to preserve the restriction on byte values that can be used in the string. In a modern system, it is possible that the Data Execution Prevention (DEP) feature would catch the modification, if that is enforced in MSDOS compatibility mode when running a COM file.