Hi,

I am confused about the difference between a "far" pointer and a "huge" pointer. I searched all over Google for an explanation but could not find one. Can anyone explain the difference between the two? Also, what exactly is the normalization concept related to huge pointers?

Please do not give me the following or any similar answer:

"The only difference between a far pointer and a huge pointer is that a huge pointer is normalized by the compiler. A normalized pointer is one that has as much of the address as possible in the segment, meaning that the offset is never larger than 15. A huge pointer is normalized only when pointer arithmetic is performed on it. It is not normalized when an assignment is made. You can cause it to be normalized without changing the value by incrementing and then decrementing it. The offset must be less than 16 because the segment can represent any value greater than or equal to 16 (e.g. Absolute address 0x17 in a normalized form would be 0001:0001. While a far pointer could address the absolute address 0x17 with 0000:0017, this is not a valid huge (normalized) pointer because the offset is greater than 0000F.). Huge pointers can also be incremented and decremented using arithmetic operators, but since they are normalized they will not wrap like far pointers."

Here the normalization concept is not explained very well, or maybe I'm unable to understand it.

Can anyone try explaining this concept from a beginner's point of view?

Thanks, Rahamath

+8  A: 

The first thing to understand is how a segmented pointer is converted into a linear address. For the example you have, the conversion is:

linear = segment * 16 + offset;

Because of that, the same linear address can be expressed using different segment/offset combinations. For example, the following segment/offset combinations all refer to the same linear address:

0004:0000
0003:0010
0002:0020
0001:0030
0000:0040
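To check the list above, here is a minimal sketch in portable C. It models the segment and offset registers with plain integers; `seg_to_linear` is a made-up helper name for illustration, not something a 16-bit compiler provides.

```c
#include <stdint.h>

/* Hypothetical helper: real-mode segment:offset -> linear address.
   Plain integers stand in for the CPU's segment and offset registers. */
static uint32_t seg_to_linear(uint16_t segment, uint16_t offset)
{
    /* segment * 16 is the same as shifting left by 4 bits */
    return ((uint32_t)segment << 4) + offset;
}
```

With this helper, `seg_to_linear(0x0004, 0x0000)` and `seg_to_linear(0x0000, 0x0040)` both return 0x40, as do the other three pairs in the list.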

The problem with this is that if you have ptr1 with a segmented address of 0100:0000 and ptr2 with a segmented address of 00FF:0010, a simple comparison of segment and offset will determine that ptr1 != ptr2 even though they actually point to the same address.

Normalization is the process by which you convert an address to a form such that if two non-normalized pointers refer to the same linear address, they will both be converted to the same normalized form.
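A rough sketch of normalization in portable C, assuming the address lies below 1 MB. The `seg_ptr` struct and the `normalize` and `same_address` functions are hypothetical names for illustration; a real 16-bit compiler does this work in generated code rather than through a library call.

```c
#include <stdint.h>

/* Hypothetical model of a segmented pointer. */
typedef struct {
    uint16_t segment;
    uint16_t offset;
} seg_ptr;

/* Move as much of the address as possible into the segment,
   leaving an offset in the range 0..15.
   Assumes the linear address is below 1 MB. */
static seg_ptr normalize(seg_ptr p)
{
    uint32_t linear = ((uint32_t)p.segment << 4) + p.offset;
    seg_ptr n;
    n.segment = (uint16_t)(linear >> 4);   /* upper 16 bits of the 20-bit address */
    n.offset  = (uint16_t)(linear & 0xF);  /* lowest 4 bits */
    return n;
}

/* Compare two segmented pointers by their normalized forms. */
static int same_address(seg_ptr a, seg_ptr b)
{
    seg_ptr na = normalize(a);
    seg_ptr nb = normalize(b);
    return na.segment == nb.segment && na.offset == nb.offset;
}
```

For example, 0100:0000 and 00FF:0010 are two spellings of linear address 0x1000. A field-by-field comparison says they differ, while `same_address` reports them equal because both normalize to 0100:0000.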

R Samuel Klatchko
+1 for mentioning pointer comparison, I forgot that in my answer.
tristopia
Numbers are wrong, 0100:0000 == 00FF:0010 == 00FE:0020, etc.
Hans Passant
@HansPassant - dang, for some reason I used binary for the segment and hex on the index. Will fix that..
R Samuel Klatchko
Thanks for explaining the concept so wonderfully!!! Pat on the back!!!
wrapperm
+1  A: 

As I recall, it's something like this:

  • Near pointers point to memory in the same segment (as the pointer).
  • Far pointers point to memory in another segment.
  • Huge pointers let you point to memory that's larger than a segment (so you can have a block >64k and do arithmetic on your pointer, and what Samuel said).

If you're a beginner, it's probably best to forget that you heard about Near/Far/Huge. They only have meaning in the old 16-bit segmented memory model commonly seen on early Intel 80x86's. In 32- and 64-bit land (i.e., everything since 1994), memory is just a big contiguous block, so a pointer is just a pointer (as far as a single application is concerned).

Seth
+6  A: 

In the beginning, the 8086 was an extension of the 8-bit 8085 processor. The 8085 could only address 65536 bytes with its 16-bit address bus. When Intel developed the 8086, they wanted software to be as compatible as possible with the old 8-bit processors, so they introduced the concept of segmented memory addressing. This allowed 8-bit software to live in the bigger address range without noticing. The 8086 had a 20-bit address bus and could thus handle up to 1 MB of memory (2^20). Unfortunately, it could not address this memory directly; it had to use the segment registers to do that. The real address was calculated by shifting the 16-bit segment value left by 4 bits and adding the 16-bit offset.

Example:
Segment  0x1234   Offset 0x5678 will give the real address
   0x 1234
  +0x  5678
  ---------
  =0x 179B8

As you will have noticed, this mapping is not one-to-one, meaning the same real address can be generated by other combinations of segment and offset.

   0x 1264               0x 1111
  +0x  5378             +0x  68A8
  ---------             ---------     etc.
  =0x 179B8             =0x 179B8

There are in fact 4096 different combinations possible, because of the 3 overlapping nibbles (3*4 = 12 bits, 2^12 = 4096). The normalized combination is the only one of the 4096 possible values in which the 3 high nibbles of the offset are zero. In our example it will be:

   0x 179B
  +0x  0008
  ---------
  =0x 179B8
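The count of 4096 can be checked by brute force. The sketch below is portable C with a made-up function name, `count_aliases`; it tries every 16-bit segment and counts those for which some 16-bit offset reaches the target address. It ignores any wraparound at the 1 MB boundary.

```c
#include <stdint.h>

/* Hypothetical helper: count the segment:offset pairs that map to
   the given linear address (no wraparound at 1 MB is modelled). */
static unsigned count_aliases(uint32_t linear)
{
    unsigned count = 0;
    for (uint32_t seg = 0; seg <= 0xFFFF; seg++) {
        uint32_t base = seg << 4;               /* segment * 16 */
        /* A pair exists if the remaining distance fits in a 16-bit offset. */
        if (base <= linear && linear - base <= 0xFFFF)
            count++;
    }
    return count;
}
```

`count_aliases(0x179B8)` gives 4096. Addresses very close to the bottom of memory have fewer spellings because the segment cannot go below zero: `count_aliases(0x40)` gives 5.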

The difference between a far and a huge pointer is not in the normalisation; you can have a non-normalised huge pointer, it's absolutely allowed. The difference is in the code generated when performing pointer arithmetic. With far pointers, incrementing or adding values to the pointer involves no overflow handling: only the 16-bit offset changes, so you will only be able to handle 64K of memory.

char far *p = (char far *)0x1000FFFF;
p++;
printf("p=%p\n", p);

will print 1000:0000. For huge pointers the compiler will generate the code necessary to handle the carry-over.

char huge *p = (char huge *)0x1000FFFF;
p++;
printf("p=%p\n", p);

will print 2000:0000
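Since `far` and `huge` only exist in old 16-bit compilers, here is a rough simulation of the two behaviours in portable C. The `seg_ptr` struct and the `far_add` and `huge_add` functions are hypothetical names. The huge variant is assumed to normalize its result, as the quote in the question describes; real compilers may carry into the segment differently.

```c
#include <stdint.h>

/* Hypothetical model of a segmented pointer. */
typedef struct {
    uint16_t segment;
    uint16_t offset;
} seg_ptr;

/* Far-style arithmetic: only the offset changes and it wraps at 64K. */
static seg_ptr far_add(seg_ptr p, uint16_t n)
{
    p.offset = (uint16_t)(p.offset + n);   /* carry out of 16 bits is lost */
    return p;
}

/* Huge-style arithmetic (assumed normalizing): work on the linear
   address so the carry reaches the segment.
   Assumes the result stays below 1 MB. */
static seg_ptr huge_add(seg_ptr p, uint32_t n)
{
    uint32_t linear = ((uint32_t)p.segment << 4) + p.offset + n;
    p.segment = (uint16_t)(linear >> 4);
    p.offset  = (uint16_t)(linear & 0xF);
    return p;
}
```

Starting from 1000:FFFF, `far_add(p, 1)` yields 1000:0000 and `huge_add(p, 1)` yields 2000:0000, matching the two outputs above. The extra shifting and masking in `huge_add` is also a hint at why huge pointer arithmetic costs more.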

This means you have to be careful when using far or huge pointers as the cost of the arithmetic with them is different.

One should also not forget that most 16-bit compilers had libraries that didn't handle these cases correctly, which sometimes led to buggy software. Microsoft's real-mode compiler didn't handle huge pointers in all its string functions. Borland was even worse, as even the mem functions (memcpy, memset, etc.) didn't handle offset overflows. That was why it was a good idea to use normalised pointers with these library functions: the likelihood of offset overflows was lower with them.

tristopia
Thanks for this wonderful explanation!!
wrapperm