NUMA seems promising for parallel programming, and if I'm not mistaken the latest CPUs, such as the i7, have built-in support for it.
Do you expect the CLR to adopt NUMA soon?
EDIT: By this I mean having support for it, and taking advantage of it.
...
I have to run 32-bit code on WinXP or Win2003. Nehalem Xeons (5500 series) should be the fastest, but I'm not sure what'll happen with the memory arrangement. I'm unsure about 2 parts:
To get the fastest memory setup, I'll need to install at least 6 GB of RAM (to give each CPU 3 sticks to work with). Is the memory interleaved in suc...
Is there any API or other way to get the "distance" (called "hops" in the literature) between two NUMA nodes? I want to implement a memory allocation system that takes advantage of this (reusing memory from the nearest node, because access is faster).
Windows doesn't seem to have such a feature... and libnuma (under Linux) doesn't seem to have it t...
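For the Linux side, one possible approach is to read the per-node distance line that the kernel exposes in sysfs at /sys/devices/system/node/nodeN/distance. Below is a minimal C++ sketch under that assumption. The helper names (parse_distances, read_node_distances, nearest_other_node) are hypothetical and made up for illustration; they are not part of libnuma or any other library.

```cpp
#include <cassert>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>

// Parse one whitespace-separated distance line, e.g. "10 21 21 32".
// Entry i is the relative distance from the owning node to node i.
std::vector<int> parse_distances(const std::string& line) {
    std::vector<int> distances;
    std::istringstream in(line);
    int d;
    while (in >> d) distances.push_back(d);
    return distances;
}

// Read the distance line for `node` from sysfs (Linux only).
// Returns an empty vector if the file cannot be opened.
std::vector<int> read_node_distances(int node) {
    std::ifstream file("/sys/devices/system/node/node" +
                       std::to_string(node) + "/distance");
    std::string line;
    if (!file || !std::getline(file, line)) return {};
    return parse_distances(line);
}

// Index of the closest node other than `self`, or -1 if there is none.
// Ties are resolved in favor of the lowest node index.
int nearest_other_node(const std::vector<int>& distances, int self) {
    assert(self >= 0);
    int best = -1;
    for (int i = 0; i < static_cast<int>(distances.size()); ++i) {
        if (i == self) continue;
        if (best < 0 || distances[i] < distances[best]) best = i;
    }
    return best;
}
```

An allocator could call nearest_other_node(read_node_distances(n), n) once at startup for each node n and cache the result, falling back to the local node when the function returns -1.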
I'm evaluating the performance of an experimental system setup on an 8-core machine with 16GB RAM. I have two main-memory Java RDBMSs (hsqldb) running, and against each of these I run a TPCC client (derived from jTPCC/BenchmarkSQL).
I have scripts to launch things, so e.g. the hsqldb instances are started with:
./hsqld.bash 0 &
./hsqld...
Starting with Win7/Server2008R2 the GetNumaProximityNode(Ex) function is available. It should help retrieve the distance between NUMA nodes, but I can't understand from the documentation (http://msdn.microsoft.com/en-us/library/ms683206(VS.85).aspx) how it's supposed to work. It says that you give it a distance, and it returns the corres...
If I have a multi-processor board that has cache-coherent non-uniform memory access (NUMA), i.e. separate "northbridges" with separate RAM for each processor, does any compiler know how to automatically spread the data across the different memory systems such that processes working on local threads are mostly retrieving their data from...
I plan to run 32-bit Windows XP on a workstation with dual processors, based on Intel's Nehalem microarchitecture, and triple channel RAM. Even though XP is limited to 4 GB of RAM, my understanding is that it will function with more than 4 GB installed, but will only expose 4 GB (or slightly less).
My question is: Assuming that 6 GB of ...
Our application is:
The hardware configuration is a dual Xeon server running 64-bit Windows 7. Each Xeon has its own 12 GB of RAM in a NUMA configuration, with a bridge connecting the two memory regions.
All software is written in C++ using VS2008 and compiled as 64-bit applications.
A Generation app creates a large shared memory...
Our application runs on a dual Xeon server with 12 GB of memory local to each processor and a memory bus connecting the two Xeons. For performance reasons, we want to control where we allocate a large (>6 GB) block of memory. Below is simplified code:
DWORD processorNumber = GetCurrentProcessorNumber();
UCHAR ...