parallel-processing

Number of threads and thread numbers in Grand Central Dispatch

I am using C and Grand Central Dispatch to parallelize some heavy computations. How can I get the number of threads used by GCD? Also is it possible to know on which thread a piece of code is currently running on? Basically I'd like to use sprng (parallel random numbers) with multiple streams and for that I need to know what stream id to...

Is there something like clock() that works better for parallel code?

So I know that clock() measures clock cycles, and thus isn't very good for measuring time, and I know there are functions like omp_get_wtime() for getting the wall time, but it is frustrating for me that the wall time varies so much, and was wondering if there was some way to measure distinct clock cycles (only one cycle even if more tha...

Programming for Multi core Processors

As far as I know, the multi-core architecture in a processor does not effect the program. The actual instruction execution is handled in a lower layer. my question is, Given that you have a multicore environment, Can I use any programming practices to utilize the available resources more effectively? How should I change my code to gai...

Parallel implementation to maximise network flow

I am preparing a paper on the parallel implementation of goldberg's algorithm for maximizing network flow. Please help me with the implementation details. if someone has access to acm library i would really appreciate if you can get me the paper by anderson and setubal on it. - [email protected] ...

JVM (embarrasingly) parallel processing libraries/tools

I am looking for something that will make it easy to run (correctly coded) embarrassingly parallel JVM code on a cluster (so that I can use Clojure + Incanter). I have used Parallel Python in the past to do this. We have a new PBS cluster and our admin will soon set up IPython nodes that use PBS as the backend. Both of these systems mak...

How can I reduce build time using a parallel build / parallelizing the tests?

Simple question :) How can I reduce build time using a parallel build / parallelizing the tests? We're using TeamCity, JUnit, Fit, Selenium ...

How to setup matlabpool for multiple processors?

I just setup a Extra Large Heavy Computation EC2 instance to throw it at my Genetic Algorithms problem, hoping to speed up things. This instance has 8 Intel Xeon processors (around 2.4Ghz each) and 7 Gigs of RAM. On my machine I have an Intel Core Duo, and matlab is able to work with my two cores just fine by runinng: matlabpool open ...

Vector Usage in MPI(C++)

I am new to MPI programming,stiil learning , i was successful till creating the Derived data-types by defining the structures . Now i want to include Vector in my structure and want to send the data across the Process. for ex: struct Structure{ //Constructor Structure(): X(nodes),mass(nodes),ac(nodes) { //code to calculate the mass ...

CPU Affinity Masks (Putting Threads on different CPUs)

I have 4 threads, and I am trying to set thread 1 to run on CPU 1, thread 2 on CPU 2, etc. However, when I run my code below, the affinity masks are returning the correct values, but when I do a sched_getcpu() on the threads, they all return that they are running on CPU 4. Anybody know what my problem here is? Thanks in advance! #defi...

Why aren't we programming on the GPU?

So I finally took the time to learn CUDA and get it installed and configured on my computer and I have to say, I'm quite impressed! Here's how it does rendering the Mandelbrot set at 1280 x 678 pixels on my home PC with a Q6600 and a GeForce 8800GTS (max of 1000 iterations): Maxing out all 4 CPU cores with OpenMP: 2.23 fps Running the...

Any task-control algorithms programming practices?

Hi, I was just wondering if there's any field which concerns the task-control programming (or at least that's the way I call it). For a better explanation of task-control consider the following scenario: An application (master-thread) waits for a command - which might be a particular action or a set of actions the application should p...

Why is PLINQ slower than LINQ for this code?

First off, I am running this on a dual core 2.66Ghz processor machine. I am not sure if I have the .AsParallel() call in the correct spot. I tried it directly on the range variable too and that was still slower. I don't understand why... Here are my results: Process non-parallel 1000 took 146 milliseconds Process parallel 1000 took 15...

MPIexec.exe Access denide

I have installed microsoft compute cluster and MPI.net, now i have trouble to run program using mpiexec.exe on cluster. When i try to run it on console i get message: "Access Denied", and pop up: "mpiexec.exe is not valid win32 application". I tried google it, but found nothing. Pls help. :) ...

Why does Clojure hang after having performed my calculations?

Hi all, I'm experimenting with filtering through elements in parallel. For each element, I need to perform a distance calculation to see if it is close enough to a target point. Never mind that data structures already exist for doing this, I'm just doing initial experiments for now. Anyway, I wanted to run some very basic experiments w...

How to Profile R Code that Includes SNOW Cluster

Hi, I have a nested loop that I'm using foreach, DoSNOW, and a SNOW socket cluster to solve for. How should I go about profiling the code to make sure I'm not doing something grossly inefficient. Also is there anyway to measure the data flows going between the master and nodes in a Snow cluster? Thanks, James ...

deciding between subprocess, multiprocessing and thread in Python?

I'd like to parallelize my Python program so that it can make use of multiple processors on the machine that it runs on. My parallelization is very simple, in that all the parallel "threads" of the program are independent and write their output to separate files. I don't need the threads to exchange information but it is imperative tha...

How to printf a time_t variable as a floating point number?

Hi guys, I'm using a time_t variable in C (openMP enviroment) to keep cpu execution time...I define a float value sum_tot_time to sum time for all cpu's...I mean sum_tot_time is the sum of cpu's time_t values. The problem is that printing the value sum_tot_time it appear as an integer or long, by the way without its decimal part! I tried...

How do you save the state of a computation in while using SNOW (or Multicore or ...)

From hard experience I've found it useful to occasionally save the state of my long computations to disk to start them up later if something fails. Can I do this in a distributed computation package in R (like SNOW or multicore)? It does not seem clear how this could be done since the master is collecting things from the slaves in a non...

Parallel version of loop not faster than serial version

I'm writing a program in C++ to perform a simulation of particular system. For each timestep, the biggest part of the execution is taking up by a single loop. Fortunately this is embarassingly parallel, so I decided to use Boost Threads to parallelize it (I'm running on a 2 core machine). I would expect at speedup close to 2 times the...

How can I implement MapReduce using shell commands?

How do you execute a Unix shell command (e.g awk one liner) on a cluster in parallel (step 1) and collect the results back to a central node (step 2)? Update: I've just found http://blog.last.fm/2009/04/06/mapreduce-bash-script It seems to do exactly what I need. ...