openmp

OpenMP for nested for loops?

Hi, I am attempting to use OpenMP in my project that contains N agents in a simulation. Every agent has a position vector. Inside my program, I do something like the following: for(int i=0;i<agents.size();i++){ for(int j=0;j<agents.size();j++){ if(i==j) continue; if(distance_between(i,j)<10){ //do a lot of calculations for ...
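
One common way to parallelize a pairwise loop like this is to split the outer loop across threads and have each iteration write only to its own output slot. The following is a minimal sketch, not the asker's code: `Agent`, `count_neighbours` and the body of `distance_between` are hypothetical stand-ins, and the "lot of calculations" is reduced to counting neighbours.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical stand-in for the question's agent type: only a 2D position.
struct Agent {
    float x;
    float y;
};

// Hypothetical helper playing the role of distance_between(i, j).
static float distance_between(const Agent& a, const Agent& b) {
    return std::hypot(a.x - b.x, a.y - b.y);
}

// For every agent, count the other agents closer than `radius`.
// Iteration i writes only neighbours[i], so the outer loop has no data race.
std::vector<int> count_neighbours(const std::vector<Agent>& agents, float radius) {
    assert(radius >= 0.0f);
    const long n = static_cast<long>(agents.size());  // signed index for older OpenMP
    std::vector<int> neighbours(agents.size(), 0);

    #pragma omp parallel for schedule(static)
    for (long i = 0; i < n; ++i) {
        int count = 0;  // private to this iteration
        for (long j = 0; j < n; ++j) {
            if (i == j) continue;
            if (distance_between(agents[i], agents[j]) < radius) ++count;
        }
        neighbours[i] = count;
    }
    return neighbours;
}
```

If the real per-pair work modifies agent j as well as agent i, this layout no longer suffices and the updates would need to be accumulated per thread or protected in some other way.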

Python and OpenMP C Extensions

I have a C extension in which I'd like to use OpenMP. When I import my module, though, I get an import error: ImportError: /home/.../_entropysplit.so: undefined symbol: GOMP_parallel_end I've compiled the module with -fopenmp and -lgomp. Is this because my Python installation wasn't compiled with the -fopenmp flag? Will I have to bui...

OpenMP and File I/O

I'm doing some time trials on my code, and logically it seems really easy to parallelize with OpenMP as each trial is independent of the others. As it stands, my code looks something like this: for(int size = 30; size < 50; ++size) { #pragma omp parallel for for(int trial = 0; trial < 8; ++trial) { time_t start, end; ...

Multi-agent system in C++ code design

Hi everyone, I have a simulation written in C++ in which I need to maintain a variable number of agents, and I am having trouble deciding how to implement it well. Every agent looks something similar to: class Agent{ public: Vector2f pos; float health; float data[DATASIZE]; vector<Rule> rules; } I need to maintain a v...

Can't get OpenMP to Produce More than One Thread

#include <omp.h> #include <stdio.h> int main(int argc, char* argv[]) { omp_set_num_threads(4); printf("numThreads = %d\n", omp_get_num_threads()); } This code prints: numThreads = 1 This is compiled in Visual Studio 2010 Ultimate. I have turned Project Configuration Properties (All Configurations) -> C/C++ -> Language -> Open MP...
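
A likely explanation for the output above is that `omp_get_num_threads()` reports the size of the current team, which is 1 whenever it is called outside a parallel region. The sketch below queries it from inside a parallel region instead; the function names `team_size_inside_parallel` and `team_size_outside_parallel` are made up for illustration, and the `_OPENMP` guards only keep the code compilable when OpenMP is switched off.

```cpp
#include <cassert>
#ifdef _OPENMP
#include <omp.h>
#endif

// Team size as seen from the sequential part of the program.
int team_size_outside_parallel() {
#ifdef _OPENMP
    return omp_get_num_threads();  // no parallel region is active here
#else
    return 1;
#endif
}

// Team size as seen from inside a parallel region after requesting `requested` threads.
int team_size_inside_parallel(int requested) {
    assert(requested > 0);
    int observed = 1;  // value kept when compiled without OpenMP
#ifdef _OPENMP
    omp_set_num_threads(requested);
    #pragma omp parallel
    {
        // One thread records the team size; the others wait at the implicit barrier.
        #pragma omp single
        observed = omp_get_num_threads();
    }
#endif
    return observed;
}
```

The runtime may hand out fewer threads than requested, so the value inside the region is an upper-bounded result rather than a guarantee. `omp_get_max_threads()` can be called from sequential code to see how many threads a later parallel region may use.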

OpenMP: Huge slowdown in what should be ideal scenario

In the code below I'm trying to compare all elements of an array to all other elements in a nested for loop. (It's to run a simple n-body simulation. I'm testing with only 4 bodies for 4 threads on 4 cores). An identical sequential version of the code without OpenMP modifications runs in around 15 seconds for 25M iterations. Last nig...

OpenMP Fatal user error 1002

Hi, I'm using a basic omp parallel for, which works most of the time, but sometimes (with no difference in the input data) makes the program crash with the error: Fatal user error 1002 Not all work sharing construct executed by all threads. #pragma omp parallel for \ shared(szDetailedError, aCG) \ private(pCha...

How best do I get to use OpenMP on Mac OS X 10.5 and Ubuntu 10.4?

I'm looking at an open-source library (DDS, a double-dummy bridge solver) which in its latest release (2.1.1) adds some very useful multi-tasking functionality requiring either a Windows system or OpenMP (indeed, that latest version won't even compile at all on a non-Windows system without full OpenMP support!-). Ubuntu 10.4 has a packa...

OpenMP 'slower' on iMac? (C++)

I have a small C++ program using OpenMP. It works fine on Windows7, Core i7 with VisualStudio 2010. On an iMac with a Core i7 and g++ v4.2.1, the code runs much more slowly using 4 threads than it does with just one. The same 'slower' behavior is exhibited on 2 other Red Hat machines using g++. Here is the code: int iHundredMill...

OpenMP compiler and runtime

Does OpenMP have a runtime (like the .NET CLR on top of the operating system) or just a compiler? ...

Thread-safe random number generation for Monte-Carlo integration.

I'm trying to write something which very quickly calculates random numbers and can be applied on multiple threads. My current code is: /* Approximating PI using a Monte-Carlo method. */ #include <stdio.h> #include <stdlib.h> #include <math.h> #include <time.h> #include <omp.h> #define N 1000000000 /* As large as possible for increased ...
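
A frequently used pattern for this kind of problem is to give every thread its own generator, so no random-number state is shared, and to combine the per-thread hit counts with a reduction. The sketch below assumes C++11 `<random>` rather than the C library calls in the question; `estimate_pi` and `base_seed` are illustrative names, and mixing the thread number into a `std::seed_seq` is just one simple seeding choice, not a statistically audited one.

```cpp
#include <cassert>
#include <random>
#ifdef _OPENMP
#include <omp.h>
#endif

// Monte-Carlo estimate of pi using `samples` points in the unit square.
double estimate_pi(long samples, unsigned base_seed) {
    assert(samples > 0);
    long inside = 0;

    #pragma omp parallel reduction(+:inside)
    {
        int tid = 0;
#ifdef _OPENMP
        tid = omp_get_thread_num();
#endif
        // Private generator per thread, seeded from the base seed and thread number.
        std::seed_seq seq{base_seed, static_cast<unsigned>(tid)};
        std::mt19937 rng(seq);
        std::uniform_real_distribution<double> uniform(0.0, 1.0);

        #pragma omp for
        for (long k = 0; k < samples; ++k) {
            const double x = uniform(rng);
            const double y = uniform(rng);
            if (x * x + y * y <= 1.0) ++inside;  // point falls in the quarter circle
        }
    }
    return 4.0 * static_cast<double>(inside) / static_cast<double>(samples);
}
```

Because each thread owns its generator and distribution, the loop body needs no locking. The result depends on the number of threads, since that changes which generator handles which samples.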

Is it possible to do a reduction on an array with OpenMP?

Does OpenMP natively support reduction of a variable that represents an array? This would work something like the following... float* a = (float*) calloc(4, sizeof(float)); omp_set_num_threads(13); #pragma omp parallel reduction(+:a) for(i=0;i<4;i++){ a[i] += 1; // Thread-local copy of a incremented by something interesting } // a ...
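
Where the compiler does not accept an array in a `reduction` clause, the same effect can be had by hand: each thread accumulates into a private copy and merges it into the shared array once at the end. This is a sketch of that workaround under assumed names (`accumulate_bins`, `bins`, `iterations`); the per-iteration work is a placeholder increment. Newer OpenMP versions (4.5 onward) also allow array sections in the `reduction` clause for C and C++, which may make the manual merge unnecessary.

```cpp
#include <cassert>
#include <vector>

// Spread `iterations` unit increments over `bins` slots, round-robin.
std::vector<float> accumulate_bins(int bins, long iterations) {
    assert(bins > 0 && iterations >= 0);
    std::vector<float> total(bins, 0.0f);

    #pragma omp parallel
    {
        // Thread-local copy of the array, so the hot loop touches no shared data.
        std::vector<float> local(bins, 0.0f);

        #pragma omp for nowait
        for (long k = 0; k < iterations; ++k) {
            local[k % bins] += 1.0f;
        }

        // Merge step: one thread at a time adds its partial sums to the result.
        #pragma omp critical
        {
            for (int b = 0; b < bins; ++b) total[b] += local[b];
        }
    }
    return total;
}
```

The critical section runs once per thread, not once per iteration, so its cost stays small as long as the array is short compared with the loop.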

Converting a simple C code into a CUDA code

Hello, I'm trying to convert a simple numerical analysis code (trapezium rule numerical integration) into something that will run on my CUDA enabled GPU. There is a lot of literature out there but it all seems far more complex than what is required here! My current code is: #include <stdio.h> #include <math.h> #include <stdlib.h> #defi...

OpenMP Parallelizing code-block inside a for loop?

Greetings all, I want to run the code block inside the loop in a separate OpenMP thread. Have I defined the correct OpenMP directives in the following code snippet: #ifdef OPENMP_ENABLE #pragma omp parallel for #endif for(int i=0;i<numOfSlices;i++){ // Entire block inside this loop should be run in new OpenMP t...

Why doesn't the OpenMP atomic directive support assignment?

The atomic directive in OpenMP supports stuff like x += expr x *= expr where expr is an expression of scalar type that does not reference x. I get that, but I don't get why you can't do: #pragma omp atomic x = y; Is this somehow more taxing CPU instruction-wise? Seems to me that both the legal and illegal statements load the value ...
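
For context, OpenMP 3.1 added `read` and `write` clauses to the `atomic` directive, which cover plain loads and stores of a scalar. The sketch below assumes a compiler that implements at least OpenMP 3.1; a compiler limited to OpenMP 2.0 will reject these clauses. The function name `publish_value` is made up for illustration.

```cpp
#include <cassert>

// Every thread stores the same value into a shared variable with an atomic write,
// then the value is fetched back with an atomic read.
int publish_value(int value) {
    int slot = 0;  // shared between the threads of the parallel region

    #pragma omp parallel
    {
        #pragma omp atomic write
        slot = value;
    }

    int seen = 0;
    #pragma omp atomic read
    seen = slot;

    assert(seen == value);
    return seen;
}
```

Note that `atomic write` only makes the store itself indivisible. The right-hand side is evaluated beforehand without any atomicity guarantee, so `x = y` with a concurrently changing `y` would still need an `atomic read` of `y` into a temporary first.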

Crash in program using OpenMP, x64 only

The program below crashes when I build it in Release x64 (all other configurations run fine). Am I doing it wrong or is it an OpenMP issue? Well-grounded workarounds are highly appreciated. To reproduce build a project (console application) with the code below. Build with /openmp and /GL and (/O1 or /O2 or /Ox) options in Release x64 c...

Concurrency and optimization using OpenMP

I'm learning OpenMP. To do so, I'm trying to make an existing code parallel. But I seem to get a worse time when using OpenMP than when I don't. My inner loop: #pragma omp parallel for for(unsigned long j = 0; j < c_numberOfElements; ++j) { //int th_id = omp_get_thread_num(); //printf("thread %d, j = %d\n"...

What are the recommended C++ parallelization libraries for large data processing

Can someone recommend approaches to parallelize in C++ when the data to be acted upon is huge? I have been reading about OpenMP and Intel's TBB for parallelization in C++, but have not experimented with them yet. Which of these is better for parallel data processing? Any other libraries/approaches? ...

C++: How to parallelize reading lines from an input file when lines get independently processed?

I just started off with OpenMP using C++. My serial code in C++ looks something like this: #include <iostream> #include <string> #include <sstream> #include <vector> #include <fstream> #include <stdlib.h> int main(int argc, char* argv[]) { string line; std::ifstream inputfile(argv[1]); if(inputfile.is_open()) { whi...
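
One straightforward approach, assuming the file fits in memory, is to keep the reading serial (a stream is not safe to share between threads) and parallelize only the per-line processing. The sketch below is illustrative: `process_line` is a hypothetical placeholder that counts whitespace-separated tokens, and `process_lines` takes any `std::istream`, so an `std::ifstream` opened on `argv[1]` would work as input.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical per-line work: count the whitespace-separated tokens in a line.
static int process_line(const std::string& line) {
    std::istringstream tokens(line);
    std::string word;
    int count = 0;
    while (tokens >> word) ++count;
    return count;
}

// Read all lines serially, then process them independently in parallel.
std::vector<int> process_lines(std::istream& input) {
    std::vector<std::string> lines;
    std::string line;
    while (std::getline(input, line)) lines.push_back(line);  // single-threaded I/O

    std::vector<int> results(lines.size(), 0);
    const long n = static_cast<long>(lines.size());

    // Each iteration reads lines[i] and writes results[i] only, so there is no race.
    #pragma omp parallel for schedule(dynamic)
    for (long i = 0; i < n; ++i) {
        results[i] = process_line(lines[i]);
    }
    assert(results.size() == lines.size());
    return results;
}
```

Results come back in input order because each line's index selects its output slot. For files too large to hold in memory, the same idea can be applied to fixed-size batches of lines.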

Problem with omp_set_num_threads called from a WinAPI thread

I've run into a funny problem using OpenMP v2 under MSVC 9 SP1. When calling omp_set_num_threads from the main thread of execution then using omp_get_num_threads to check the amount set, all works well and checks out. However, in a GUI app, I call the same thing, but in its own thread (created with CreateThread), to prevent the UI from be...