parallel-processing

Research topic on distributed systems

Hello there. I have a research project on distributed systems. I asked the professor if I could work on MapReduce, and he is giving me a hard time, saying that MapReduce is very broad. He asked me to pick a specific problem about either distributed systems frameworks like MapReduce or something else that involves networking and distributed computing. ...

(When) are parallel sorts practical and how do you write an efficient one?

I'm working on a parallelization library for the D programming language. Now that I'm pretty happy with the basic primitives (parallel foreach, map, reduce and tasks/futures), I'm starting to think about some higher level parallel algorithms. Among the more obvious candidates for parallelization is sorting. My first question is, are p...

Effective ways to implement Every-to-Every interaction?

Given a list of elements, how can I process all of them if every element requires knowledge of the state of every other element in the list? For example, a direct way to implement it in Python could be: S = [1,2,3,4] for e in S: for j in S: if e!=j: process_it(e,j) but it is very slow, O(n²), if the number of elements is huge. Th...
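A minimal sketch of the nested loop above using the standard library's itertools, in case it helps frame the question. Here process_it is a hypothetical stand-in that just returns the pair. Note that permutations pairs elements by position, so unlike the `e != j` test it would still pair two distinct elements that happen to hold equal values. This does not change the O(n²) growth; if the interaction is symmetric, combinations only halves the number of calls.

```python
from itertools import combinations, permutations


def process_it(a, b):
    """Hypothetical stand-in for the question's process_it; returns the pair."""
    return (a, b)


def all_ordered_pairs(items):
    # Every element meets every other element: n * (n - 1) calls.
    return [process_it(a, b) for a, b in permutations(items, 2)]


def all_unordered_pairs(items):
    # Assumes process_it(a, b) also covers (b, a): n * (n - 1) / 2 calls.
    return [process_it(a, b) for a, b in combinations(items, 2)]


S = [1, 2, 3, 4]
ordered = all_ordered_pairs(S)
unordered = all_unordered_pairs(S)
print(len(ordered), len(unordered))
```

Whether the unordered version applies depends on whether the real process_it is symmetric, which the excerpt does not say.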

boost::thread_group - is it ok to call create_thread after join_all?

Hi, I have the following situation: I create a boost::thread_group instance, then create threads for parallel-processing on some data, then join_all on the threads. Initially I created the threads for every X elements of data, like so: // begin = someVector.begin(); // end = someVector.end(); // batchDispatcher = boost::function<void...

Does Parallel::ForkManager() module support synchronization on global variables?

I'm very new to the Parallel::ForkManager module in Perl. It has a lot of credit, so I think it supports what I need and I just haven't figured it out yet. What I need is for each child process to write some updates into a global hash map, according to the key value computed in that child process. However, when I proceed to cla...

Parallel programming on a Quad-Core and a VM?

I'm thinking of slowly picking up parallel programming. I've seen people use clusters with OpenMPI installed to learn this stuff. I do not have access to a cluster but have a quad-core machine. Will I be able to experience any benefit here? Also, if I'm running Linux inside a virtual machine, does it make sense to use OpenMPI inside a ...

Maintaining session in an Eventlet page scraper?

Hello, I'm trying to do some scraping of a site that requires authentication (not HTTP auth). The script I'm using is based on this eventlet example. Basically, urls = ["https://mysecuresite.com/data.aspx?itemid=blah1", "https://mysecuresite.com/data.aspx?itemid=blah2", "https://mysecuresite.com/data.aspx?itemid=blah3"] impo...

How can I improve garbage collector performance of .NET 4.0 in highly concurrent code?

I am using the Task Parallel Library from .NET Framework 4 (specifically Parallel.For and Parallel.ForEach); however, I am getting extremely mediocre speed-ups when parallelizing some tasks which look like they should be easily parallelized on a dual-core machine. In profiling the system, it looks like there is a lot of thread synchroniz...

Instruction-Level-Parallelism Exploration

Hi all, I am just wondering if there are any useful tools out there that allow me to exploit the instruction-level parallelism in some algorithms. More specifically, I have a subset of algorithms from the multimedia domain and I wonder what the best way is to exploit ILP in these algorithms. All these algorithms are implemented in C, so ...

First-Occurrence Parallel String Matching Algorithm

To be up front, this is homework. That being said, it's extremely open ended and we've had almost zero guidance as to how to even begin thinking about this problem (or parallel algorithms in general). I'd like pointers in the right direction and not a full solution. Any reading that could help would be excellent as well. I'm working on ...

When will simple parallelization not offer a speedup?

I have a simple program that breaks a dataset (a CSV file) into 4 chunks, reads each chunk in, does some calculations, and then appends the output together. Think of it as a simple map-reduce operation. Processing a single chunk uses about 1GB of memory. I'm running the program on a quad-core PC, with 4GB of RAM, running Windows XP. ...

Threading vs. Parallel Processing

Microsoft .NET 4.0 introduces new "parallel enhancements" to its framework. I am wondering what the difference is between making an application that uses the standard System.Threading functions and one that uses the new parallel enhancements. ...

The missing "Comparison of Parallel Processing API". How do I choose Multi-threading library?

I'm using the phrases Parallel Processing and Multi-Threading interchangeably because I feel there is no difference between them. If I'm wrong, please correct me. I'm not a pro in parallel processing or multi-threading. I'm familiar with and have used .NET threads and POSIX threads, nothing more than that. I was just browsing through archives of SO ...

Message Passing Arbitrary Object Graphs?

I'm looking to parallelize some code across a Beowulf cluster, such that the CPUs involved don't share address space. I want to parallelize a function call in the outer loop. The function calls do not have any "important" side effects (though they do use a random number generator, allocate memory, etc.). I've looked at libs like MPI...

Bash: Subprocess access variables

I want to write a Bash script which logs into several machines via ssh, first shows their hostname, and then executes a command (the same command on every machine). The hostname and the output of the command should be displayed together. I wanted a parallel version, so the ssh commands should be run in the background and in parallel. I co...

increment a count value outside parallel.foreach scope

How can I increment an integer value outside the scope of a Parallel.ForEach loop? What is the lightest way to synchronize access to objects outside parallel loops? var count = 0; Parallel.ForEach(collection, item => { action(item); // increment count?? } ...
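For illustration only, here is the same shape of problem sketched in Python rather than C#: worker threads process items and bump a shared counter under a lock. This is an analogous pattern, not the .NET API; in .NET itself, Interlocked.Increment is the usual lightweight tool for an atomic integer increment. The names count_processed, worker, and max_workers are hypothetical.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def count_processed(collection, action, max_workers=4):
    """Run action on every item in parallel and count completed items."""
    count = 0
    lock = threading.Lock()

    def worker(item):
        nonlocal count
        action(item)
        # The read-modify-write on count is guarded so no update is lost.
        with lock:
            count += 1

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # list() drains the result iterator so worker exceptions surface here.
        list(pool.map(worker, collection))
    return count


total = count_processed(range(1000), lambda item: None)
print(total)
```

If only the final total matters, having each worker return a value and summing the results afterwards avoids the shared variable entirely.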

Does this program introduce a parallel execution?

Here is a simple server application using Bonjour and written in Java. The main part of the code is given here: public class ServiceAnnouncer implements IServiceAnnouncer, RegisterListener { private DNSSDRegistration serviceRecord; private boolean registered; public boolean isRegistered(){ return registered; } ...

Origin of "embarrassingly parallel" phrase

For the purposes of history on Wikipedia, is anyone familiar with the origin of the phrase "embarrassingly parallel"? I've always thought that it may have been coined by a random Google employee who first worked on map-reduce. Does anyone have any concrete info on the origin? ...

Perl Parallel::ForkManager wait_all_children() takes excessively long time

I have a script that uses Parallel::ForkManager. However, the wait_all_children() call takes an incredibly long time even after all child processes have completed. I know this from printing out some timestamps (see below). Does anyone have any idea what might be causing this (I have 16 CPU cores on my machine)? my $pm = Parallel::For...

Would this method work to scale out SQL queries?

I have a database containing a single huge table. At the moment a query can take anything from 10 to 20 minutes and I need that to go down to 10 seconds. I have spent months trying different products like GridSQL. GridSQL works fine, but is using its own parser which does not have all the needed features. I have also optimized my databas...