
Recommendations for a data processing (MapReduce / DHT?) framework

I have a need to perform distributed searching across a largish set of small files (~10M) with each file being a set of key: value pairs. I have a set of servers with a total of 56 CPU cores available for this - these are mostly dual core and quad core, but also a large DL785 with 16 cores. The system needs to be designed for online qu...

How to do Distributed Calculations in Excel

A number of years ago I implemented an asynchronous peer-to-peer message-oriented middleware that was very friendly to use in Excel VBA, and I find myself again needing to do lots of calculations which could be trivially distributed, if I had the mechanism. I could re-implement the MOM layer, but I'd prefer to use a third party product...

How to partition the 2D arrays among the processes for "The Game of Life"

I am doing an assignment using MPI to implement Game of Life. I was wondering if I should use a block-row partitioning, a cyclic row partitioning or a block-checkerboard partitioning? ...
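As a rough illustration of the block-row option mentioned in this question, here is a minimal sketch in plain Python (no MPI library is used; the loop over blocks stands in for the processes). The function names `block_row_ranges`, `life_step` and `partitioned_step` are hypothetical, and the grid is assumed to have dead, non-wrapping borders. Each block receives one extra "ghost" row from each neighbour, which is the data a real MPI program would exchange before every step.

```python
def block_row_ranges(n_rows, n_procs):
    """Split n_rows into n_procs contiguous blocks whose sizes differ by at most one."""
    base, extra = divmod(n_rows, n_procs)
    ranges = []
    start = 0
    for rank in range(n_procs):
        size = base + (1 if rank < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


def life_step(grid):
    """One serial Game of Life step; cells outside the grid count as dead."""
    n_rows, n_cols = len(grid), len(grid[0])
    new = [[0] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        for c in range(n_cols):
            # Sum the 3x3 neighbourhood, then subtract the cell itself.
            alive = sum(
                grid[rr][cc]
                for rr in range(max(r - 1, 0), min(r + 2, n_rows))
                for cc in range(max(c - 1, 0), min(c + 2, n_cols))
            ) - grid[r][c]
            new[r][c] = 1 if alive == 3 or (alive == 2 and grid[r][c]) else 0
    return new


def partitioned_step(grid, n_procs):
    """Simulate block-row partitioning: each block is stepped with one ghost row per side."""
    n_rows = len(grid)
    result = []
    for start, stop in block_row_ranges(n_rows, n_procs):
        if start == stop:
            continue  # more processes than rows: this block owns nothing
        lo = max(start - 1, 0)        # ghost row above, if any
        hi = min(stop + 1, n_rows)    # ghost row below, if any
        local = life_step(grid[lo:hi])
        # Keep only the rows this block owns; ghost rows are discarded.
        result.extend(local[start - lo:stop - lo])
    return result
```

With block rows, each process only needs to exchange two rows with its neighbours per step, which is why this layout is often the simplest to start with; whether it beats a checkerboard layout depends on grid shape and process count.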

Maintaining network integrity in peer-to-peer network

I am looking for information on techniques, algorithms, etc. on how to maintain network integrity in a dynamic peer-to-peer network. Both practical implementations, academic papers and anything else in that category are welcome. Imagine a network that is solely peer-to-peer based where each node is only connected to x other nodes. Without...

Why do I get ErrorCode <ERRCA0022> when I take down one velocity cache host?

I'm getting the following exception in my web app when I take down one node of a three node cluster which is hosting my users’ sessions. The session cache also has secondaries on with no eviction. Here is the error message and stack: Exception information: Exception type: DataCacheException Exception message: ErrorCode<ERRCA...

Any experience programming for BOINC?

I am attracted by BOINC for a little project of mine. I have heard of BOINC but have not read much about how it works, mostly because I am focusing on other priorities right now. What I would like to know is if any of you have actually tried to program for BOINC and had a program run on the distributed computer network. In particular I am interested i...

Distributed application (WCF/Remoting/web services) vs. Web application

Hello all, I am making a medium sized standard LOB application. Currently it's a web application but I am formulating a proposal to revamp it into a Desktop remote application. By this I mean that the database and the application server will be hosted in a remote location. The client application will communicate with the server via the i...

BOINC: Is there an easy example of how to code a program for it and how to implement it into their client/server system?

Hi, I did a numeric method as my diploma thesis and coded it in Java. It needs a lot of computational time when done adequately. So I looked for an alternative and found BOINC. Unfortunately I didn't have time for doing my method in BOINC, because I'm an aerospace student and not a programmer and I decided to keep my priority on my Java p...

Distributed computing using Java RMI and CORBA

The following is the problem I am facing; I will explain it with an example. If there is an IT department that makes use of Java RMI and another department which makes use of CORBA, and I had to integrate those two departments spending the least amount of time with the least budget, what are the approaches I can take? Could someone help me t...

WCF service registration for distributed applications

Hi all, I am in the process of designing a client server app that will use WCF to communicate with the client. It is possible it could become heavily loaded and I want to design it in a distributed manner. To that end I have split the application into a number of services. N number of services could be running to deal with the requests...

Running an RMI application from Command Prompt

Hello Stack Overflow... I have been developing a (somewhat complex) application using RMI to read a file and JSON its contents. I have been coding this app on Netbeans 6.7, therefore I have a folder structure as follows: Under C:\MyApp: -build --- classes ------- MyApp <--- That's a package. My .class files are in there. Also, I ran...

Best way to create unique identities for distributed data that will be merged?

I have a centrally hosted database (MS SQL Server) and distributed clients save data to it over the Internet. When the Internet connection goes down the client starts storing new data locally into a SQLite instance. When the Internet connection comes back online the accumulated local data is moved to the central db (inserted). What's th...
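One commonly used approach to the problem described in this question is to let every client generate a random UUID as the primary key, so rows created offline cannot clash when they are later merged. The sketch below is only an illustration under that assumption: it uses Python's standard `uuid` and `sqlite3` modules, an in-memory SQLite database stands in for the central MS SQL Server, and the table name `records` and the helper names are hypothetical.

```python
import sqlite3
import uuid


def create_store():
    """Create an in-memory store; the same schema is assumed on client and server."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE records (id TEXT PRIMARY KEY, payload TEXT NOT NULL)"
    )
    return conn


def save(conn, payload):
    """Insert a row under a freshly generated random (version 4) UUID."""
    record_id = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO records (id, payload) VALUES (?, ?)", (record_id, payload)
    )
    return record_id


def merge_into(central, local):
    """Move all locally stored rows to the central store, keeping their ids."""
    rows = local.execute("SELECT id, payload FROM records").fetchall()
    # INSERT OR IGNORE makes a retried merge harmless: rows already present are skipped.
    central.executemany(
        "INSERT OR IGNORE INTO records (id, payload) VALUES (?, ?)", rows
    )
    local.execute("DELETE FROM records")
    return len(rows)
```

The design choice here is that ids never need to be rewritten during the merge, so references between locally created rows stay valid. The trade-off is a wider key than an auto-incrementing integer.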

Efficient MapReduce when dealing with streams of queries to the same dataset

Hi, I have a massive, static dataset and I have a function to apply to it. f is in the form reduce(map(f, dataset)), so I would use the MapReduce skeleton. However, I don't want to scatter the data at each request (and ideally I want to take advantage of indexing in order to speed up f). Is there a MapReduce implementation that addresses th...
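To make the `reduce(map(f, dataset))` shape in this question concrete, here is a minimal single-process sketch in which the data is partitioned once and then reused for every query, so nothing is scattered per request. It does not represent any particular MapReduce framework; the class name `ResidentPartitions` and its methods are hypothetical, and the reduce function is assumed to be associative with `initial` as its identity value.

```python
from functools import reduce


class ResidentPartitions:
    """Holds a static dataset split into contiguous partitions that stay in place."""

    def __init__(self, dataset, n_partitions):
        # Ceiling division gives the chunk size; at least 1 to avoid a zero step.
        size = max(1, -(-len(dataset) // n_partitions))
        self.partitions = [
            dataset[i:i + size] for i in range(0, len(dataset), size)
        ]

    def query(self, map_fn, reduce_fn, initial):
        """Run one query: map and reduce inside each partition, then combine the partials."""
        partials = [
            reduce(reduce_fn, map(map_fn, part), initial)
            for part in self.partitions
        ]
        return reduce(reduce_fn, partials, initial)
```

For example, `ResidentPartitions(list(range(1, 101)), 4).query(lambda x: x * x, lambda a, b: a + b, 0)` sums the squares of 1 to 100 while only the four partial sums cross partition boundaries. In a distributed setting each partition would live on a different worker and only the query and the partial results would travel over the network.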

What the heck is a distributed platform, context, embedded systems?

A bit of an academic question here. I'm reading about embedded systems and there's a lot of talk about distributed platforms. I'm looking for a definition of what a distributed platform is. I have a vague notion of it being when an embedded system controls multiple, disconnected parts, like, in a helicopter, it needs to control the t...

What to do when you've really screwed up the design of a distributed system?

Related question: What is the most efficient way to break up a centralised database? I'm going to try and make this question fairly general so it will benefit others. About 3 years ago, I implemented an integrated CRM and website. Because I wanted to impress the customer, I implemented the cheapest architecture I could think of, wh...

For distributed applications, which to use, ASIO vs. MPI?

I am a bit confused about this. If you're building a distributed application, which in some cases may perform parallel operations (although not necessarily mathematical), should you use ASIO or something like MPI? I take it MPI is a higher level than ASIO, but it's not clear where in the stack one would begin. ...

Requiring a set of files to be made before running a function in a Ruffus pipeline

I'm using ruffus to write a pipeline. I have a function that gets called in parallel many times and it creates several files. I'd like to make a function "combineFiles()" that gets called after all those files have been made. Since they run in parallel on a cluster, they will not all finish together. I wrote a function 'getFilenames(...

What is a data serialization system?

According to the Apache Avro project, "Avro is a serialization system". By saying data serialization system, does it mean that Avro is a product or an API? Also, I am not quite sure what a data serialization system is. For now, my understanding is that it is a protocol that defines how a data object is passed over the network. Can anyone he...

Jini: single server with multiple clients

Hi all, I have a question about how to let multiple clients access a single file located on the server side while keeping the file consistent. I have a simple PhoneBook server-client Jini program running at the moment, and the server only provides some getter functions to clients, such as getName(String number), getNumber(String name) from a Ph...

Median of a billion numbers

If you have one billion numbers and one hundred computers, what is the best way to locate the median of the numbers? One solution which I have is: split the set equally among the 100 computers, sort them, and find the median of each set. Sort the sets by their medians. Merge two sets at a time, starting from the lowest median to the highest. If we h...
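For comparison with the sort-and-merge idea in this question, here is a sketch of a different technique: a binary search over the value range, where each machine only reports how many of its numbers are at most a candidate value. This is not the approach proposed in the question. It assumes the numbers are integers, models each computer as a plain Python list (a "shard"), and all function names are hypothetical.

```python
def count_at_most(shard, value):
    """Work done on one machine: count local numbers that are <= value."""
    return sum(1 for x in shard if x <= value)


def distributed_kth_smallest(shards, k):
    """Find the k-th smallest integer (1-based) across all shards by binary search on value."""
    lo = min(min(s) for s in shards if s)
    hi = max(max(s) for s in shards if s)
    while lo < hi:
        mid = (lo + hi) // 2
        # Only one integer per shard is sent back to the coordinator per round.
        total = sum(count_at_most(s, mid) for s in shards)
        if total >= k:
            hi = mid      # the answer is mid or something smaller
        else:
            lo = mid + 1  # fewer than k numbers are <= mid, so the answer is larger
    return lo


def distributed_lower_median(shards):
    """Lower median: the element at position ceil(n / 2) in sorted order."""
    n = sum(len(s) for s in shards)
    return distributed_kth_smallest(shards, (n + 1) // 2)
```

The number of rounds grows with the logarithm of the value range rather than with the amount of data, and no machine ever ships its numbers elsewhere. Each round does require a full local scan unless the shards are sorted first, in which case the local count becomes a binary search.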