cluster

MapReduce skipping keys?

I'm running a local, single-system test of a simple MapReduce operation using Qizmt. At the end of the 'Map' phase I am calling: output.Add(rKey, rValue); This is called, let's say, a million times, and the keys are 1, 2, 3, 4, 5, 6, etc., each unique (I'm just testing, after all). I've checked that this is happening as intended. It is. The f...

1) Can SQLite use a cluster? 2) Does SQLite have anything like MySQL Workbench? 3) How to import Excel into SQLite?

1) MySQL has a cluster option; I wonder if SQLite has that option too. 2) I really haven't learned to write MySQL, so what I do is use the Workbench. If I turn to SQLite, does it have something similar? 3) How do I import Excel into SQLite? ...

How to cluster evolving data streams

Hi guys, I want to incrementally cluster text documents, reading them as data streams, but there seems to be a problem. Most of the term weighting options are based on the vector space model, using TF-IDF as the weight of a feature. However, in our case the IDF of an existing attribute changes with every new data point and hence previous clusterin...
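To make the problem concrete, here is a minimal sketch of how the IDF of a term drifts as documents arrive one at a time. It assumes the plain log(N / df) form of IDF; the class name `StreamingIdf` and the toy token lists are hypothetical and are not taken from the question.

```python
import math
from collections import Counter


class StreamingIdf:
    """Tracks document frequencies for a stream of tokenized documents."""

    def __init__(self):
        self.n_docs = 0            # documents seen so far
        self.doc_freq = Counter()  # term -> number of documents containing it

    def add(self, tokens):
        """Register one new document, given as an iterable of terms."""
        self.n_docs += 1
        self.doc_freq.update(set(tokens))  # count each term once per document

    def idf(self, term):
        """Plain log(N / df) over documents seen so far; 0.0 for unseen terms."""
        df = self.doc_freq[term]
        return math.log(self.n_docs / df) if df else 0.0


# Hypothetical toy stream: "cluster" appears only in the first document.
stream = [["cluster", "stream"], ["stream", "data"], ["data", "index"]]
model = StreamingIdf()
history = []
for doc in stream:
    model.add(doc)
    history.append(model.idf("cluster"))  # IDF of the same term after each arrival
```

In this sketch `history` grows with every new document even though "cluster" never appears again, so any vector weighted with an earlier IDF value is stale by the time the next document is clustered.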

Need info on load balancers

Hi all, I would like to know the internal workings of a load balancer. Assuming I am using either hardware or a web server for load balancing, does the load balancer process requests in parallel? Scenario: a database cluster with a load balancer. Request 1 and Request 2 are both querying the database for different tables at the same time. Consi...

Records deleted from quartz tables on application shutdown in a cluster

We are using Quartz for our scheduling needs in our application. The application runs in a cluster. On application shutdown, we call org.quartz.Scheduler.shutdown() to shut down the scheduler. What we find is that this deletes all records in the database tables (we use an org.quartz.impl.jdbcjobstore.JobStoreTX as our job store). My initi...

Move application to WebSphere clusters

What should we take care of before moving an application from a single WebSphere Application Server to a WebSphere cluster ...

Multi-computer map-reduce in C#

Is there a simple Map-Reduce library or implementation for .NET that allows a task to start on one computer and be split amongst multiple worker computers, perhaps using WCF or something else a bit more efficient to manage the inter-machine communication? I looked at Microsoft's Dryad but from the docs it seems it is more intended for lo...

Real-life grid/distributed-computing application source code, etc.?

Hi, I started off by making a chess AI engine (simple min-max), and ended up trying to develop something like BOINC, but MUCH lighter, more redistributable and, most importantly, decentralized. I was searching for sample applications to run on it and test it, but all I could find was some "impossible to understand" pi value calculators and...

SqlException timeout on GetConnection, but only from one node in a cluster

We have two nodes in a cluster. Both run an ASP.NET web application that connects to a database on another server. Node1 has no problems, but Node2 throws SqlExceptions, stating there's a timeout. The stack trace shows me it's on DbConnectionPool.GetConnection. I checked the versions of our DLLs, the web.config files, the connection stri...

Accessing dependent files without sharing in Condor

Hi everyone, I have 6 Windows machines on which Condor can run jobs. When I'm running interdependent files (one file calling another) on Condor, I'm supposed to share the calling file (which requires administrative access) with everyone on the machine where I'm running the jobs, and it happens that the submitted file generates output...

Utilizing the power of clusters in the context of databases?

I have a 22-machine cluster with a common NFS mount. On each machine, I am able to start a new MySQL instance. I finished creating a table with about 71 million entries and started an ADD INDEX operation. It's been more than 12 hours and the operation is still going on. So I logged onto one of my other machines in the cluster, start...

Java: how to invalidate a user session when the user logs in twice with the same credentials

Hello all. I found this interesting thread about how to invalidate a user session when the user logs in twice. http://stackoverflow.com/questions/2372311/jsf-how-to-invalidate-an-user-session-when-he-logs-twice-with-the-same-credentia I have a slightly different environment but I need to solve the same problem. The differences are that I don't...

WebSphere 7 clustered deployment

Hi, we have a J2EE application packaged as an EAR file and deployed in WAS 7. To make the application highly available, it needs to be deployed in 3 clusters. We have a Quartz Scheduler class whose job is to upload data from one database to another daily at 2:00 am. Now, the problem is that if the EAR is deployed on 3 different nodes fo...

Python for indexing and searching using a cluster?

After an unfortunate misadventure with MySQL, I finally gave up on using it. What do I have? A large set of files in the following format: ID1: String String String String ID2: String String String String ID3: String String String String ID4: String String String String What did I do? Used MySQL on a powerful machine to import everything ...

Something similar to ParallelPython for C++?

I need to do some extensive searching and string comparisons, and for this I figure that a compiled program is much better than an interpreted one, especially after seeing some comparison studies. I came across ParallelPython, which was beautiful. It has autodiscovery for clusters and can pretty much do all the load balancing for me as wel...

Jobs are getting lost in ParallelPython?

I am submitting about 234 jobs (but my example contains only 50 for demonstration purposes) to my 20-node cluster using ParallelPython. I was expecting that it would queue and execute them, but it seems to "lose" jobs and I do not understand where things are going wrong. When the script finishes, I am not able to see 50 files i.e. info_1, ...

Is there a good way to submit jobs to a cluster using bash?

Is there a good tool out there to do this on a Linux machine using the bash shell? All I need is to issue different commands on a set of nodes in a cluster and, when one of them is done with the job, I'd like to submit another one. Something very similar to what Hadoop can do. I would be interested in knowing the status of the job as well...

Can we add MySQL Indexes without the server and client?

I was given access to a cluster today along with a front-end. The person who gave me access tells me I cannot start anything on the front-end and that I should submit everything as a job. Now I have no idea what that means but I'm thinking that I am not supposed to start MySQL on the front-end. If that is the case, how can I even use the...

HELP: MySQL Cluster is much slower than NDB

I have a denormalized table product with about 6 million rows (~ 2GB), mainly for lookups. Fields include price, color, unitprice, weight, ... I have BTREE indexes on color etc. Query conditions are dynamically generated from the Web, such as select count(*) from product where color=1 and price > 5 and price < 100 and weight > 30 ... et...

How to automatically run a bash script when my qsub jobs are finished on a server?

I would like to run a script when all of the jobs that I have sent to a server are done. For example, I send ssh server "for i in config*; do qsub ./run 1 $i; done" and I get back a list of the jobs that were started. I would like to automatically start another script on the server to process the output from these jobs once all are c...