clustering

How to implement this Queue in a Jboss cluster?

Hello everybody, my application works as middleware: it receives requests from clients, transforms them according to certain logic, and sends the transformed requests to another service provider as normal HTTP requests or web service SOAP requests. The application is deployed on two JBoss servers (in a cluster) behind a load balancer. Let's...

Problems with WebLogic 9.2 load balancing and clustering using proxy plugin

Hi all, I have a cluster in WebLogic 9.2 with 2 nodes (172.20.1.68:7101, 172.20.1.23:7102), 1 admin server (172.20.1.23:7001) and 1 balancer (Apache proxy plugin) at 172.20.1.49:7103. What I see in the balancer's access.log is that every request is marked as 404 Not Found. But in the node's log I can see those very same requests distributed...

Grouping an ordered dataset into minimal number of clusters

I have an ordered list of weighted items, each with a weight less than or equal to N. I need to convert it into a list of clusters. Each cluster should span several consecutive items, and the total weight of a cluster has to be less than or equal to N. Is there an algorithm which does it while minimizing the total number of clusters and keeping th...
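
A minimal sketch of one possible approach, assuming non-negative weights: a greedy pass that keeps extending the current cluster until the next item would push its total above N. For consecutive grouping with non-negative weights this yields the smallest possible number of clusters. It does not address the second criterion, which is cut off in the excerpt above. The function name `group_consecutive` and its parameters are hypothetical.

```python
from typing import List, Sequence


def group_consecutive(weights: Sequence[float], limit: float) -> List[List[int]]:
    """Greedily pack consecutive items into clusters with total weight <= limit.

    Returns a list of clusters, each a list of item indices.
    Assumes every weight is non-negative and no single weight exceeds limit.
    """
    clusters: List[List[int]] = []
    current: List[int] = []
    total = 0.0
    for index, weight in enumerate(weights):
        if weight > limit:
            raise ValueError(f"item {index} has weight {weight} > limit {limit}")
        # Close the current cluster if adding this item would exceed the limit.
        if current and total + weight > limit:
            clusters.append(current)
            current, total = [], 0.0
        current.append(index)
        total += weight
    if current:
        clusters.append(current)
    return clusters


if __name__ == "__main__":
    print(group_consecutive([3, 4, 2, 5, 1], limit=7))
```

The pass runs in linear time. Balancing cluster weights or any other secondary objective would need a different method, for example dynamic programming over the split points.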

EJABBERD: Connection attempt from disallowed node

Having an issue with an ejabberd cluster. While trying to ping the first node from the second, I'm getting: "Connection attempt from disallowed node". I know it's not a cookie issue, because the cookies on both servers match. ...

Servlets should not start threads due to issues that may arise when clustering... what issues?

I know that the reason we should not start threads in a servlet is that threads should be managed by the container. If the container is told to shut down while threads it does not know about are still hanging around, it won't shut down. I take care of this by making it a daemon thread... But other than the above "unable to shutdown" situation what o...

KMeans clustering for more than 5 million vectors

I have hit a real problem. I need to do some KMeans clustering for 5 million vectors, each containing about 32 columns. I tried out Mahout, which requires Linux, and I am on Windows; I am restricted from using a Linux OS or any sort of simulator. Can anyone suggest a KMeans clustering algorithm that is scalable up to 5M vectors and can con...

Scaling a Ruby on Rails site

I am developing a Ruby on Rails application and would like to deploy it in a production environment. I have multiple identically configured Ubuntu web servers I can use, but I don't know how to scale the RoR app and db data across multiple hosts. I'd like to put both a web server and a db server on each host. On the web server/ruby middle...

JBoss 4.2.2 nodes start to cluster then suspect each other

Hi, I have a website running with JBoss 4.2.2 on an existing Red Hat server. I'm setting up a second server so as to have a clustered pair (which will then be load-balanced). However, I can't get them to cluster successfully. The existing server starts up JBoss with: run.sh -c default -b 0.0.0.0 (I know the 'default' configuration d...

Calculating similarity between and centroid of Lucene documents

In order to perform a simple clustering algorithm on results that I get from Lucene, I have to calculate Cosine similarity between 2 documents in Lucene, I also need to be able to make a centroid document to represent the centroid of each cluster. All I can think of doing is building my own Vector Space model with tf-idf weighting, usi...
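
A minimal sketch of the two calculations mentioned above, assuming each document has already been turned into a sparse tf-idf vector stored as a term-to-weight dictionary. Extracting those weights from a Lucene index is not shown, and the names `cosine_similarity` and `centroid` are hypothetical helpers, not part of the Lucene API.

```python
import math
from typing import Dict, Iterable

Vector = Dict[str, float]  # term -> tf-idf weight (assumed precomputed)


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two sparse term vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


def centroid(vectors: Iterable[Vector]) -> Vector:
    """Term-wise average of the vectors; missing terms count as zero."""
    vectors = list(vectors)
    if not vectors:
        return {}
    totals: Vector = {}
    for vector in vectors:
        for term, weight in vector.items():
            totals[term] = totals.get(term, 0.0) + weight
    return {term: weight / len(vectors) for term, weight in totals.items()}


if __name__ == "__main__":
    d1 = {"cluster": 0.5, "lucene": 1.2}
    d2 = {"cluster": 0.7, "index": 0.9}
    print(cosine_similarity(d1, d2))
    print(centroid([d1, d2]))
```

The centroid is itself a term vector, so it can be compared against documents with the same `cosine_similarity` function.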

Loadbalancer Caching problems

Hi, I am using Apache Webserver 2.x with mod_proxy and mod_proxy_balancer for load-balancing two JBoss servers. Everything is working fine except one weird problem. The application is using Flex as UI technology and it consists of various modules (separate SWF files for each module). When I switch between modules, some of the modules a...

Need info on database cluster

My understanding of database clusters is limited because I have not worked with them. I have the following question. A database cluster has two instances, db server 1 and server 2. Each instance will have a copy of the databases; say the database has a Table A. Normally a query request will be handled by only one of the servers, which is randomly de...

JBoss 5.1 message replication

I have a multi-client application with some unique network requirements. We have several server instances running, and each server has several clients. Server A: Client 1, Client 2. Server B: Client 3, Client 4. I want to put a JBoss 5.1 server on server A and server B with JBoss Messaging configured. Both servers will have the s...

SQL Server Clustering limit?

Can you do SQL Server clustering with more than 2 servers? Can a master have more than 1 slave? What is the limit? ...

3D clustering Algorithm

Problem Statement: I have the following problem: There are more than a billion points in 3D space. The goal is to find the top N points which have the largest number of neighbors within a given distance R. Another condition is that the distance between any two of those top N points must be greater than R. The distribution of those point...

Techniques for probabilistic clustering of similar looking text data?

I have 20,000 company addresses on various documents, which are all formatted differently. For example: Company A 12345 street US CompanyA, Inc box2, 12345 street WA, US The Company B company Ltd 123 happy street UK company B, Ltd 123, happy street, london, S1 1AA I'd like to be able to combine the records for each company (i.e. sepe...

Module clustering and JMS

Hi, I have a module which runs standalone in a JVM (no containers) and communicates with other modules via JMS. My module is both a producer on one queue and a consumer on a different queue. I then need to cluster this module, both for HA reasons and for workload reasons, and I'm probably going to go with Terracotta+Hibernate for cl...

How to cluster time series data using K-means algorithm?

Hi, I am wondering how I can cluster time series data. I understand how it works if each data item is a point, but I do not know how to cluster if each item is a time series of size 1 x M, where M is the data length. I am especially unsure how to compute the new mean of a cluster for time series data. My X matrix will be N x M where N is number of time ser...
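
One common way to read this setup, offered as a sketch rather than a definitive answer: treat each length-M series as a point in M-dimensional space, use Euclidean distance, and take the new cluster mean as the element-wise average of its member series. This assumes all series have the same length and are aligned in time. The function names below are hypothetical.

```python
from typing import List, Sequence

Series = Sequence[float]


def assign_labels(series: Sequence[Series], centroids: Sequence[Series]) -> List[int]:
    """Assign each series to the nearest centroid by squared Euclidean distance."""
    labels = []
    for s in series:
        distances = [sum((x - c) ** 2 for x, c in zip(s, cen)) for cen in centroids]
        labels.append(distances.index(min(distances)))
    return labels


def update_centroids(
    series: Sequence[Series], labels: Sequence[int], centroids: Sequence[Series]
) -> List[List[float]]:
    """Recompute each centroid as the element-wise mean of its member series."""
    updated = []
    for k, old in enumerate(centroids):
        members = [s for s, label in zip(series, labels) if label == k]
        if not members:
            updated.append(list(old))  # keep an empty cluster's centroid unchanged
        else:
            # zip(*members) walks the series time step by time step.
            updated.append([sum(column) / len(members) for column in zip(*members)])
    return updated


if __name__ == "__main__":
    X = [[0, 0, 0], [2, 2, 2], [10, 10, 10]]  # N = 3 series, M = 3 samples each
    centers = [[0, 0, 0], [10, 10, 10]]
    labels = assign_labels(X, centers)
    print(labels, update_centroids(X, labels, centers))
```

Repeating the two steps until the labels stop changing gives the usual K-means loop. Series that are shifted or stretched in time may need a different distance measure than plain Euclidean distance.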

Newman's modularity clustering for graphs

Hello, I am interested in running Newman's modularity clustering algorithm on a large graph. If you can point me to a library (or R package, etc) that implements it I would be most grateful. best ~lara ...

How to export a clustering result from Weka

Hi, I'm new to Weka and am using it to analyse user attributes based on user ID. The raw data may look like this, [userid->game coin] 10001-> 100 10002-> 501 ... I am trying to do K-means clustering on [game coin] and sort the data into some groups. Is it possible to save the sorted [userid] results, just as some non-overlapped c...

Great articles/videos/... on non-ACID (distributed) systems? ("Eventually Consistent" etc.)

I'll start with these - IMO brilliant - articles: Base: An Acid Alternative - by Dan Pritchett (eBay), 2008 Eventually Consistent (- Revisited) - by Werner Vogels (Amazon), 2008 Brewer's conjecture and the feasibility of consistent, available, partition-tolerant web services (non-free) - by Seth Gilbert, Nancy Lynch (MIT), 2002 I'm i...