clustering

Looking for collective intelligence .Net / C# resources

Firstly, I realise that this is a very similar question to this one: http://stackoverflow.com/questions/345982/which-are-the-good-open-source-libraries-for-collective-intelligence-in-net-java ... but all the answers to that one were Java-centric, so I am asking again, this time looking more for .Net (ideally C#) ideas. A little backgroun...

ORA-01654: unable to extend index

Calling all Oracle gurus! I am in the process of clustering a well-tested application on WebSphere. The application in question made it about halfway through processing 1k JMS messages from a queue before this happened. ---- Begin backtrace for Nested Throwables java.sql.SQLException: ORA-01654: unable to extend index DABUAT.INDEX1...

Hierarchical Image Clustering using Boost Graph Library

I am trying to do image segmentation by bottom-up hierarchical clustering, in order to obtain R regions. Starting with the assumption that each pixel is a region, at each iteration the two most similar spatially adjacent regions must be merged together. The image can be of order 500 x 500, leading to a large number of initial regions. I try...

News clustering

How do Google News and Techmeme cluster news items that are similar? Is there a well-known algorithm that is used to achieve this? I appreciate your help. Thanks in advance. ...

File download servlet behaving differently with IE on clustered server

I have a servlet that sends a file by setting the HTTP Content-Type to "application/zip", the Content-Disposition to "attachment" and writing it to the response's OutputStream; it behaves correctly when deployed on my local application server, making the browser show the popup to choose whether or not to download the file. However, when ...

Server Side Google Markers Clustering - Python/Django

After experimenting with a client-side approach to clustering large numbers of Google markers, I decided that it won't be possible for my project (a social network with 28,000+ users). Are there any examples of clustering the coordinates on the server side, preferably in Python/Django? The way I would like this to work is to gradually inde...

C/C++ Machine Learning Libraries for Clustering

What are some C/C++ machine learning libraries that support clustering of multidimensional data (for example, K-Means)? So far I have come across SGI MLC++ (http://www.sgi.com/tech/mlc/) and OpenCV MLL. I am tempted to roll my own, but I am sure pre-existing ones are far better optimized for performance, with more eyes on the code. ...

Scaling with a cluster- best strategy

I am thinking about the best strategy to scale with a cluster of servers. I know there are no hard-and-fast rules, but I am curious what people think about these scenarios: a cluster of combination app/db servers that are round-robin (with failover) balanced using dnsmadeeasy. The DBs are synced using replication. Has the advantage tha...

In a Tomcat cluster, how to share beans in an application?

This might sound like a dumb or a simple question, but I really have little to no experience with clustering of any kind and I'm just curious if and how a certain scenario is possible. Let's say I've set up a cluster of N Tomcat instances, and I've deployed my application App1 across all N instances. What would I need to do to be able ...

Best clustering algorithm? (simply explained)

Hello! Imagine the following problem: You have a database containing about 20,000 texts in a table called "articles" You want to connect the related ones using a clustering algorithm in order to display related articles together The algorithm should do flat clustering (not hierarchical) The related articles should be inserted into the...

JDBC connection to Oracle Clustered

Hello, I would like to connect to a clustered Oracle database described by this TNS: MYDB= (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = host1)(PORT = 41521)) (ADDRESS = (PROTOCOL = TCP)(HOST = host2)(PORT = 41521)) (LOAD_BALANCE = yes) (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME= PDSALPO) ...

How to have markers on a map cluster instead of stack

Hello, my team is trying to resolve an issue with limited time. We have developed a fairly complex map interface on our site to display content (trips, social content, etc). When a user runs a search for "Kayaking San Francisco", for example, the map shows all the kayaking trips in SF, but since they are all geotagged (using geonames.or...

SQL Server 2005 Failover Cluster Using One Server?

Hi, I am developing an application that is hosted on a SQL Server 2005 failover cluster. The application (developed using C#, .Net 2.0) makes use of a number of the clustered resources (printers, file shares, etc). I would like to set up a testing environment that replicates the cluster. However, the current test environment has only...

Tiles and Map Points Clustering

Are there any examples with code that illustrate clustering of map points using tiles? ...

"Beginner" distributed processing project.

For the longest time I've been interested in building a cluster of heterogeneous nodes in an attempt to have a home supercomputer, since I am very interested in doing AI research. However, the issue is even though I have a myriad of hardware, (2x dual quad rack mount servers, 8 285GTX GPUs, 6x PS3s, 2x hacked 360s (they can run Linux) a...

How do I visualize a large document set?

I have 100 GB of documents. I would like to characterize the set and get a general sense of what topics are prevalent. The documents are plain text. I have considered using a tool like Google Desktop to search, but the set is too large to really guess what to search for, and it is too time-consuming to perform enough searches to cover the entire se...

Google Maps Tiles - how to divide 30k coordinates into tiles

Is it possible to effectively divide 30k or more (and growing) coordinates into tiles on Google Maps? My goal is to index all coordinates: assign each to a tile, and then define a tile size for each zoom level. On the client side, a marker manager will do the clustering for each tile (which ideally would have fewer than 200 points). New coordi...
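
The indexing idea in this question can be sketched roughly as follows. This is a minimal Python sketch, assuming the standard Web Mercator tiling scheme (2^zoom by 2^zoom tiles per zoom level); the function names `latlng_to_tile` and `index_by_tile` are hypothetical and are not part of any Google Maps API.

```python
import math
from collections import defaultdict

# Web Mercator is only defined up to roughly this latitude.
MAX_LAT = 85.05112878


def latlng_to_tile(lat, lng, zoom):
    """Return the (x, y) tile index that contains a coordinate at a zoom level."""
    n = 2 ** zoom  # number of tiles along each axis at this zoom
    lat = max(min(lat, MAX_LAT), -MAX_LAT)
    x = int((lng + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    # Clamp so that lng == 180 or the extreme latitudes stay inside the grid.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def index_by_tile(points, zoom):
    """Group (lat, lng) pairs by the tile they fall into (hypothetical helper)."""
    tiles = defaultdict(list)
    for lat, lng in points:
        tiles[latlng_to_tile(lat, lng, zoom)].append((lat, lng))
    return tiles


if __name__ == "__main__":
    points = [(37.7749, -122.4194), (37.8044, -122.2712), (51.5074, -0.1278)]
    for tile, members in sorted(index_by_tile(points, 10).items()):
        print(tile, len(members))
```

Storing the tile index next to each coordinate for every zoom level of interest would let the server return only the points for the tiles in view; whether that keeps each tile under about 200 points depends on how the data is distributed.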

Load-balanced MySQL cluster without load balancer

Hi. I'm looking to create a load-balanced MySQL cluster, but without the actual load-balancer, in order not to add another point of failure or complexity. What I was thinking was to have the following: 1) Have a master-master setup for MySQL 2) On every client, place a simple round-robin proxy which would rotate the requests between ...
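
The client-side round-robin idea in this question could look roughly like the sketch below. It is only an illustration under stated assumptions: the class name `RoundRobinPool`, the host names, and the `operation` callback are hypothetical, and the sketch leaves out real MySQL connection handling and any replication-consistency concerns of a master-master setup.

```python
import itertools


class RoundRobinPool:
    """Rotate requests across hosts, skipping hosts that fail to connect."""

    def __init__(self, hosts):
        if not hosts:
            raise ValueError("at least one host is required")
        self._hosts = list(hosts)
        self._cycle = itertools.cycle(self._hosts)

    def run(self, operation):
        """Call operation(host) on the next host; on ConnectionError try the next one."""
        last_error = None
        # Try each host at most once per request.
        for _ in range(len(self._hosts)):
            host = next(self._cycle)
            try:
                return operation(host)
            except ConnectionError as exc:
                last_error = exc
        raise ConnectionError("all hosts failed") from last_error


if __name__ == "__main__":
    pool = RoundRobinPool(["db1.example", "db2.example"])  # hypothetical hosts
    for _ in range(4):
        print(pool.run(lambda host: "query sent to " + host))
```

In real use, `operation` would open a connection to the given host and execute the query, and a failed host would probably be remembered for a while rather than retried on every rotation.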

Compare lots of texts (clustering) with a matrix

Hello! I have the following PHP function to calculate the relation between two texts: function check($terms_in_article1, $terms_in_article2) { $length1 = count($terms_in_article1); // number of words $length2 = count($terms_in_article2); // number of words $all_terms = array_merge($terms_in_article1, $terms_in_article2); ...

Apache httpd cluster logging

I have a cluster of Apache httpd servers. It's a load-balanced cluster where all nodes serve the same shared content. The content itself is located on shared storage. I would like to set up all nodes to log (server access logs) to the same log file (again on the same shared storage), but I am concerned that this would create concurre...