Many clustering algorithms are available. A popular one is K-means, which, given a number of clusters, iterates to find the best clusters for the objects.
What method do you use to determine the number of clusters in the data in k-means clustering?
Does any package available in R contain the V-fold cros...
Hi,
I'm looking for a way to data mine the event logs of a remote computer in C#.
The problem I have is that I'm working with Amazon Web Services, and in production we use the auto-scaler to bring up/shut down live virtual machine instances as necessary. However, the web services we have running on these instances all log to their local eve...
I need to capture product data from a site on a regular basis and wondered if anyone knows of a good software program? I've trialed Mozenda, but it's a monthly subscription and pricey in the long term. Obviously something that's free would be best, but I don't mind paying either. Just need a decent program that's reliable and doesn't require...
I deal a lot with RFID cards. There are as many different outputs and encodings for the same type of card as there are different readers.
I frequently get requests to figure out (if possible) how to translate one output to another, which means I have to stare at these numbers and work out what the transformations are.
Most common transf...
I am building a website that will allow you to find restaurants up to a certain distance from your house or office. I need to collect a database of all the restaurants.
The criteria are based on the details below:
1: Maximum distance you can walk/drive from a location
2: Cuisines of your choice.
I need the restaurant's name, phone number, addres...
This is not directly a programming-related question, but it's about selecting the right data mining algorithm.
I want to infer the age of people from their first names, from the region they live in, and from whether they have an internet product or not. The idea behind it is that:
there are names that are old-fashioned or popular in a particular ...
Hey all!
My web app needs to access an arbitrary E-Commerce store and determine whether or not it has a product data feed (i.e. a Google Base feed; an RSS/ATOM feed of all products in the store). Also, I need to extract the location of this feed.
The best solution I can think of so far is to maintain a comprehensive list of known loca...
I'm researching medical data sets that include variables concerning illnesses and treatment types.
For example, the illness is colon cancer, its decision variables are (x, y, z, t), and the treatment type is chemotherapy, radiotherapy, etc.
I want to obtain such a data set for my KDD and exploratory lesson, because I want to make useful p...
There are two parameters when using RBF kernels with Support Vector Machines: C and γ. It is not known beforehand which C and γ are best for a given problem; consequently, some kind of model selection (parameter search) must be done. The goal is to identify a good (C, γ) so that the classifier can accurately predict unknown data (i.e., testin...
Hi
I would like to find some tutorials about trading algorithms like Iceberg, Dagger, Guerrilla, etc.
So far I have only found non-free or marketing sites on this topic.
...
I am currently building a kind of reporting system. The figures, tables, and graphs are all based on the results of queries. I find that complex queries are not easy to maintain, especially when there is a lot of filtering. This makes the queries very long and hard to understand. Also, sometimes queries with similar filters are...
Hello,
I need some help in solving this problem.
We have a large number of documents from a specific domain. These documents come from different sources, so their structure can be very different too. On the other hand, I have a table with some specified fields where some figures have to be filled in from the extract of the do...
Hi;
I have n documents and want to find common words that are included in these documents.
For example, I want to be able to say that (n-3) documents include the word "web".
Certainly I can do this with basic data structures, but there may be a more efficient algorithm, or a way to handle the same word with different suffixes.
Is there any algorithm for such purposes?...
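A minimal Python sketch of one way to approach this, assuming the documents are plain strings already loaded in memory. It counts, for each word, how many documents contain it (the document frequency). The `crude_stem` helper is a hypothetical stand-in for a real stemmer such as Porter's and only strips a few English suffixes; the function names are made up for illustration.

```python
import re
from collections import Counter

def crude_stem(word):
    """Very rough suffix stripping; an assumed stand-in for a real stemmer."""
    for suffix in ("ing", "es", "s"):
        # Only strip if at least three characters remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def document_frequency(documents):
    """Count, for each stemmed word, how many documents contain it."""
    counts = Counter()
    for text in documents:
        words = re.findall(r"[a-z]+", text.lower())
        # Using a set means each word is counted once per document.
        counts.update({crude_stem(w) for w in words})
    return counts

def common_words(documents, min_docs):
    """Return the stemmed words that occur in at least min_docs documents."""
    df = document_frequency(documents)
    return {word for word, n in df.items() if n >= min_docs}
```

With n documents, `common_words(documents, n - 3)` would give the words that appear in at least (n-3) of them. For large collections an inverted index in a search engine would probably scale better than an in-memory counter.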
Hi folks,
I've built a content aggregator and would like to add a tag cloud representing the current trends.
Unfortunately this is quite complex, as I have to look for keywords that represent the context of each article.
For example, words such as I, was, the, amazing, and nice have no relation to the context.
Help would be much appreciated...
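One simple starting point, sketched in Python under the assumption that the articles are available as plain strings: drop words from a stop list and count what remains. `STOP_WORDS` here is a hypothetical, deliberately tiny list built from the examples in the question; a usable one would be far longer, and `tag_counts` is a made-up name.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop list; a real one would be much longer.
STOP_WORDS = {"i", "was", "the", "amazing", "nice",
              "a", "an", "and", "is", "of", "to", "at"}

def tag_counts(articles, top_n=10):
    """Return the top_n most frequent non-stop words across all articles."""
    counts = Counter()
    for text in articles:
        for word in re.findall(r"[a-z']+", text.lower()):
            # Skip stop words and very short tokens.
            if word not in STOP_WORDS and len(word) > 2:
                counts[word] += 1
    return counts.most_common(top_n)
```

The resulting counts could drive the font sizes in the tag cloud. A stop list will not catch every context-free word, so weighting terms by how rare they are across articles (as TF-IDF does) might be a useful next step.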
Hello
What I want to do is apply the association method of data mining to my SQL Server 2000 database. An association rule is something like "finding the most frequent items that appear together in a database."
For those who don't know, or who want a reminder of what the association method is like, take a look at this presentation about Associa...
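To make the idea of "items that appear together" concrete, here is a small Python sketch, assuming the rows have already been read out of the database into one list of items per transaction. It is a brute-force pair count, not a full Apriori implementation, and `frequent_pairs` and `min_support` are names chosen for illustration.

```python
from collections import Counter
from itertools import combinations

def frequent_pairs(transactions, min_support):
    """Count item pairs that co-occur in at least min_support transactions."""
    pair_counts = Counter()
    for items in transactions:
        # Sort and de-duplicate so each pair has one canonical form.
        for pair in combinations(sorted(set(items)), 2):
            pair_counts[pair] += 1
    return {pair: n for pair, n in pair_counts.items() if n >= min_support}
```

For example, with the baskets `["bread", "milk"]`, `["bread", "butter", "milk"]`, `["butter", "milk"]` and `["bread", "milk", "eggs"]`, a `min_support` of 3 keeps only the pair `("bread", "milk")`.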
Is there a way for an Android user to browse the SQLite databases on his/her phone and view the data in the databases?
I use the SoftTrace beta program a lot. It's great but has no way that I can find to download the data it tracks to a PC.
Thanks
...
Hello,
I'm just beginning with text mining.
I have two database tables with thousands of records:
a table for "skills" and a table for "skills categories".
Every "skill" belongs to a skills category.
A "skill" is, physically, a varchar(200) field in the database containing some text describing the skill.
Here are some skills extracted from ...
Hi, I'd just like to know whether there is any open source data mining software written in Java that is less than approximately 3k lines of code?
If yes, please give a download link.
I need to do software testing.
Thank you.
...
I am an occasional Python programmer who has so far only worked with MySQL or SQLite databases. I am the computer person for everything in a small company, and I have started a new project where I think it is about time to try new databases.
The sales department makes a CSV dump every week and I need to make a small scripting applicati...
Hi all
I'm interested in the problem of pattern mining among players of social networking games. For example, detecting cheaters in a game, given a company's user database. So far I have been following the usual recipe for a data mining project:
construct a data warehouse that aggregates significant information
select a classifier...