It seems like there are so many subfields linked to Machine Learning. Is there a book or a blog that gives an overview of those different fields and what each of them does, maybe how to get started, and what background knowledge is required?
...
I generally try to use unit tests for any code that has easily defined correct behavior given some reasonably small, well-defined set of inputs. This works quite well for catching bugs, and I do it all the time in my personal library of generic functions.
However, a lot of the code I write is data mining code that basically looks for...
Throughout my academic years in computer science I fell in love with many aspects of artificial intelligence, from expert systems and neural networks to data mining (classification). I wonder, if I were to turn this academic passion into a profession, what kind of AI-related jobs are out there?
...
Ok, so here is a problem analogous to my problem (I'll elaborate on the real problem below, but I think this analogy will be easier to understand).
I have a strange two-sided coin that only comes up heads (randomly) 1 in every 1,001 tosses (the remainder being tails). In other words, for every 1,000 tails I see, there will be 1 heads.
...
I have quite a long data mining script, and in parts of it I echo some information to the page (during a foreach loop, actually.)
However, I am noticing that the information is being sent to the browser not immediately as I had hoped, but in 'segments'.
Is there some function I can use after my echo to send all the data to the browser im...
I am trying to use the random forests package for classification in R.
The Variable Importance Measures listed are:
-mean raw importance score of variable x for class 0
-mean raw importance score of variable x for class 1
-MeanDecreaseAccuracy
-MeanDecreaseGini
Now I know what these "mean" as in I know their definitions. What I wa...
I would like to know what the best open-source library for crawling and analyzing websites is. One example would be a crawler for property agencies, where I would like to grab information from a number of sites and aggregate it into my own site. For this I need to crawl the sites and extract the property ads.
...
The situation is that we have branches in every city, selling food.
I feed the time series algorithm with the actual date as the key time and the total sales of that day as the input, and predict. The predictions are not bad. But
I would like to know if I can improve the predictions by, for example, also feeding in the number of branches (a new bran...
Hi,
Although there are some threads on here about .NET charting controls, I'm starting a new thread because I'm possibly looking for some advanced data manipulation (maybe this would fall under data mining, but I'm not sure) along with charting.
I've been asked to research and prototype a Key Performance Indicators (KPI) system.
Basically...
Hi all
I am using a cURL script to go to a link and get its content for further manipulation. The following is the link and the cURL script:
<?php
$url = 'http://criminaljustice.state.ny.us/cgi/internet/nsor/fortecgi?serviceName=WebNSOR&amp;templateName=detail.htm&amp;requestingHandler=WebNSORDetailHandler&amp;ID=368343543';
//cur...
Does anyone know a good algorithm for clustering on both discrete and continuous attributes? I am working on a problem of identifying groups of similar customers, and each customer has both discrete and continuous attributes (think type of customer, amount of revenue generated by the customer, geographic location, etc.).
Tr...
I have user login data with timestamps and what I would like to do is get a histogram of logins by year, but with the year starting at an arbitrary date. For example, I want the following sort of information:
1 May 2005 - 30 Apr 2006 | 525
1 May 2006 - 30 Apr 2007 | 673
1 May 2007 - 30 Apr 2008 | 892
1 May 2008 - 30 Apr 2009 | 1047
Th...
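One way to bucket logins into years that start on an arbitrary date can be sketched as follows. This is only a minimal illustration in Python, not the asker's setup: the function names `custom_year_start` and `logins_by_custom_year` and the sample timestamps are hypothetical, and the year is assumed to start on 1 May as in the table above.

```python
from collections import Counter
from datetime import date, datetime, timedelta


def custom_year_start(ts, start_month=5, start_day=1):
    """Return the first day of the custom year that contains ts.

    Assumes (start_month, start_day) is valid in every year,
    so 29 February is not supported as a start date.
    """
    d = ts.date() if isinstance(ts, datetime) else ts
    # On or after the boundary: the custom year began this calendar year.
    # Before the boundary: it began in the previous calendar year.
    if (d.month, d.day) >= (start_month, start_day):
        start_year = d.year
    else:
        start_year = d.year - 1
    return date(start_year, start_month, start_day)


def logins_by_custom_year(timestamps, start_month=5, start_day=1):
    """Count logins per custom year, keyed by the year's start date."""
    counts = Counter(
        custom_year_start(ts, start_month, start_day) for ts in timestamps
    )
    return dict(sorted(counts.items()))


# Hypothetical sample data, for illustration only.
logins = [
    datetime(2005, 5, 1, 9, 0),
    datetime(2006, 4, 30, 23, 59),
    datetime(2006, 5, 1, 0, 0),
]

for start, n in logins_by_custom_year(logins).items():
    end = date(start.year + 1, start.month, start.day) - timedelta(days=1)
    print(f"{start.isoformat()} - {end.isoformat()} | {n}")
```

Keying each bucket by its start date keeps the grouping independent of the calendar year; the same comparison on (month, day) could be expressed in a SQL `GROUP BY` if the data lives in a database.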
Hi
I'm due to take up a data mining project. Before I jump in, I wanted to look around for different data mining tools (preferably open source) which allow web-based reporting. In my scenario all the data would be provided to me, so I'm not supposed to crawl for it.
In a nutshell, I am looking for a tool which does - D...
Hi
I'm looking for data mining tools for a project, and in line with that I have put up another post on SO. I'm currently looking at different tools and am wondering whether any industry benchmark exists to assess different data mining tools, so that I can refer to it to do a better evaluation of the tools.
Please let me know if any such benchmark ...
Hi
I'm a little confused on how to connect BI with data mining. Can BI be termed as some kind of a manifestation of data mining?
How different is a BI tool like Microsoft Analysis Services from a data mining tool like Weka?
I guess BI involves more of reporting and analysis of data, wherein the data undergoes some kind of aggregatio...
I have 100 GB of documents. I would like to characterize them and get a general sense of what topics are prevalent.
The documents are plain text.
I have considered using a tool like Google Desktop to search, but the collection is too large to really guess what to search for, and it would be too time-consuming to perform enough searches to cover the entire se...
I'm working on a data distribution application which receives data from a source and distributes that data to multiple target applications. After successfully distributing several messages each second for 8 days, it missed a single message and did not deliver it properly to the clients.
As I was looking at the logs I tried to find someth...
Hello,
I'm facing a parameter selection problem, which I would like to solve using Genetic Algorithm (GA). I'm supposed to select not more than 4 parameters out of 3000 possible ones. Using the binary chromosome representation seems like a natural choice. The evaluation function punishes too many "selected" attributes and if the number o...
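The representation described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the asker's actual evaluation function: `score_subset` is a hypothetical callable that rates a list of selected parameter indices, the linear `penalty` per excess parameter is an assumed form of the punishment, and `toy_score` exists only for the demonstration.

```python
import random

N_PARAMS = 3000    # size of the parameter pool, from the question
MAX_SELECTED = 4   # at most this many parameters may be selected


def random_chromosome(n_params=N_PARAMS, max_selected=MAX_SELECTED, rng=random):
    """Build a binary chromosome with between 1 and max_selected bits set."""
    k = rng.randint(1, max_selected)
    chosen = set(rng.sample(range(n_params), k))
    return [1 if i in chosen else 0 for i in range(n_params)]


def penalised_fitness(chromosome, score_subset,
                      max_selected=MAX_SELECTED, penalty=1.0):
    """Score the selected subset, minus a penalty per parameter over the limit.

    score_subset is a hypothetical, problem-specific callable.
    """
    selected = [i for i, bit in enumerate(chromosome) if bit]
    excess = max(0, len(selected) - max_selected)
    return score_subset(selected) - penalty * excess


def toy_score(selected):
    """Toy objective (assumption): reward picking parameters 0, 1 and 2."""
    return float(len(set(selected) & {0, 1, 2}))


# A chromosome that selects six parameters, two more than allowed.
too_many = [0] * N_PARAMS
for i in range(6):
    too_many[i] = 1

print(penalised_fitness(too_many, toy_score))
print(sum(random_chromosome()))
```

Starting the population from chromosomes that already respect the limit, as `random_chromosome` does, avoids spending early generations on individuals with roughly half of the 3000 bits set.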
I have to study data mining using SQL Server. As far as I know, Business Intelligence in SQL Server supports data mining, but I'm not entirely sure.
Does BI really support data mining?
How can I get started with data mining in SQL Server? I mean resources such as books, blogs, etc.
Thank you all.
...
I have to study data mining with SQL Server, but I don't know how to begin. Can you suggest some books written on this subject, or other sources for learning it?
Thank you in advance.
...