Binning in Excel
Which formulae in MS Excel can we use for: equi-depth binning, equi-width binning ...
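For readers unsure how the two schemes named in this question differ, here is a minimal Python sketch (not Excel formulae, which the excerpt does not give). The function names and the choice of zero-based bin indices are assumptions made for illustration only.

```python
def equi_width_bins(values, k):
    """Assign each value to one of k bins of equal width over [min, max]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    if width == 0:
        # All values are identical, so everything falls into bin 0.
        return [0 for _ in values]
    # The maximum value would compute to index k, so clamp it into the last bin.
    return [min(int((v - lo) / width), k - 1) for v in values]


def equi_depth_bins(values, k):
    """Assign each value to one of k bins holding (nearly) equal counts.

    Values are ranked by sorted order; tied values may end up in
    different bins, which is a simplification of this sketch.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = rank * k // len(values)
    return bins


data = [1, 2, 3, 4, 5, 6, 7, 8, 100]
print(equi_width_bins(data, 3))
print(equi_depth_bins(data, 3))
```

With an outlier such as 100, equal-width bins put nearly all values in the first bin, while equal-depth bins keep three values in each.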
How would you perform binarization of an attribute with five categorical values in Excel? ...
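One common reading of "binarization" here is one-hot encoding: each of the five categories becomes its own 0/1 column. The sketch below shows that idea in Python rather than Excel. The category names are hypothetical, and the excerpt does not say whether the asker meant this encoding.

```python
def one_hot(values):
    """Return the sorted category list and one 0/1 row per input value."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows


# Hypothetical attribute with five categorical values.
colours = ["red", "green", "blue", "white", "black", "red"]
categories, rows = one_hot(colours)
print(categories)
for value, row in zip(colours, rows):
    print(value, row)
```

Each row contains exactly one 1, in the column of the value's category.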
I'm using Perl. I have a tag, for example: "XYZ_PKM_HTML". I would like to be able to provide a base URL, for example www.example.com, and then get the HTML page (not necessarily the main page, that's easy) where this tag appears. Is it possible? Any ideas? (Or already-made modules; I looked on CPAN and there was some interesting stuff, bu...
If I want to build a complex website like Google News, which gathers data from other websites (data mining, crawling), in which language should I build it? Currently I know only PHP. Can I do that in PHP? ...
My company got a project to build a simple website for a grocery shop, with a catalogue only and no shopping cart. A few days ago I read something about data mining, and from here I found that it is possible to do some predictive modelling. For example, one Midwest grocery chain used the data mining capacity of Oracle software to analyze local bu...
Someone has just told my boss what data mining can do for a company, like recommendation and predictive modelling. Basically we are a website company. I am going on leave for 6 months, so my boss said that I can learn some DM techniques so that when I come back we can visit small shops or small companies to provide them with predictive data ...
Suppose I want to do some data mining on the database of a supermarket. What does that actually mean? 1) What will the output/results be like? 2) Will the output be different every day or change over time? 3) Before applying data mining, do I need to know what I want or will data mining give everything I want automatically? ...
Hello there, I've just used Weka to train my SVM classifier under the "Classify" tab. Now I want to investigate further which data samples are misclassified. I need to study their pattern, but I don't know where to find this in Weka. Could anyone give me some help, please? Thanks in advance. ...
Hi, I'm trying to calculate how good my measurements are in machine learning! Let's say that I have five choices, and that the errors are 4, 2, 0.002, 3, 6. Naturally, I will pick the third one as the hit, but I would like to be able to say the following: I'm X% certain that the hit is the third pick; I'm Y% certain that the hit is the first (last) pick. Of course, X>>Y but I ...
Hi, I'm trying to reduce dataset dimension. PCA is a good method, but it gives me a new dataset. My goal is to determine, from the number of events (e.g. 60) and the number of trials (e.g. 6), which events are more relevant. For example: the 1st, 3rd, 21st, 45th ... (N total) events are good enough to approximate the behavior of the dataset. That will al...
Hello there, recently I came across this term, but I really have no idea what it refers to. I've searched online, but with little gain. Thanks. ...
Hi all, I have written a web service which returns "Instances" from a data mining API. Now the problem is obvious: web services by default cannot handle "Instances" as a return type. What should my approach be? Or, I should say, user-defined data types; please point me to any documentation on how I can implement this. //////////////...
Hi all, I have a seemingly easy but challenging task. I need to develop a data set of questions, and I classify the questions into two categories: Factoid questions: "Who is the current president of France?" Free questions: "Can you rate the cameras below for me, please?" Now I need to know the percentage of both categories on Yaho...
Hello there. I wonder what the best way is to sample, say, 1000 questions completely at random from Yahoo! Answers. I want to achieve complete randomness, in which I totally ignore the categories, date of posting, etc. Doing this manually may result in bias, so could anyone give some suggestions here, like using the Yahoo! Answers API or...
What does ODM (Oracle Data Miner) do? Can you give me useful materials or brief information about this option? Thank you. ...
We have a production web-based product that allows users to make predictions about the future value (or demand) of goods. The historical data contains about 100k examples, and each example has about 5 parameters. Consider a class of data called a prediction: prediction { id: int predictor: int predictionDate: date p...
Hi guys, I am writing an application that will compute all frequent itemsets of size 2 from a set of transactions. That is, the application will take as input a data file (a space-delimited text file, with the items encoded as integers) and a percentage, given as an integer (e.g. input 2 represents 2%). The application will output in a distinct...
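As a rough illustration of the task described, the Python sketch below counts every item pair by brute force and keeps those meeting the support percentage. It assumes one transaction per line and a pair counting as frequent when it appears in at least the given percentage of transactions. The function names are hypothetical, and the output format is left open because the excerpt is cut off before describing it.

```python
from collections import Counter
from itertools import combinations


def parse_transactions(text):
    """Parse space-delimited integer items, one transaction per line."""
    return [[int(tok) for tok in line.split()]
            for line in text.splitlines() if line.strip()]


def frequent_pairs(transactions, min_support_percent):
    """Return {(a, b): count} for item pairs meeting the support threshold."""
    pair_counts = Counter()
    for items in transactions:
        # De-duplicate and sort so each pair is counted once per transaction
        # and always appears as (smaller, larger).
        for pair in combinations(sorted(set(items)), 2):
            pair_counts[pair] += 1
    threshold = len(transactions) * min_support_percent / 100
    return {pair: n for pair, n in pair_counts.items() if n >= threshold}


sample = "1 2 3\n1 2\n2 3\n1 2 4\n"
print(frequent_pairs(parse_transactions(sample), 50))
```

For large inputs, an Apriori-style pass that first discards infrequent single items would cut down the number of pairs counted; this sketch skips that step for brevity.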
Hi everyone, I am looking for SCAD (Simultaneous clustering and attribute discrimination) subspace clustering algorithm. If anyone has implemented it, please let me know where I can find/download this algorithm. Thank you. ...
Hi, I am intending to use the n-gram part/algorithm of this code: http://www.codeproject.com/KB/cs/tfidf.aspx The algorithm produces these tri-gram results: t th the he e q qu qui uic ick ck k r re red ed d for: the quick red However, this source: http://en.wikipedia.org/wiki/Trigram reckons it should be: the qui k_r he_ u...
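For comparison with the two outputs quoted in this question, here is a minimal Python sketch of a plain sliding-window character trigram extractor with no padding and no special handling of spaces. This is only one possible convention and is not claimed to match either the CodeProject code or the Wikipedia article; the function name is hypothetical.

```python
def char_ngrams(text, n=3):
    """Return every contiguous run of n characters, in order (no padding)."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


for gram in char_ngrams("the quick red"):
    # repr() makes the spaces inside each trigram visible.
    print(repr(gram))
```

Implementations that emit shorter fragments such as "t" and "th" at the edges are typically padding the string before sliding the window, which may explain part of the difference between the two sources.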
Hi, I am working on a possible architecture for an abuse detection mechanism on an account management system. What I want is to detect possible duplicate users based on certain correlating fields within a table. To keep the problem simple, let's say I have a USER table with the following fields: Name Nationality Current Address Logi...