This is a small task to test your understanding of various standard data mining methods. The real work will be given to those who do this task well (so don't be shy about using Zero R, mentioned below).
A small set of data (173 entries) for the data mining task will be given to the accepted bidder. The task **must** be done using [**WEKA**][1] v3.4.4. The data contains 35 attributes (both nominal and numeric).
You are free to use any **combination** of the following data mining methods available in WEKA to **analyze** the data (that is, to find any meaningful, human-understandable rules in it):
- Zero R
- One R
- Linear Regression
- Naive Bayes
- ID3 Decision Tree
- C4.5 Decision Tree
- Association Rules
- Prism
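
For bidders less familiar with the two simplest methods in the list, the following is a minimal sketch of the idea behind Zero R (always predict the majority class) and One R (pick the single attribute whose one-level rule makes the fewest training errors). It is plain Python for illustration only, not WEKA's implementation. The toy rows, the attribute meanings and the function names `zero_r` and `one_r` are hypothetical, and the sketch assumes nominal attributes only; numeric attributes would need to be discretized first.

```python
from collections import Counter, defaultdict


def zero_r(labels):
    """Return the most frequent class label; Zero R ignores all attributes."""
    return Counter(labels).most_common(1)[0][0]


def one_r(rows, labels):
    """Return (attribute_index, rule, training_errors) for the best single attribute.

    For each attribute, every observed value is mapped to the class seen most
    often with that value; the attribute with the fewest training errors wins.
    """
    best = None
    for attr in range(len(rows[0])):
        counts = defaultdict(Counter)
        for row, label in zip(rows, labels):
            counts[row[attr]][label] += 1
        rule = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        errors = sum(rule[row[attr]] != label for row, label in zip(rows, labels))
        if best is None or errors < best[2]:
            best = (attr, rule, errors)
    return best


# Hypothetical toy data: attribute 0 = outlook, attribute 1 = windy.
rows = [
    ("sunny", "no"),
    ("sunny", "yes"),
    ("rainy", "no"),
    ("rainy", "yes"),
    ("overcast", "no"),
]
labels = ["no", "no", "yes", "yes", "yes"]

baseline = zero_r(labels)
attr, rule, errors = one_r(rows, labels)
print("Zero R predicts:", baseline)
print("One R uses attribute", attr, "with rule", rule, "and", errors, "training errors")
```

Zero R is useful mainly as a baseline: any rule reported from the other methods should at least beat its accuracy.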
You may use any combination of the following to validate the correctness of your findings:
- Holdout
- *n*-Fold Cross-Validation
- Leave-One-Out Cross-Validation
Additional methods similar to the above may also be used. Other methods available in WEKA that are _not similar_ to the above should _not_ be used (e.g. Multilayer Perceptron).
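
To make the three validation schemes concrete for a data set of this size, here is a rough sketch of how the 173 entries could be split under each scheme. It is plain Python rather than WEKA code. The function names, the fixed seed and the one-third holdout fraction are assumptions chosen for illustration, and the sketch does not stratify the folds by class.

```python
import random


def k_fold_indices(n_instances, n_folds, seed=0):
    """Yield (train_indices, test_indices) pairs for n-fold cross-validation."""
    indices = list(range(n_instances))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices round-robin so fold sizes differ by at most one.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def holdout_indices(n_instances, test_fraction=1 / 3, seed=0):
    """Return one (train_indices, test_indices) split; the fraction is an assumption."""
    indices = list(range(n_instances))
    random.Random(seed).shuffle(indices)
    n_test = round(n_instances * test_fraction)
    return indices[n_test:], indices[:n_test]


N = 173  # number of entries stated in the task description

holdout_train, holdout_test = holdout_indices(N)
ten_fold = list(k_fold_indices(N, 10))
# Leave-one-out is n-fold cross-validation with one fold per instance.
leave_one_out = list(k_fold_indices(N, N))

print("holdout:", len(holdout_train), "train /", len(holdout_test), "test")
print("10-fold test sizes:", sorted(len(test) for _, test in ten_fold))
print("leave-one-out rounds:", len(leave_one_out))
```

With only 173 entries, leave-one-out costs 173 training rounds, which is affordable for the simple methods listed above, while a single holdout split leaves comparatively few test instances.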
You are required to give a detailed account of the methods and algorithms you **used and did not use** to analyze the data, and of how and why you tuned the parameters (if any) of the algorithms, along with your analysis and findings, in a **9 to 10-page report** in MS Word or OpenOffice.org format, using **10-pt Times New Roman** for the font. (Please be reminded that *a 9 to 10-page 10-pt report is almost as long as a 20-page 12-pt report*.) You are welcome to include illustrative figures in the report to explain the algorithms and visualize the findings.
**Only those with experience with WEKA will be considered.**
## Deliverables
1) A 9 to 10-page report in MS Word or OpenOffice.org format. Font: Times New Roman, 10-pt. The contents of the report are described in the Description section above.
## Platform
MS Word or OpenOffice.org