Write some Software
This project was successfully completed by abhijitbuet for $40 USD in a day.Get free quotes for a project like this
Project Budget$10 - $30 USD
Completed In1 day
two data points x and y whose attributes are all numeric, the Euclidean distance kx − yk is a popular
measurement. To be consistent with the textbook, we assume a vector x by default is a row vector.
However, Euclidean distance is not effective if
• the ranges (scales) of attributes are different, or
• there exist correlations among attributes.
Mahalanobis distance has been introduced to address the above problems. Given a dataset whose
attributes are all numeric, we first construct the the covariance matrix Σ = (si,j )n×n, where an element
si,j is the covariance of i-th and j-th attributes, and n is the number of attributes. The Mahalanobis
distance of x and y is then defined as
M ahalanobis(x, y) = (x − y)Σ−(x − y)
− is the inverse of the matrix Σ, and (x − y)
is the transpose of the row vector x − y.
In the part, you will first implement the above two distance measurements, and then give an experimental
evaluation of these measurements. Usually, evaluation is performed in a specific data mining
tasks, such as classification, clustering. Since we currently have not yet got into these topics, we will
simply check the consistency between two measurements based on random datasets. Specifically, you
will need to
1. generate a random dataset of two-attribute instances in Gaussian distribution
2. build the covariance matrix Σ
3. set counter = 0
4. for each instance,
(a) Compute the nearest neighbor using Euclidean distance
(b) Compute the nearest neighbor using Mahalanobis distance
(c) if the nearest neighbors are same, count++
5. output the consistency ratio (count / number-of-instances)
You will need to read the documentation of the packages [url removed, login to view] and [url removed, login to view], and
typically the class Matrix to complete this part. To generate data in Gaussian distribution, you may
refer to the standard Java class Random, and use the Java method nextGaussian().
Looking to make some money?
- Set your budget and the timeframe
- Outline your proposal
- Get paid for your work
Hire Freelancers who also bid on this project
Looking for work?
Work on projects like this and make money from home!Sign Up Now
- The New York Times
- Wall Street Journal
- Times Online