comirva.audio.util.kmeans
Class KMeansClustering

java.lang.Object
  extended by comirva.audio.util.kmeans.KMeansClustering

public class KMeansClustering
extends Object

This class implements a simple version of the k-means clustering algorithm. A given list of points in an n-dimensional space is grouped into a specified number of clusters.

The distance between two points in the n-dimensional space is measured with the Euclidean distance.
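As an illustration of the distance measure named above, the following is a minimal sketch that works on plain double arrays. The class name EuclideanDistanceSketch and the array representation are assumptions made for this example; the library itself works with PointList and Matrix objects, whose internals are not described here.

```java
// Hypothetical helper, not part of comirva: Euclidean distance on plain arrays.
final class EuclideanDistanceSketch {

    /** Returns the Euclidean distance between two points of equal dimension. */
    static double distance(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("points must have the same dimension");
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff; // accumulate squared per-dimension differences
        }
        return Math.sqrt(sum);
    }
}
```

For example, distance(new double[]{0, 0}, new double[]{3, 4}) evaluates to 5.0.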

The clusters are initialized by randomly choosing one point from the given point list for each cluster. The algorithm then runs until the mean quantization error (MQE) no longer improves.
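The initialization step could look roughly like the sketch below. It is an illustration only: the class RandomInitSketch and the double[][] point representation are hypothetical, and the sketch draws each center independently, so two clusters may start at the same point. Whether the library avoids such duplicates is not stated in this documentation.

```java
import java.util.Random;

// Hypothetical sketch, not part of comirva: random choice of initial centers.
final class RandomInitSketch {

    /** Picks one randomly chosen input point as the initial center of each cluster. */
    static double[][] initialCenters(double[][] points, int k, Random rnd) {
        if (k < 1 || points.length == 0) {
            throw new IllegalArgumentException("need k >= 1 and at least one point");
        }
        double[][] centers = new double[k][];
        for (int c = 0; c < k; c++) {
            // Assumption: independent draws, so duplicate centers are possible.
            centers[c] = points[rnd.nextInt(points.length)].clone();
        }
        return centers;
    }
}
```

Passing a seeded Random makes the choice reproducible, which is convenient when comparing runs.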

See Also:
InvalidClusteringException, PointList

Nested Class Summary
protected  class KMeansClustering.Cluster
          A set of homogeneous points in the n-dimensional space is represented by a Cluster object.
 
Field Summary
protected  KMeansClustering.Cluster[] clusters
           
protected  Matrix[] covariances
           
protected  PointList data
           
protected  int dimension
           
protected  boolean existsClustering
           
protected  int maxIterations
           
protected  boolean normalize
           
protected  int numberClusters
           
protected static Random rnd
           
 
Constructor Summary
KMeansClustering(int numberClusters, PointList pl)
          Constructs an object that separates a point list into k clusters.
KMeansClustering(int numberClusters, PointList pl, boolean normalize)
          Constructs an object that separates a point list into k clusters.
KMeansClustering(int numberClusters, PointList pl, boolean normalize, int maxIterations)
          Constructs an object that separates a point list into k clusters.
 
Method Summary
protected  void createFullCovarianceMatrices()
          Creates the full covariance matrices.
 double getClusterWeight(int cluster)
          Returns the weight of the specified cluster.
 double[] getClusterWeights()
          Returns the weights of all clusters in one array.
 Matrix getDiagCovarianceMatrix(int cluster)
          Returns the diagonal covariance matrix of the specified cluster.
 Matrix[] getDiagCovariances()
          Returns the diagonal covariance matrices of all clusters in one array.
 Matrix getFullCovarianceMatrix(int cluster)
          Returns the full covariance matrix of the specified cluster.
 Matrix[] getFullCovariances()
          Returns the full covariance matrices of all clusters in one array.
 Matrix getMean(int cluster)
          Returns the mean of the specified cluster.
 Matrix[] getMeans()
          Returns the mean vectors of all clusters in one array.
 int getNumberClusters()
          Returns the number of clusters to separate the given data points into.
 void print()
          For debugging purposes only.
 void run()
          Starts the k-means clustering algorithm.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

rnd

protected static Random rnd

maxIterations

protected int maxIterations

numberClusters

protected int numberClusters

dimension

protected int dimension

normalize

protected boolean normalize

data

protected PointList data

clusters

protected KMeansClustering.Cluster[] clusters

covariances

protected Matrix[] covariances

existsClustering

protected boolean existsClustering
Constructor Detail

KMeansClustering

public KMeansClustering(int numberClusters,
                        PointList pl)
                 throws IllegalArgumentException
Constructs an object that separates a point list into k clusters. By default, the maximum number of iterations is 20 and normalization is turned off.

Parameters:
numberClusters - int number of clusters to separate the points into
pl - PointList a list of points in the n-dimensional space
Throws:
IllegalArgumentException - raised if method contract is violated

KMeansClustering

public KMeansClustering(int numberClusters,
                        PointList pl,
                        boolean normalize)
                 throws IllegalArgumentException
Constructs an object that separates a point list into k clusters. By default, the maximum number of iterations is 20.

Parameters:
numberClusters - int number of clusters to separate the points into
pl - PointList a list of points in the n-dimensional space
normalize - boolean true if the data points should be normalized before processing, otherwise false
Throws:
IllegalArgumentException - raised if method contract is violated

KMeansClustering

public KMeansClustering(int numberClusters,
                        PointList pl,
                        boolean normalize,
                        int maxIterations)
                 throws IllegalArgumentException
Constructs an object that separates a point list into k clusters, using the given normalization setting and the given maximum number of iterations.

Parameters:
numberClusters - int number of clusters to separate the points into
pl - PointList a list of points in the n-dimensional space
normalize - boolean true if the data points should be normalized before processing, otherwise false
maxIterations - int maximum number of iterations the algorithm should perform
Throws:
IllegalArgumentException - raised if method contract is violated
Method Detail

run

public void run()
Starts the k-means clustering algorithm. The cluster centers are recomputed until the mean quantization error (MQE) stops decreasing.
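The loop described above can be sketched as follows. This is not the library's implementation: the class KMeansLoopSketch, the double[][] representation, and the handling of empty clusters are assumptions made for illustration. In particular, the MQE is taken here to be the mean squared Euclidean distance from each point to its nearest center, which is the quantity a standard k-means update never increases; the exact definition used by the library is not given in this documentation.

```java
// Hypothetical sketch, not part of comirva: recompute centers until the error stops decreasing.
final class KMeansLoopSketch {

    /** Squared Euclidean distance between two points of equal dimension. */
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return sum;
    }

    /** Index of the center closest to the point p. */
    static int nearest(double[] p, double[][] centers) {
        int best = 0;
        for (int c = 1; c < centers.length; c++) {
            if (squaredDistance(p, centers[c]) < squaredDistance(p, centers[best])) {
                best = c;
            }
        }
        return best;
    }

    /** Assumed MQE: mean squared distance from each point to its nearest center. */
    static double mqe(double[][] points, double[][] centers) {
        double total = 0.0;
        for (double[] p : points) {
            total += squaredDistance(p, centers[nearest(p, centers)]);
        }
        return total / points.length;
    }

    /** Returns the centers reached when the error stops decreasing or maxIterations is hit. */
    static double[][] run(double[][] points, double[][] initialCenters, int maxIterations) {
        int k = initialCenters.length;
        int dim = points[0].length;
        double[][] centers = new double[k][];
        for (int c = 0; c < k; c++) {
            centers[c] = initialCenters[c].clone();
        }
        double bestError = mqe(points, centers);
        for (int iter = 0; iter < maxIterations; iter++) {
            // Assignment step: sum up the points that fall into each cluster.
            double[][] sums = new double[k][dim];
            int[] counts = new int[k];
            for (double[] p : points) {
                int c = nearest(p, centers);
                counts[c]++;
                for (int d = 0; d < dim; d++) {
                    sums[c][d] += p[d];
                }
            }
            // Update step: each center moves to the mean of its points.
            double[][] updated = new double[k][];
            for (int c = 0; c < k; c++) {
                updated[c] = centers[c].clone(); // assumption: an empty cluster keeps its center
                for (int d = 0; counts[c] > 0 && d < dim; d++) {
                    updated[c][d] = sums[c][d] / counts[c];
                }
            }
            double error = mqe(points, updated);
            if (error >= bestError) {
                break; // no improvement: keep the previous centers
            }
            centers = updated;
            bestError = error;
        }
        return centers;
    }
}
```

The maxIterations argument plays the same role as the constructor parameter of the same name: it bounds the loop even if the error keeps shrinking by tiny amounts.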


createFullCovarianceMatrices

protected void createFullCovarianceMatrices()
Creates the full covariance matrices.


getNumberClusters

public int getNumberClusters()
Returns the number of clusters to separate the given data points into.

Returns:
int number of clusters

getMean

public Matrix getMean(int cluster)
Returns the mean of the specified cluster.

Note: Before calling this method there should exist a valid clustering, which one can create by calling the run() method.

Parameters:
cluster - int the number of the cluster to get the mean of
Returns:
Matrix the mean vector of the cluster

getMeans

public Matrix[] getMeans()
Returns the mean vectors of all clusters in one array.

Note: Before calling this method there should exist a valid clustering, which one can create by calling the run() method.

Returns:
Matrix[] array containing the mean vectors

getFullCovarianceMatrix

public Matrix getFullCovarianceMatrix(int cluster)
Returns the full covariance matrix of the specified cluster.

Note: Before calling this method there should exist a valid clustering, which one can create by calling the run() method.

Parameters:
cluster - int the number of the cluster to get the covariance matrix of
Returns:
Matrix the covariance matrix of the cluster
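To make the difference between a full and a diagonal covariance matrix concrete, here is a minimal sketch on plain arrays. The class CovarianceSketch is hypothetical, and two details are assumptions rather than documented behavior: the sums are divided by the number of points n (not n - 1), and the diagonal form is returned as a square matrix with zeros off the diagonal.

```java
// Hypothetical sketch, not part of comirva: covariance of the points in one cluster.
final class CovarianceSketch {

    /** Component-wise mean of the given points. */
    static double[] mean(double[][] points) {
        double[] mu = new double[points[0].length];
        for (double[] p : points) {
            for (int d = 0; d < mu.length; d++) {
                mu[d] += p[d] / points.length;
            }
        }
        return mu;
    }

    /** Full covariance matrix; assumption: divides by n, not n - 1. */
    static double[][] fullCovariance(double[][] points) {
        int dim = points[0].length;
        double[] mu = mean(points);
        double[][] cov = new double[dim][dim];
        for (double[] p : points) {
            for (int i = 0; i < dim; i++) {
                for (int j = 0; j < dim; j++) {
                    cov[i][j] += (p[i] - mu[i]) * (p[j] - mu[j]) / points.length;
                }
            }
        }
        return cov;
    }

    /** Diagonal covariance: keeps the per-dimension variances, zeros everything else. */
    static double[][] diagonal(double[][] full) {
        double[][] diag = new double[full.length][full.length];
        for (int i = 0; i < full.length; i++) {
            diag[i][i] = full[i][i];
        }
        return diag;
    }
}
```

The diagonal form discards the correlations between dimensions, which makes it cheaper to store and invert at the cost of a coarser description of the cluster's shape.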

getFullCovariances

public Matrix[] getFullCovariances()
Returns the full covariance matrices of all clusters in one array.

Note: Before calling this method there should exist a valid clustering, which one can create by calling the run() method.

Returns:
Matrix[] array containing the covariance matrices

getDiagCovarianceMatrix

public Matrix getDiagCovarianceMatrix(int cluster)
Returns the diagonal covariance matrix of the specified cluster.

Note: Before calling this method there should exist a valid clustering, which one can create by calling the run() method.

Parameters:
cluster - int the number of the cluster to get the covariance matrix of
Returns:
Matrix the covariance matrix of the cluster

getDiagCovariances

public Matrix[] getDiagCovariances()
Returns the diagonal covariance matrices of all clusters in one array.

Note: Before calling this method there should exist a valid clustering, which one can create by calling the run() method.

Returns:
Matrix[] array containing the covariance matrices

getClusterWeight

public double getClusterWeight(int cluster)
Returns the weight of the specified cluster.

Note: Before calling this method there should exist a valid clustering, which one can create by calling the run() method.

Parameters:
cluster - int the number of the cluster to get the weight of
Returns:
double the weight of this cluster
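This documentation does not define how the weight is computed. One common convention, for instance when a clustering is used to seed a mixture model, is the fraction of all points assigned to the cluster. The sketch below illustrates that convention only; the class ClusterWeightSketch and the assignment array are hypothetical, and the library may compute the weight differently.

```java
// Hypothetical sketch, not part of comirva: weight as the share of points per cluster.
final class ClusterWeightSketch {

    /**
     * Returns, for each of the k clusters, the fraction of points assigned to it.
     * assignments[i] holds the cluster index of point i.
     */
    static double[] weights(int[] assignments, int k) {
        double[] w = new double[k];
        for (int a : assignments) {
            w[a] += 1.0; // count the points per cluster
        }
        for (int c = 0; c < k; c++) {
            w[c] /= assignments.length; // turn counts into fractions
        }
        return w;
    }
}
```

Under this convention the weights of all clusters sum to 1.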

getClusterWeights

public double[] getClusterWeights()
Returns the weights of all clusters in one array.

Note: Before calling this method there should exist a valid clustering, which one can create by calling the run() method.

Returns:
double[] array containing the weights

print

public void print()
For debugging purposes only.