bclust                 package:e1071                 R Documentation

_B_a_g_g_e_d _C_l_u_s_t_e_r_i_n_g

_D_e_s_c_r_i_p_t_i_o_n:

     Cluster the data in 'x' using the bagged clustering algorithm. A
     partitioning cluster algorithm such as 'kmeans' is run repeatedly
     on bootstrap samples from the original data. The resulting cluster
     centers are then combined using the hierarchical cluster algorithm
     'hclust'.

_U_s_a_g_e:

     bclust(x, centers=2, iter.base=10, minsize=0,
            dist.method="euclidian",
            hclust.method="average", base.method="kmeans",
            base.centers=20, verbose=TRUE,
            final.kmeans=FALSE, docmdscale=FALSE,
            resample=TRUE, weights=NULL, maxcluster=base.centers, ...)
     hclust.bclust(object, x, centers, dist.method=object$dist.method,
                   hclust.method=object$hclust.method, final.kmeans=FALSE,
                   docmdscale = FALSE, maxcluster=object$maxcluster)
     ## S3 method for class 'bclust':
     plot(x, maxcluster=x$maxcluster, main, ...)
     centers.bclust(object, k)
     clusters.bclust(object, k, x=NULL)

_A_r_g_u_m_e_n_t_s:

       x: Matrix of inputs (or object of class '"bclust"' for plot).

centers, k: Number of clusters.

iter.base: Number of runs of the base cluster algorithm.

 minsize: Minimum number of points in a base cluster.

dist.method: Distance method used for the hierarchical clustering, see
          'dist' for available distances.

hclust.method: Linkage method used for the hierarchical clustering, see
          'hclust' for available methods.

base.method: Partitioning cluster method used as base algorithm.

base.centers: Number of centers used in each repetition of the base
          method.

 verbose: Logical, if 'TRUE' status messages are printed.

final.kmeans: If 'TRUE', a final kmeans step is performed using the
          output of the bagged clustering as initialization.

docmdscale: Logical, if 'TRUE' a 'cmdscale' result is included in the
          return value.

resample: Logical, if 'TRUE' the base method is run on bootstrap
          samples of 'x', else directly on 'x'.

 weights: Vector of length 'nrow(x)', weights for the resampling. By
          default all observations have equal weight.

maxcluster: Maximum number of clusters for which memberships are to be
          computed.

  object: Object of class '"bclust"'.

    main: Main title of the plot.

     ...: Optional arguments to be passed to the base method in
          'bclust', ignored in 'plot'.

_D_e_t_a_i_l_s:

     First, 'iter.base' bootstrap samples of the original data in 'x'
     are created by drawing with replacement. The base cluster method
     is run on each of these samples with 'base.centers' centers. The
     'base.method' must be the name of a partitioning cluster function
     returning a list with the same components as the return value of
     'kmeans'.
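
     The resampling step described above can be sketched with base R
     alone. This is a minimal illustration under stated assumptions,
     not the package's implementation: the helper 'base_run' and the
     object name 'runs' are hypothetical, and 'allcenters' is named
     after the component listed under Value.

```r
set.seed(1)                         # reproducible resampling
x <- as.matrix(iris[, 1:4])
iter.base    <- 10                  # number of base runs
base.centers <- 5                   # centers per base run

# One base run: draw a bootstrap sample (with replacement) of the rows
# of x and run kmeans on that sample.
base_run <- function(i) {
  idx <- sample(nrow(x), replace = TRUE)
  kmeans(x[idx, , drop = FALSE], centers = base.centers)
}
runs <- lapply(seq_len(iter.base), base_run)

# Stack the centers of all base runs into one matrix.
allcenters <- do.call(rbind, lapply(runs, function(r) r$centers))
dim(allcenters)                     # iter.base * base.centers rows
```

     Any function returning a list with a 'centers' component shaped
     like that of 'kmeans' could take the place of 'kmeans' in this
     sketch.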

     This results in a collection of 'iter.base * base.centers'
     centers, which are subsequently clustered using the hierarchical
     method 'hclust'. Base centers with fewer than 'minsize' points in
     their respective partitions are removed before the hierarchical
     clustering.
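
     The pruning and hierarchical step might look as follows in base
     R. This sketch assumes that the size of a partition is the 'size'
     component returned by 'kmeans' for each bootstrap run; the package
     may count points differently (for example, on the original data).
     The value of 'minsize' and the names 'sizes' and 'keep' are chosen
     only for illustration.

```r
set.seed(1)
x <- as.matrix(iris[, 1:4])
minsize <- 5                        # illustrative threshold

# Ten base runs of kmeans on bootstrap samples, five centers each.
runs <- lapply(1:10, function(i)
  kmeans(x[sample(nrow(x), replace = TRUE), ], centers = 5))

# Pool the centers and the matching partition sizes of all runs.
allcenters <- do.call(rbind, lapply(runs, `[[`, "centers"))
sizes      <- unlist(lapply(runs, `[[`, "size"))

# Drop centers whose partition is smaller than minsize, then cluster
# the remaining centers hierarchically.
keep <- sizes >= minsize
hc <- hclust(dist(allcenters[keep, , drop = FALSE], method = "euclidean"),
             method = "average")
```

     With 'minsize = 0', the default in 'bclust', no center would be
     removed by this filter.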

     The resulting dendrogram is then cut to produce 'centers'
     clusters. Hence, the name of the argument 'centers' is somewhat
     misleading, as the resulting clusters need not be convex, e.g.,
     when single linkage is used. The name was chosen for compatibility
     with standard partitioning cluster methods such as 'kmeans'.
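
     Cutting the dendrogram and mapping the observations to the
     resulting groups can be sketched as below. The assignment rule
     used here (each observation joins the group of its nearest base
     center in Euclidean distance) is an assumption made for
     illustration; the names 'center_group', 'nearest' and 'cluster'
     are hypothetical.

```r
set.seed(1)
x <- as.matrix(iris[, 1:4])

# Pooled centers from ten kmeans runs on bootstrap samples.
runs <- lapply(1:10, function(i)
  kmeans(x[sample(nrow(x), replace = TRUE), ], centers = 5))
allcenters <- do.call(rbind, lapply(runs, `[[`, "centers"))

# Cluster the base centers with single linkage and cut into 3 groups.
hc <- hclust(dist(allcenters), method = "single")
center_group <- unname(cutree(hc, k = 3))   # one label per base center

# For every observation, find the index of the nearest base center
# (squared Euclidean distance) and inherit that center's group label.
nearest <- apply(x, 1, function(p)
  which.min(colSums((t(allcenters) - p)^2)))
cluster <- center_group[nearest]
table(cluster)
```

     Because a group is a union of the regions around several base
     centers, it need not be convex, which is the point made above
     about single linkage.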

     A new hierarchical clustering (e.g., using another
     'hclust.method') re-using previous base runs can be performed by
     running 'hclust.bclust' on the return value of 'bclust'.
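
     A possible use of this, following the signatures shown under
     Usage and assuming the 'e1071' package is installed: the base runs
     are performed once by 'bclust', and 'hclust.bclust' then
     re-clusters the stored centers with a different linkage. The
     object names 'bc1' and 'bc2' are arbitrary.

```r
library(e1071)

set.seed(1)
x <- as.matrix(iris[, 1:4])

# Base runs are performed once (average linkage is the default).
bc1 <- bclust(x, centers = 3, base.centers = 5, verbose = FALSE)

# Re-cluster the stored base centers with single linkage, re-using
# the base runs kept in bc1.
bc2 <- hclust.bclust(bc1, x, centers = 3, hclust.method = "single")

# Cross-tabulate the two partitions of the observations.
table(average = bc1$cluster, single = bc2$cluster)
```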

_V_a_l_u_e:

     'bclust' and 'hclust.bclust' return objects of class '"bclust"'
     including the components:

  hclust: Return value of the hierarchical clustering of the collection
          of base centers (Object of class '"hclust"').

 cluster: Vector with indices of the clusters the inputs are assigned
          to.

 centers: Matrix of centers of the final clusters. Only useful if the
          hierarchical clustering method produces convex clusters.

allcenters: Matrix of all 'iter.base * base.centers' centers found in
          the base runs.

_A_u_t_h_o_r(_s):

     Friedrich Leisch

_R_e_f_e_r_e_n_c_e_s:

     Friedrich Leisch. Bagged clustering. Working Paper 51, SFB
     ``Adaptive Information Systems and Modeling in Economics and
     Management Science'', August 1999. <URL:
     http://www.ci.tuwien.ac.at/~leisch>

_S_e_e _A_l_s_o:

     'hclust', 'kmeans', 'boxplot.bclust'

_E_x_a_m_p_l_e_s:

     data(iris)
     bc1 <- bclust(iris[,1:4], 3, base.centers=5)
     plot(bc1)

     table(clusters.bclust(bc1, 3))
     centers.bclust(bc1, 3)

