svm                  package:e1071                  R Documentation

_S_u_p_p_o_r_t _V_e_c_t_o_r _M_a_c_h_i_n_e_s

_D_e_s_c_r_i_p_t_i_o_n:

     'svm' is used to train a support vector machine. It can be used to
     carry out general regression and classification (of nu and
     epsilon-type), as well as density-estimation. A formula interface
     is provided.

_U_s_a_g_e:

     ## S3 method for class 'formula':
     svm(formula, data = NULL, ..., subset, na.action = na.omit,
         scale = TRUE)
     ## Default S3 method:
     svm(x, y = NULL, scale = TRUE, type = NULL, kernel = "radial",
         degree = 3, gamma = 1 / ncol(as.matrix(x)), coef0 = 0, cost = 1,
         nu = 0.5, class.weights = NULL, cachesize = 40, tolerance = 0.001,
         epsilon = 0.1, shrinking = TRUE, cross = 0, probability = FALSE,
         fitted = TRUE, ..., subset, na.action = na.omit)

_A_r_g_u_m_e_n_t_s:

 formula: a symbolic description of the model to be fit.

    data: an optional data frame containing the variables in the model.
          By default the variables are taken from the environment from
          which 'svm' is called.

       x: a data matrix, a vector, or a sparse matrix (object of class
          'matrix.csr' as provided by the package 'SparseM').

       y: a response vector with one label for each row/component of
          'x'. Can be either a factor (for classification tasks) or a
          numeric vector (for regression).

   scale: A logical vector indicating the variables to be scaled. If
          'scale' is of length 1, the value is recycled as many times
          as needed. By default, data are scaled internally (both 'x'
          and 'y' variables) to zero mean and unit variance. The center
          and scale values are returned and used for later predictions.

    type: 'svm' can be used as a classification machine, as a
          regression machine, or for novelty detection. Depending on
          whether 'y' is a factor or not, the default setting for
          'type' is 'C-classification' or 'eps-regression',
          respectively, but may be overridden by setting an explicit
          value (see the sketch after this argument list).
           Valid options are:

             *  'C-classification'

             *  'nu-classification'

             *  'one-classification' (for novelty detection)

             *  'eps-regression'

             *  'nu-regression'

  kernel: the kernel used in training and predicting. You might
          consider changing some of the following parameters, depending
          on the kernel type.

          _l_i_n_e_a_r: u'*v

          _p_o_l_y_n_o_m_i_a_l: (gamma*u'*v + coef0)^degree

          _r_a_d_i_a_l _b_a_s_i_s: exp(-gamma*|u-v|^2)

          _s_i_g_m_o_i_d: tanh(gamma*u'*v + coef0)

  degree: parameter needed for kernel of type 'polynomial' (default: 3)

   gamma: parameter needed for all kernels except 'linear' (default:
          1/(data dimension))

   coef0: parameter needed for kernels of type 'polynomial' and
          'sigmoid' (default: 0)

    cost: cost of constraints violation (default: 1); this is the
          'C'-constant of the regularization term in the Lagrange
          formulation.

      nu: parameter needed for 'nu-classification', 'nu-regression',
          and 'one-classification'

class.weights: a named vector of weights for the different classes,
          used for asymmetric class sizes. Not all factor levels have to
          be supplied (default weight: 1). All components have to be
          named.

cachesize: cache memory in MB (default: 40)

tolerance: tolerance of termination criterion (default: 0.001)

 epsilon: epsilon in the insensitive-loss function (default: 0.1)

shrinking: logical indicating whether to use the shrinking heuristics
          (default: 'TRUE')

   cross: if an integer value k > 0 is specified, a k-fold cross
          validation on the training data is performed to assess the
          quality of the model: the accuracy rate for classification
          and the Mean Squared Error for regression

  fitted: logical indicating whether the fitted values should be
          computed and included in the model or not (default: 'TRUE')

probability: logical indicating whether the model should allow for
          probability predictions.

     ...: additional parameters for the low-level fitting function
          'svm.default'

  subset: An index vector specifying the cases to be used in the
          training sample.  (NOTE: If given, this argument must be
          named.)

na.action: A function to specify the action to be taken if 'NA's are
          found. The default action is 'na.omit', which leads to
          rejection of cases with missing values on any required
          variable. An alternative is 'na.fail', which causes an error
          if 'NA' cases are found. (NOTE: If given, this argument must
          be named.)
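
     A minimal sketch of choosing 'type', 'kernel', and the related
     hyperparameters described above; the parameter values below are
     purely illustrative, not tuned or recommended ones:

     data(iris)
     ## polynomial kernel with explicit type and hyperparameters;
     ## 'cross = 5' additionally requests 5-fold cross validation
     m.poly <- svm(Species ~ ., data = iris,
                   type   = "C-classification",
                   kernel = "polynomial", degree = 2, gamma = 0.5,
                   coef0  = 1, cost = 10, cross = 5)
     summary(m.poly)   # includes the cross-validation accuracies

     ## 'scale' can also be a logical vector over the predictor columns,
     ## recycled if shorter (here: scale only the first two of four)
     m.sc <- svm(Species ~ ., data = iris,
                 scale = c(TRUE, TRUE, FALSE, FALSE))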

_D_e_t_a_i_l_s:

     For multi-class classification with k levels, k > 2, 'libsvm' uses
     the 'one-against-one' approach, in which k(k-1)/2 binary
     classifiers are trained; the appropriate class is found by a
     voting scheme.
      'libsvm' internally uses a sparse data representation, which is
     also supported at a high level by the package 'SparseM' (see the
     sketch at the end of this section).
      If the predictor variables include factors, the formula interface
     must be used to get a correct model matrix.
      'plot.svm' allows a simple graphical visualization of
     classification models.
      The probability model for classification fits a logistic
     distribution using maximum likelihood to the decision values of
     all binary classifiers, and computes the a-posteriori class
     probabilities for the multi-class problem using quadratic
     optimization. The probabilistic regression model assumes
     (zero-mean) Laplace-distributed errors for the predictions, and
     estimates the scale parameter using maximum likelihood.
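
     As a sketch of the sparse-data support mentioned above (assuming
     the package 'SparseM' is installed):

     library(SparseM)
     data(iris)
     xs <- as.matrix.csr(as.matrix(iris[, 1:4]))  # coerce to 'matrix.csr'
     m.sparse <- svm(xs, iris$Species)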

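     A brief sketch of the probability model described above; the
     probability machinery must be requested both when training and
     when predicting:

     data(iris)
     m.prob <- svm(Species ~ ., data = iris, probability = TRUE)
     pred   <- predict(m.prob, iris[c(1, 51, 101), ], probability = TRUE)
     attr(pred, "probabilities")   # a-posteriori class probabilities
     ## for iris, k = 3 classes, so k(k-1)/2 = 3 binary classifiers
     ## underlie these probabilities
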
_V_a_l_u_e:

     An object of class '"svm"' containing the fitted model, including: 

      SV: The resulting support vectors (possibly scaled).

   index: The index of the resulting support vectors in the data
          matrix. Note that this index refers to the preprocessed data
          (after the possible effect of 'na.omit' and 'subset').

   coefs: The corresponding coefficients times the training labels.

     rho: The negative intercept.

   sigma: In case of a probabilistic regression model, the scale
          parameter of the hypothesized (zero-mean) Laplace
          distribution estimated by maximum likelihood.

probA, probB: numeric vectors of length k(k-1)/2, with k the number of
          classes, containing the parameters of the logistic
          distributions fitted to the decision values of the binary
          classifiers (1 / (1 + exp(a x + b))).

_N_o_t_e:

     Data are scaled internally, usually yielding better results.

_A_u_t_h_o_r(_s):

     David Meyer (based on C/C++-code by Chih-Chung Chang and Chih-Jen
     Lin)
      david.meyer@ci.tuwien.ac.at

_R_e_f_e_r_e_n_c_e_s:

        *  Chang, Chih-Chung and Lin, Chih-Jen:
            _LIBSVM: a library for Support Vector Machines_
            <URL: http://www.csie.ntu.edu.tw/~cjlin/libsvm>

        *  Exact formulations of models, algorithms, etc. can be found
           in the document:
            Chang, Chih-Chung and Lin, Chih-Jen:
            _LIBSVM: a library for Support Vector Machines_
            <URL:
           http://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.ps.gz>

        *  Chang, Chih-Chung and Lin, Chih-Jen:
            _Libsvm: Introduction and Benchmarks_
            <URL: http://www.csie.ntu.edu.tw/~cjlin/papers/q2.ps.gz>

_S_e_e _A_l_s_o:

     'predict.svm' 'plot.svm' 'matrix.csr' (in package 'SparseM')

_E_x_a_m_p_l_e_s:

     data(iris)
     attach(iris)

     ## classification mode
     # default with factor response:
     model <- svm(Species ~ ., data = iris)

     # alternatively the traditional interface:
     x <- subset(iris, select = -Species)
     y <- Species
     model <- svm(x, y) 

     print(model)
     summary(model)

     # test with train data
     pred <- predict(model, x)
     # (same as:)
     pred <- fitted(model)

     # Check accuracy:
     table(pred, y)

     # compute decision values and probabilities:
     pred <- predict(model, x, decision.values = TRUE)
     attr(pred, "decision.values")[1:4,]

     # visualize (classes by color, SV by crosses):
     plot(cmdscale(dist(iris[,-5])),
          col = as.integer(iris[,5]),
          pch = c("o","+")[1:150 %in% model$index + 1])

     ## try regression mode on two dimensions

     # create data
     x <- seq(0.1, 5, by = 0.05)
     y <- log(x) + rnorm(x, sd = 0.2)

     # estimate model and predict input values
     m   <- svm(x, y)
     new <- predict(m, x)

     # visualize
     plot(x, y)
     points(x, log(x), col = 2)
     points(x, new, col = 4)

     ## density-estimation

     # create 2-dim. normal with rho=0:
     X <- data.frame(a = rnorm(1000), b = rnorm(1000))
     attach(X)

     # traditional way:
     m <- svm(X, gamma = 0.1)

     # formula interface:
     m <- svm(~., data = X, gamma = 0.1)
     # or:
     m <- svm(~ a + b, gamma = 0.1)

     # test:
     newdata <- data.frame(a = c(0, 4), b = c(0, 4))
     predict(m, newdata)

     # visualize:
     plot(X, col = 1:1000 %in% m$index + 1, xlim = c(-5,5), ylim=c(-5,5))
     points(newdata, pch = "+", col = 2, cex = 5)

     # weights: (example not particularly sensible)
     i2 <- iris
     levels(i2$Species)[3] <- "versicolor"
     summary(i2$Species)
     wts <- 100 / table(i2$Species)
     wts
     m <- svm(Species ~ ., data = i2, class.weights = wts)

