GoLearn includes liblinear, which it uses for logistic regression and linear support vector classification. It is best suited to datasets made up entirely of numeric attributes, often in large numbers, as commonly occur in natural language processing.
LinearSVC
- The LinearSVC classifier outputs a binary class value (either 0 or 1).
- Only FloatAttributes are used as input.
- Only one class Attribute is supported.
- Training requires conversion of the dataset, which may cause memory pressure (see issue #94).
- Prediction requires conversion of only the current row.
- Penalty and loss parameters can be either "l1" or "l2". Not all combinations are supported.
- The dual parameter decides whether liblinear optimises the primal or the dual form. Some choices are incompatible with certain combinations of "l1" and "l2".
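- C is roughly the "penalty" parameter: it sets the cost of misclassified training examples, so larger values fit the training data more closely.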
- eps decides when to stop iterating. Smaller values typically take longer.
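
The constraints on penalty, loss and dual can be sketched as a lookup over the support vector classification solvers that liblinear itself provides. This is a minimal illustration, not GoLearn's code: solverFor is a hypothetical helper, and it assumes GoLearn accepts exactly the combinations for which liblinear has a solver.

```go
package main

import "fmt"

// solverFor is a hypothetical helper (not part of GoLearn's API). It maps a
// penalty/loss/dual combination to the name of the liblinear SVC solver that
// implements it, or returns an error when liblinear has no such solver.
func solverFor(penalty, loss string, dual bool) (string, error) {
	switch {
	case penalty == "l2" && loss == "l2" && dual:
		return "L2R_L2LOSS_SVC_DUAL", nil
	case penalty == "l2" && loss == "l2" && !dual:
		return "L2R_L2LOSS_SVC", nil
	case penalty == "l2" && loss == "l1" && dual:
		return "L2R_L1LOSS_SVC_DUAL", nil
	case penalty == "l1" && loss == "l2" && !dual:
		return "L1R_L2LOSS_SVC", nil
	}
	return "", fmt.Errorf("unsupported combination: penalty=%s loss=%s dual=%t",
		penalty, loss, dual)
}

func main() {
	// Enumerate every combination and report which ones have a solver.
	for _, penalty := range []string{"l1", "l2"} {
		for _, loss := range []string{"l1", "l2"} {
			for _, dual := range []bool{true, false} {
				name, err := solverFor(penalty, loss, dual)
				if err != nil {
					fmt.Println(err)
					continue
				}
				fmt.Printf("penalty=%s loss=%s dual=%t -> %s\n",
					penalty, loss, dual, name)
			}
		}
	}
}
```

Under this assumption, only four of the eight combinations are usable: an "l1" penalty works only with "l2" loss in the primal form, and "l1" loss works only with an "l2" penalty in the dual form.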
MultiLinearSVC
MultiLinearSVC outputs a categorical class value: it uses the OneVsAllModel meta-classifier to output any CategoricalAttribute value. It works by training n binary LinearSVC classifiers, one for each class, and assigning an instance to a class when the corresponding underlying LinearSVC classifier reports 1. Its parameters and other behaviour are exactly the same as for LinearSVC.
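
The one-vs-all decision step described above can be sketched in a few lines. This is an illustration of the idea only, not the OneVsAllModel implementation: binaryClassifier, between and oneVsAll are hypothetical names, the "models" are toy threshold rules rather than trained LinearSVC instances, and returning the first class that reports 1 is an assumption about how ties are handled.

```go
package main

import "fmt"

// binaryClassifier stands in for a trained binary LinearSVC: it returns 1
// when the row belongs to its class and 0 otherwise. Hypothetical type.
type binaryClassifier func(row []float64) int

// between builds a toy classifier that reports 1 when the first feature
// lies in the half-open interval [lo, hi).
func between(lo, hi float64) binaryClassifier {
	return func(row []float64) int {
		if row[0] >= lo && row[0] < hi {
			return 1
		}
		return 0
	}
}

// oneVsAll returns the first class whose classifier reports 1.
// ok is false when no classifier claims the row.
func oneVsAll(classes []string, models []binaryClassifier, row []float64) (class string, ok bool) {
	for i, model := range models {
		if model(row) == 1 {
			return classes[i], true
		}
	}
	return "", false
}

func main() {
	// One binary classifier per class, in the same order as the labels.
	classes := []string{"low", "mid", "high"}
	models := []binaryClassifier{between(0, 1), between(1, 2), between(2, 3)}

	for _, row := range [][]float64{{0.5}, {1.5}, {2.5}, {9.0}} {
		if class, ok := oneVsAll(classes, models, row); ok {
			fmt.Printf("%v -> %s\n", row, class)
		} else {
			fmt.Printf("%v -> no classifier reported 1\n", row)
		}
	}
}
```

The sketch makes one practical point visible: with n independent binary classifiers, a row may be claimed by none of them (or by several), so a caller has to decide how to handle those cases.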
Limitations
- Currently, per-class weights are unsupported.
- Only FloatAttributes are currently supported.