class sklearn.preprocessing.MultiLabelBinarizer(classes=None, sparse_output=False) [source]
Transform between iterable of iterables and a multilabel format
Although a list of sets or tuples is a very intuitive format for multilabel data, it is unwieldy to process. This transformer converts between this intuitive format and the supported multilabel format: a (samples x classes) binary matrix indicating the presence of a class label.
| Parameters: | classes : array-like of shape [n_classes], optional. Indicates an ordering for the class labels. sparse_output : boolean, default False. Set to True if the output binary array is desired in CSR sparse format. |
|---|---|
| Attributes: | classes_ : array of labels. A copy of the classes parameter where provided, or otherwise the sorted set of classes found when fitting. |
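The effect of the two constructor parameters can be sketched as follows. The label data is illustrative only and not taken from this page; the sketch assumes a working scikit-learn and SciPy installation.

```python
from sklearn.preprocessing import MultiLabelBinarizer

# Illustrative label sets for three samples (assumed data).
y = [("comedy",), ("sci-fi", "thriller"), ("thriller",)]

# Passing `classes` fixes the column order instead of using sorted order.
order = ["thriller", "sci-fi", "comedy"]
mlb = MultiLabelBinarizer(classes=order)
dense = mlb.fit_transform(y)
print(dense)  # columns are thriller, sci-fi, comedy

# With sparse_output=True the same indicator matrix is returned
# as a SciPy sparse matrix in CSR format.
mlb_sparse = MultiLabelBinarizer(classes=order, sparse_output=True)
sparse = mlb_sparse.fit_transform(y)
print(sparse.format, sparse.shape)
```

A sparse result is mainly useful when there are many classes and each sample carries only a few of them.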
See also
sklearn.preprocessing.OneHotEncoder
>>> from sklearn.preprocessing import MultiLabelBinarizer
>>> mlb = MultiLabelBinarizer()
>>> mlb.fit_transform([(1, 2), (3,)])
array([[1, 1, 0],
       [0, 0, 1]])
>>> mlb.classes_
array([1, 2, 3])
>>> mlb.fit_transform([set(['sci-fi', 'thriller']), set(['comedy'])])
array([[0, 1, 1],
       [1, 0, 0]])
>>> list(mlb.classes_)
['comedy', 'sci-fi', 'thriller']
| Method | Description |
|---|---|
| fit(y) | Fit the label sets binarizer, storing classes_ |
| fit_transform(y) | Fit the label sets binarizer and transform the given label sets |
| get_params([deep]) | Get parameters for this estimator. |
| inverse_transform(yt) | Transform the given indicator matrix into label sets |
| set_params(**params) | Set the parameters of this estimator. |
| transform(y) | Transform the given label sets |
__init__(classes=None, sparse_output=False) [source]
fit(y) [source]
Fit the label sets binarizer, storing classes_
| Parameters: | y : iterable of iterables. A set of labels (any orderable and hashable object) for each sample. If the classes parameter is set, y will not be iterated. |
|---|---|
| Returns: | self : returns this MultiLabelBinarizer instance |
fit_transform(y) [source]
Fit the label sets binarizer and transform the given label sets
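| Parameters: | y : iterable of iterables. A set of labels (any orderable and hashable object) for each sample. If the classes parameter is set, y will not be iterated. |
|---|---|
| Returns: | y_indicator : array or CSR matrix, shape (n_samples, n_classes). A matrix such that y_indicator[i, j] = 1 iff classes_[j] is in y[i], and 0 otherwise. |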
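The relationship between the label sets and the returned indicator matrix can be checked directly. The following is a minimal sketch with made-up labels, including one sample that has no labels at all:

```python
from sklearn.preprocessing import MultiLabelBinarizer

# Illustrative input; the third sample carries no labels.
y = [(1, 2), (3,), ()]

mlb = MultiLabelBinarizer()
y_indicator = mlb.fit_transform(y)

# Entry (i, j) is 1 exactly when class j is among the labels of sample i.
for i, labels in enumerate(y):
    for j, cls in enumerate(mlb.classes_):
        expected = 1 if cls in labels else 0
        assert y_indicator[i, j] == expected

print(y_indicator.shape)  # (n_samples, n_classes)
```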
get_params(deep=True) [source]
Get parameters for this estimator.
| Parameters: | deep : boolean, optional. If True, will return the parameters for this estimator and contained subobjects that are estimators. |
|---|---|
| Returns: | params : mapping of string to any. Parameter names mapped to their values. |
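As a brief illustration, the returned mapping holds the constructor arguments, so it can be used to build an equivalent unfitted instance. The variable names here are arbitrary.

```python
from sklearn.preprocessing import MultiLabelBinarizer

mlb = MultiLabelBinarizer(sparse_output=True)

# The mapping contains one entry per constructor parameter.
params = mlb.get_params()
print(params)

# Rebuild a fresh, unfitted binarizer with the same configuration.
copy_of_mlb = MultiLabelBinarizer(**params)
```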
inverse_transform(yt) [source]
Transform the given indicator matrix into label sets
| Parameters: | yt : array or sparse matrix of shape (n_samples, n_classes). A matrix containing only 1s and 0s. |
|---|---|
| Returns: | y : list of tuples. The set of labels for each sample such that y[i] consists of classes_[j] for each yt[i, j] == 1. |
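A round trip through the binarizer might look like the sketch below. The hand-built matrix is an assumption made for illustration; it must have one column per fitted class.

```python
import numpy as np
from sklearn.preprocessing import MultiLabelBinarizer

mlb = MultiLabelBinarizer()
yt = mlb.fit_transform([("sci-fi", "thriller"), ("comedy",)])

# Map the indicator matrix back to one tuple of labels per sample.
restored = mlb.inverse_transform(yt)
print(restored)

# A hand-built indicator row is decoded against the same fitted classes.
manual = mlb.inverse_transform(np.array([[1, 0, 1]]))
print(manual)
```

Labels within each returned tuple follow the column order of the fitted classes, not the order in which they appeared in the input.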
set_params(**params) [source]
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
| Returns: | self |
|---|---|
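Both forms of parameter update can be sketched as follows. The pipeline and its step names ("scale", "clf") are hypothetical and serve only to show the <component>__<parameter> syntax; they are not part of MultiLabelBinarizer.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MultiLabelBinarizer, StandardScaler

# Simple estimator: parameters are addressed by their plain names.
mlb = MultiLabelBinarizer()
returned = mlb.set_params(sparse_output=True)
print(returned is mlb)  # the call hands back the same instance

# Nested object: a hypothetical two-step pipeline whose components are
# addressed as <component>__<parameter>.
pipe = Pipeline([("scale", StandardScaler()), ("clf", LogisticRegression())])
pipe.set_params(clf__C=0.5, scale__with_mean=False)
print(pipe.named_steps["clf"].C)
```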
transform(y) [source]
Transform the given label sets
| Parameters: |
y : iterable of iterables A set of labels (any orderable and hashable object) for each sample. If the |
|---|---|
| Returns: |
y_indicator : array or CSR matrix, shape (n_samples, n_classes) A matrix such that |
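A typical use is to fit on one collection of label sets and then encode new samples against the same classes. In this sketch the data is made up, and the new samples use only labels that were seen during fitting, since the handling of unseen labels is not described here.

```python
from sklearn.preprocessing import MultiLabelBinarizer

# Learn the classes from training label sets (illustrative data).
mlb = MultiLabelBinarizer()
mlb.fit([("comedy",), ("sci-fi", "thriller")])

# Encode new samples using the columns learned above.
new_indicator = mlb.transform([("thriller", "comedy"), ()])
print(list(mlb.classes_))
print(new_indicator)
```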
© 2007–2017 The scikit-learn developers
Licensed under the 3-clause BSD License.
http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.MultiLabelBinarizer.html