
sklearn.preprocessing.MultiLabelBinarizer

class sklearn.preprocessing.MultiLabelBinarizer(classes=None, sparse_output=False)

Transform between iterable of iterables and a multilabel format

Although a list of sets or tuples is a very intuitive format for multilabel data, it is unwieldy to process. This transformer converts between this intuitive format and the supported multilabel format: a (samples x classes) binary matrix indicating the presence of a class label.

Parameters:


classes : array-like of shape [n_classes] (optional)

Indicates an ordering for the class labels

sparse_output : boolean (default: False)

Set to True if the output binary array is desired in CSR sparse format.

Attributes:

classes_ : array of labels

A copy of the classes parameter where provided, or otherwise, the sorted set of classes found when fitting.

Examples

>>> from sklearn.preprocessing import MultiLabelBinarizer
>>> mlb = MultiLabelBinarizer()
>>> mlb.fit_transform([(1, 2), (3,)])
array([[1, 1, 0],
       [0, 0, 1]])
>>> mlb.classes_
array([1, 2, 3])

>>> mlb.fit_transform([{'sci-fi', 'thriller'}, {'comedy'}])
array([[0, 1, 1],
       [1, 0, 0]])
>>> list(mlb.classes_)
['comedy', 'sci-fi', 'thriller']
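
The two constructor parameters can be combined. The following is a minimal sketch, assuming SciPy is installed alongside scikit-learn; the label names are reused from the example above and the variable name `y_sparse` is only illustrative. It fixes the column order through `classes` and requests sparse output:

```python
from scipy import sparse
from sklearn.preprocessing import MultiLabelBinarizer

# Fix the column order explicitly instead of relying on the sorted order
# found during fitting, and ask for a sparse result.
mlb = MultiLabelBinarizer(classes=["thriller", "sci-fi", "comedy"],
                          sparse_output=True)
y_sparse = mlb.fit_transform([{"sci-fi", "thriller"}, {"comedy"}])

print(list(mlb.classes_))           # ['thriller', 'sci-fi', 'comedy']
print(sparse.issparse(y_sparse))    # True
print(y_sparse.toarray().tolist())  # [[1, 1, 0], [0, 0, 1]]
```

Passing `classes` is mainly useful when the column layout must stay stable across datasets that may not each contain every label.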

Methods

`fit`(y)	Fit the label sets binarizer, storing `classes_`
`fit_transform`(y)	Fit the label sets binarizer and transform the given label sets
`get_params`([deep])	Get parameters for this estimator.
`inverse_transform`(yt)	Transform the given indicator matrix into label sets
`set_params`(**params)	Set the parameters of this estimator.
`transform`(y)	Transform the given label sets

__init__(classes=None, sparse_output=False)

fit(y)

Fit the label sets binarizer, storing classes_

Parameters:


y : iterable of iterables

A set of labels (any orderable and hashable object) for each sample. If the classes parameter is set, y will not be iterated.

Returns:

self : returns this MultiLabelBinarizer instance
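
As a rough illustration of the behaviour described for `fit`, the sketch below fits once without `classes` (labels are collected from `y` and sorted) and once with `classes` set (here `y` is an empty list, since it is not iterated). The sample data and variable names are made up for the example:

```python
from sklearn.preprocessing import MultiLabelBinarizer

# Without `classes`, fit() collects the labels from y and sorts them.
samples = [("b", "a"), ("c",), ()]
mlb = MultiLabelBinarizer()
returned = mlb.fit(samples)      # fit returns the instance itself
learned = list(mlb.classes_)     # ['a', 'b', 'c']

# With `classes` set, the ordering comes from the parameter and y is ignored.
fixed = MultiLabelBinarizer(classes=["x", "y"]).fit([])
fixed_classes = list(fixed.classes_)  # ['x', 'y']
```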

fit_transform(y)

Fit the label sets binarizer and transform the given label sets

Parameters:


y : iterable of iterables

A set of labels (any orderable and hashable object) for each sample. If the classes parameter is set, y will not be iterated.

Returns:

y_indicator : array or CSR matrix, shape (n_samples, n_classes)

A matrix such that y_indicator[i, j] = 1 iff classes_[j] is in y[i], and 0 otherwise.
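
The "iff" relationship in the return description can be checked directly. This is only a small sketch with invented colour labels, not part of the library's own test suite:

```python
from sklearn.preprocessing import MultiLabelBinarizer

y = [{"red", "blue"}, {"green"}, set()]   # the last sample has no labels
mlb = MultiLabelBinarizer()
y_indicator = mlb.fit_transform(y)

# Each cell is 1 exactly when the column's class is in the sample's label set.
for i, labels in enumerate(y):
    for j, cls in enumerate(mlb.classes_):
        assert y_indicator[i, j] == (1 if cls in labels else 0)

print(list(mlb.classes_))    # ['blue', 'green', 'red']
print(y_indicator.tolist())  # [[1, 0, 1], [0, 1, 0], [0, 0, 0]]
```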

get_params(deep=True)

Get parameters for this estimator.

Parameters:


deep : boolean, optional

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:

params : mapping of string to any

Parameter names mapped to their values.
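
For this class the returned mapping contains just the two constructor parameters. A short sketch:

```python
from sklearn.preprocessing import MultiLabelBinarizer

mlb = MultiLabelBinarizer(sparse_output=True)
params = mlb.get_params()

print(params)  # {'classes': None, 'sparse_output': True}
```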

inverse_transform(yt)

Transform the given indicator matrix into label sets

Parameters:


yt : array or sparse matrix of shape (n_samples, n_classes)

A matrix containing only 1s and 0s.

Returns:

y : list of tuples

The set of labels for each sample such that y[i] consists of classes_[j] for each yt[i, j] == 1.
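
A round trip illustrates the mapping back to label tuples. The sketch below reuses the genre labels from the Examples section; the hand-written matrix `manual` is an assumption made purely for demonstration:

```python
import numpy as np
from sklearn.preprocessing import MultiLabelBinarizer

mlb = MultiLabelBinarizer()
yt = mlb.fit_transform([("sci-fi", "thriller"), ("comedy",)])

# Round trip: each tuple lists classes_[j] for every column j that holds a 1.
recovered = mlb.inverse_transform(yt)
print(recovered)  # [('sci-fi', 'thriller'), ('comedy',)]

# Any 0/1 matrix with n_classes columns can be decoded, including empty rows.
manual = np.array([[1, 0, 1], [0, 0, 0]])
decoded = mlb.inverse_transform(manual)
print(decoded)    # [('comedy', 'thriller'), ()]
```

Labels inside each tuple follow the order of `classes_`, not the order in which they appeared in the original input.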

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Returns:

self : returns this estimator instance
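
Because `MultiLabelBinarizer` has no nested components, only the simple form applies here. A minimal sketch of updating a parameter after construction:

```python
from sklearn.preprocessing import MultiLabelBinarizer

mlb = MultiLabelBinarizer()
returned = mlb.set_params(sparse_output=True)  # returns the same instance

print(returned is mlb)                    # True
print(mlb.get_params()["sparse_output"])  # True
```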

transform(y)

Transform the given label sets

Parameters:


y : iterable of iterables

A set of labels (any orderable and hashable object) for each sample. If the classes parameter is set, y will not be iterated.

Returns:

y_indicator : array or CSR matrix, shape (n_samples, n_classes)

A matrix such that y_indicator[i, j] = 1 iff classes_[j] is in y[i], and 0 otherwise.
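
A typical use is to fit on one collection of label sets and then encode another with the same column layout. The sketch below uses invented animal labels and deliberately restricts the new data to labels seen during fitting, since the handling of unseen labels has differed between scikit-learn versions:

```python
from sklearn.preprocessing import MultiLabelBinarizer

# Learn the column layout from the training label sets.
mlb = MultiLabelBinarizer().fit([("cat", "dog"), ("bird",)])

# Encode new samples using the columns learned above.
encoded = mlb.transform([("dog",), ("bird", "cat"), ()])

print(list(mlb.classes_))  # ['bird', 'cat', 'dog']
print(encoded.tolist())    # [[0, 0, 1], [1, 1, 0], [0, 0, 0]]
```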

© 2007–2017 The scikit-learn developers
Licensed under the 3-clause BSD License.
http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.MultiLabelBinarizer.html