composeml.LabelTimes.bin

LabelTimes.bin(bins, quantiles=False, labels=None, right=True, precision=3)

Bin labels into discrete intervals.

Parameters:
  • bins (int or array) – The criteria to bin by.
      As an integer, the value is the number of equal-width or quantile-based bins.
        • If quantiles is False, the value is the number of equal-width bins. The range is extended by 0.1% on each side to include the minimum and maximum values.
        • If quantiles is True, the value is the number of quantiles (e.g. 10 for deciles, 4 for quartiles).
      As an array, the value gives custom or quantile-based edges.
        • If quantiles is False, the value is the bin edges, which allows non-uniform widths. No extension is done.
        • If quantiles is True, the value is the bin edges expressed as an array of quantiles (e.g. [0, .25, .5, .75, 1.] for quartiles).
  • quantiles (bool) – Determines whether to use a quantile-based discretization function.
  • labels (array) – Specifies the labels for the returned bins. Must be the same length as the resulting bins.
  • right (bool) – Indicates whether each bin includes its rightmost edge. Does not apply to quantile-based bins.
  • precision (int) – The precision at which to store and display the bin labels. Default value is 3.
Returns:
  Instance of labels.

Return type:
  LabelTimes

Examples

These are the target values for the examples.

>>> data = [226.93, 47.95, 283.46, 31.54]
>>> lt = LabelTimes({'target': data})
>>> lt
   target
0  226.93
1   47.95
2  283.46
3   31.54

Bin values using equal widths.

>>> lt.bin(2)
            target
0  (157.5, 283.46]
1  (31.288, 157.5]
2  (157.5, 283.46]
3  (31.288, 157.5]
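
The lower edge 31.288 in the output above sits slightly below the minimum value 31.54. The following is a minimal standard-library sketch of how equal-width edges with that offset could be computed. The function name equal_width_edges is hypothetical, and lowering only the first edge by 0.1% of the range is an assumption chosen to match the displayed output, not a statement of how the library is implemented.

```python
def equal_width_edges(values, n_bins, precision=3):
    """Hypothetical helper: equal-width bin edges for right-closed bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    edges = [lo + i * width for i in range(n_bins + 1)]
    edges[-1] = hi  # avoid floating-point drift on the last edge
    # Assumption: lower the first edge by 0.1% of the range so that the
    # minimum value falls inside the first right-closed interval.
    edges[0] = lo - 0.001 * (hi - lo)
    return [round(e, precision) for e in edges]


data = [226.93, 47.95, 283.46, 31.54]
edges = equal_width_edges(data, 2)
print(edges)
```

Under these assumptions the edges agree with the intervals (31.288, 157.5] and (157.5, 283.46] shown above.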

Bin values using custom widths.

>>> lt.bin([0, 200, 400])
       target
0  (200, 400]
1    (0, 200]
2  (200, 400]
3    (0, 200]

Bin values using infinite edges.

>>> lt.bin(['-inf', 100, 'inf'])
          target
0   (100.0, inf]
1  (-inf, 100.0]
2   (100.0, inf]
3  (-inf, 100.0]
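
To illustrate what right-closed intervals with infinite edges mean, here is a small standard-library sketch. The helper assign_bins is hypothetical and is not part of the library; it only mirrors the behavior described for right=True, where each bin has the form (lo, hi].

```python
from bisect import bisect_left


def assign_bins(values, edges):
    """Hypothetical helper: map each value to its right-closed bin (lo, hi]."""
    # float() accepts the strings '-inf' and 'inf' as well as plain numbers.
    edges = [float(e) for e in edges]
    result = []
    for v in values:
        # bisect_left places a value equal to an edge in the bin that ends there.
        i = bisect_left(edges, v) - 1
        if 0 <= i < len(edges) - 1:
            result.append((edges[i], edges[i + 1]))
        else:
            result.append(None)  # value falls outside every bin
    return result


data = [226.93, 47.95, 283.46, 31.54]
intervals = assign_bins(data, ['-inf', 100, 'inf'])
boundary = assign_bins([100], ['-inf', 100, 'inf'])
print(intervals)
print(boundary)
```

A value of exactly 100 lands in (-inf, 100.0] because the right edge is included.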

Bin values using quartiles.

>>> lt.bin(4, quantiles=True)
                         target
0             (137.44, 241.062]
1              (43.848, 137.44]
2             (241.062, 283.46]
3  (31.538999999999998, 43.848]
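
The interior quartile edges above are consistent with linear interpolation between the sorted target values. The sketch below reproduces them with the standard library's statistics.quantiles. Treating the "inclusive" method as equivalent to the interpolation used by this method is an assumption, and the variable names are illustrative only.

```python
import statistics

data = [226.93, 47.95, 283.46, 31.54]

# The "inclusive" method treats the data as containing its own minimum and
# maximum and interpolates linearly between neighbouring sorted values.
inner = statistics.quantiles(data, n=4, method="inclusive")

# Full list of edges: minimum, three interior cut points, maximum.
edges = [min(data)] + inner + [max(data)]
print(edges)
```

The three interior cut points come out near 43.8475, 137.44, and 241.0625, which the output above displays at a precision of 3.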

Bin values using custom quantiles with precision.

>>> lt.bin([0, .5, 1], quantiles=True, precision=1)
           target
0  (137.4, 283.5]
1   (31.4, 137.4]
2  (137.4, 283.5]
3   (31.4, 137.4]

Assign labels to bins.

>>> lt.bin(2, labels=['low', 'high'])
  target
0   high
1    low
2   high
3    low
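
The labels argument needs one label per bin, that is, one fewer than the number of edges. As a rough standard-library illustration (not the library's own code), the sketch below maps the target values to labels using the equal-width edges shown in the earlier example; the variable names are illustrative.

```python
from bisect import bisect_left

data = [226.93, 47.95, 283.46, 31.54]
edges = [31.288, 157.5, 283.46]  # edges taken from the equal-width example
labels = ['low', 'high']

# One label per bin: two bins are delimited by three edges.
assert len(labels) == len(edges) - 1

# Right-closed bins: a value equal to an edge belongs to the bin ending there.
named = [labels[bisect_left(edges, v) - 1] for v in data]
print(named)  # ['high', 'low', 'high', 'low']
```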