composeml.LabelTimes.bin

LabelTimes.bin(bins, quantiles=False, labels=None, right=True, precision=3)

Bin labels into discrete intervals.

Parameters
  • bins (int or array) – The criteria to bin by.

    As an integer, the value is the number of equal-width or quantile-based bins:
      • If quantiles is False, the value is the number of equal-width bins. The range is extended by 0.1% on each side to include the minimum and maximum values.
      • If quantiles is True, the value is the number of quantiles (e.g. 10 for deciles, 4 for quartiles).

    As an array, the value is a set of custom or quantile-based edges:
      • If quantiles is False, the value is the bin edges, which allows non-uniform widths. The range is not extended.
      • If quantiles is True, the value is the bin edges given as an array of quantiles (e.g. [0, .25, .5, .75, 1.] for quartiles).

  • quantiles (bool) – Determines whether to use a quantile-based discretization function.

  • labels (array) – Specifies the labels for the returned bins. Must be the same length as the resulting bins.

  • right (bool) – Indicates whether each bin includes its rightmost edge. Does not apply to quantile-based bins.

  • precision (int) – The precision at which to store and display the bin labels. Default value is 3.

Returns

Instance of labels.

Return type

LabelTimes

Examples

These are the target values for the examples.

>>> data = [226.93, 47.95, 283.46, 31.54]
>>> lt = LabelTimes({'target': data})
>>> lt
   target
0  226.93
1   47.95
2  283.46
3   31.54

Bin values using equal widths.

>>> lt.bin(2)
            target
0  (157.5, 283.46]
1  (31.288, 157.5]
2  (157.5, 283.46]
3  (31.288, 157.5]
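
The edges shown above can be reproduced by hand. The following is a minimal arithmetic sketch, not the library's implementation: it assumes the lower edge is the minimum minus 0.1% of the range and that the split point is the midpoint of the range, which matches the displayed output (the upper edge in the output equals the maximum, 283.46). The variable names are illustrative.

```python
data = [226.93, 47.95, 283.46, 31.54]

low, high = min(data), max(data)
span = high - low  # width of the full range of target values

# Lower edge: the minimum pushed down by 0.1% of the range,
# rounded to the default precision of 3.
lower_edge = round(low - 0.001 * span, 3)

# Two equal-width bins split the range at its midpoint.
midpoint = round((low + high) / 2, 3)
```

Here lower_edge and midpoint correspond to the 31.288 and 157.5 edges in the output above.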

Bin values using custom widths.

>>> lt.bin([0, 200, 400])
       target
0  (200, 400]
1    (0, 200]
2  (200, 400]
3    (0, 200]
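
The right parameter matters only for values that fall exactly on an edge. The sketch below illustrates this with pandas.cut directly rather than with LabelTimes.bin; it is an assumption that the two treat right the same way. The names edges and boundary are illustrative.

```python
import pandas as pd

edges = [0, 200, 400]
boundary = pd.Series([200.0])  # a value sitting exactly on the middle edge

# right=True: bins are closed on the right, so 200 belongs to (0, 200].
closed_right = pd.cut(boundary, bins=edges, right=True).iloc[0]

# right=False: bins are closed on the left, so 200 belongs to [200, 400).
closed_left = pd.cut(boundary, bins=edges, right=False).iloc[0]
```

Each result is a pandas Interval whose left, right, and closed attributes show which bin the boundary value landed in.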

Bin values using infinite edges.

>>> lt.bin(['-inf', 100, 'inf'])
          target
0   (100.0, inf]
1  (-inf, 100.0]
2   (100.0, inf]
3  (-inf, 100.0]

Bin values using quartiles.

>>> lt.bin(4, quantiles=True)
                         target
0             (137.44, 241.062]
1              (43.848, 137.44]
2             (241.062, 283.46]
3  (31.538999999999998, 43.848]
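
The quartile edges above are the quantiles of the target values. As a rough check, and assuming linear interpolation between sorted values (the pandas default), they can be computed with Series.quantile. This is an illustration, not the library's code path.

```python
import pandas as pd

target = pd.Series([226.93, 47.95, 283.46, 31.54])

# Quantiles at 0%, 25%, 50%, 75% and 100% give the five edges
# that delimit four quartile bins.
edges = target.quantile([0, .25, .5, .75, 1.]).tolist()
```

Rounded to the default precision of 3, the interior edges line up with the 43.848, 137.44, and 241.062 values shown in the output.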

Bin values using custom quantiles with precision.

>>> lt.bin([0, .5, 1], quantiles=True, precision=1)
           target
0  (137.4, 283.5]
1   (31.4, 137.4]
2  (137.4, 283.5]
3   (31.4, 137.4]

Assign labels to bins.

>>> lt.bin(2, labels=['low', 'high'])
  target
0   high
1    low
2   high
3    low
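
The parameter combinations described on this page can be summarized in a short sketch built on pandas.cut and pandas.qcut. It is an assumption that LabelTimes.bin behaves like these functions; the helper name bin_values is hypothetical and is not part of composeml.

```python
import pandas as pd


def bin_values(values, bins, quantiles=False, labels=None, right=True, precision=3):
    """Hypothetical helper mirroring the parameters of LabelTimes.bin."""
    values = pd.Series(values)
    if quantiles:
        # Quantile-based bins: `bins` is a quantile count or an array of
        # quantiles. `right` is not used on this path.
        return pd.qcut(values, q=bins, labels=labels, precision=precision)
    if isinstance(bins, list):
        # Accept the string edges '-inf' and 'inf' used in the examples.
        bins = [float(edge) if edge in ('-inf', 'inf') else edge for edge in bins]
    # Equal-width bins (integer) or custom edges (array).
    return pd.cut(values, bins=bins, labels=labels, right=right, precision=precision)


data = [226.93, 47.95, 283.46, 31.54]
equal_width = bin_values(data, 2, labels=['low', 'high'])
quartiles = bin_values(data, 4, quantiles=True)
open_ended = bin_values(data, ['-inf', 100, 'inf'])
```

Each call returns a categorical Series, one bin or label per target value, in the same order as the input.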