org.apache.spark.mllib.feature
RDD of LabeledPoint
Run the entropy minimization discretizer on input data.
Indices of the features to discretize (if not specified, the algorithm tries to infer them).
Maximum number of elements to keep in each partition.
Maximum number of thresholds per feature.
A discretization model containing the thresholds for each feature.
Entropy minimization discretizer based on the Minimum Description Length Principle (MDLP), proposed by Fayyad and Irani in 1993 [1].
[1] Fayyad, U., & Irani, K. (1993). "Multi-interval discretization of continuous-valued attributes for classification learning."
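
As a rough illustration of the MDLP criterion from [1], the sketch below checks whether a single candidate cut point should be accepted, given the per-class counts on each side of the cut. It is a minimal, non-distributed sketch and not this package's implementation: the names `MdlpCriterionSketch`, `entropy`, and `acceptsCut` are hypothetical, and the two count sequences are assumed to list the classes in the same order.

```scala
// Hypothetical helper, not part of org.apache.spark.mllib.feature.
object MdlpCriterionSketch {
  private def log2(x: Double): Double = math.log(x) / math.log(2.0)

  /** Class entropy (in bits) of a set described by its per-class counts. */
  def entropy(counts: Seq[Long]): Double = {
    val n = counts.sum.toDouble
    counts.filter(_ > 0).map { c =>
      val p = c / n
      -p * log2(p)
    }.sum
  }

  /**
   * MDLP acceptance test for one candidate cut.
   * `left` and `right` hold the per-class counts on each side of the cut
   * (assumed to be aligned by class index).
   */
  def acceptsCut(left: Seq[Long], right: Seq[Long]): Boolean = {
    val total = left.zip(right).map { case (l, r) => l + r }
    val n = total.sum.toDouble
    val n1 = left.sum.toDouble
    val n2 = right.sum.toDouble
    if (n1 == 0.0 || n2 == 0.0) {
      false // a cut that leaves one side empty separates nothing
    } else {
      val ent = entropy(total)
      val ent1 = entropy(left)
      val ent2 = entropy(right)
      // Information gain of the cut.
      val gain = ent - (n1 / n) * ent1 - (n2 / n) * ent2
      // Number of classes actually present in each set.
      val k = total.count(_ > 0)
      val k1 = left.count(_ > 0)
      val k2 = right.count(_ > 0)
      // Delta term of the MDLP criterion.
      val delta = log2(math.pow(3.0, k) - 2.0) - (k * ent - k1 * ent1 - k2 * ent2)
      // Accept the cut only if the gain exceeds the description-length cost.
      gain > (log2(n - 1.0) + delta) / n
    }
  }
}
```

For example, `MdlpCriterionSketch.acceptsCut(Seq(50L, 0L), Seq(0L, 50L))` accepts the cut because it separates the two classes perfectly, while `MdlpCriterionSketch.acceptsCut(Seq(25L, 25L), Seq(25L, 25L))` rejects it because the cut yields no information gain.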