KernelDensity

class pyspark.mllib.stat.KernelDensity

Estimate the probability density at the requested points, given an RDD of samples drawn from the population. The estimate is built from Gaussian kernels centred on the samples.

Examples

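>>> from pyspark.mllib.stat import KernelDensity
>>> kd = KernelDensity()
>>> sample = sc.parallelize([0.0, 1.0])
>>> kd.setSample(sample)
>>> kd.setBandwidth(3.0)  # the output below corresponds to a bandwidth of 3.0, not the default 1.0
>>> kd.estimate([0.0, 1.0])
array([ 0.12938758,  0.12938758])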

Methods

estimate(points)

Estimate the probability density at the given points.

setBandwidth(bandwidth)

Set bandwidth of each sample.

setSample(sample)

Set sample points from the population.

Methods Documentation

estimate(points)

Estimate the probability density at each of the given points. Returns an array with one density value per point.
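To make the computation concrete, here is a minimal pure-Python sketch of a Gaussian kernel density estimate. It is not the library's implementation: the helper name `gaussian_kde` is hypothetical, and the sketch assumes that the bandwidth acts as the standard deviation of a Gaussian kernel and that the density at a point is the average of the kernels over all samples.

```python
import math


def gaussian_kde(sample, points, bandwidth=1.0):
    """Hypothetical reference helper: average Gaussian kernel over the sample.

    Assumes `bandwidth` is the standard deviation of each Gaussian kernel.
    """
    # Normalising constant of a Gaussian with standard deviation `bandwidth`
    norm = 1.0 / (bandwidth * math.sqrt(2.0 * math.pi))
    n = len(sample)
    densities = []
    for x in points:
        # Sum the unnormalised kernel contribution of every sample at x
        total = sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in sample)
        densities.append(norm * total / n)
    return densities


# Same sample and evaluation points as the class example, with bandwidth 3.0
densities = gaussian_kde([0.0, 1.0], [0.0, 1.0], bandwidth=3.0)
print(densities)  # both values are approximately 0.1294
```

Under these assumptions the sketch reproduces the values shown in the class example when the bandwidth is 3.0. The two points get the same density because each one sits on top of one sample and at distance 1 from the other.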

setBandwidth(bandwidth)

Set the bandwidth of the kernel placed on each sample. Defaults to 1.0.
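The effect of the bandwidth can be illustrated with a small standalone sketch. It assumes the bandwidth is the standard deviation of a Gaussian kernel; the helper `kernel_value` is hypothetical and only evaluates one kernel at a given distance from its centre.

```python
import math


def kernel_value(distance, bandwidth):
    """Hypothetical helper: Gaussian kernel with std `bandwidth`, at `distance` from its centre."""
    norm = 1.0 / (bandwidth * math.sqrt(2.0 * math.pi))
    return norm * math.exp(-0.5 * (distance / bandwidth) ** 2)


bandwidths = [0.5, 1.0, 3.0]
# Kernel height directly on top of a sample
peaks = [kernel_value(0.0, h) for h in bandwidths]
# Kernel height two units away from the sample
tails = [kernel_value(2.0, h) for h in bandwidths]

for h, peak, tail in zip(bandwidths, peaks, tails):
    print(f"bandwidth={h}: at sample={peak:.4f}, two units away={tail:.4f}")
```

A small bandwidth concentrates each sample's contribution near the sample itself, giving a spiky estimate. A large bandwidth spreads the contribution out, giving a smoother and flatter estimate.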

setSample(sample)

Set the sample points from the population. The sample must be an RDD.