scipy.stats.rv_discrete

class scipy.stats.rv_discrete(a=0, b=inf, name=None, badvalue=None, moment_tol=1e-08, values=None, inc=1, longname=None, shapes=None, extradoc=None)

A generic discrete random variable class meant for subclassing.

rv_discrete is a base class from which specific distribution classes and instances for discrete random variables are constructed. It can also be used to construct an arbitrary distribution defined by a list of support points and the corresponding probabilities.

Parameters:
  • a (float, optional) -- Lower bound of the support of the distribution, default: 0
  • b (float, optional) -- Upper bound of the support of the distribution, default: plus infinity
  • moment_tol (float, optional) -- The tolerance for the generic calculation of moments
  • values (tuple of two array_like, optional) -- (xk, pk), where xk are the points (integers) with positive probability and pk are the corresponding probabilities, with sum(pk) = 1
  • inc (integer, optional) -- Increment for the support of the distribution, default: 1. Other values have not been tested.
  • badvalue (object, optional) -- The value in (masked) arrays that indicates a value that should be ignored.
  • name (str, optional) -- The name of the instance. This string is used to construct the default example for distributions.
  • longname (str, optional) -- This string is used as part of the first line of the docstring returned when a subclass has no docstring of its own. Note: longname exists for backwards compatibility; do not use it for new subclasses.
  • shapes (str, optional) -- The shape of the distribution. For example, "m, n" for a distribution that takes two integers as the first two arguments for all its methods.
  • extradoc (str, optional) -- This string is used as the last part of the docstring returned when a subclass has no docstring of its own. Note: extradoc exists for backwards compatibility; do not use it for new subclasses.
generic.rvs(<shape(s)>, loc=0, size=1)

random variates

generic.pmf(x, <shape(s)>, loc=0)

probability mass function

generic.logpmf(x, <shape(s)>, loc=0)

log of the probability mass function

generic.cdf(x, <shape(s)>, loc=0)

cumulative distribution function

generic.logcdf(x, <shape(s)>, loc=0)

log of the cumulative distribution function

generic.sf(x, <shape(s)>, loc=0)

survival function (1 - cdf, but sometimes more accurate)

generic.logsf(x, <shape(s)>, loc=0)

log of the survival function

generic.ppf(q, <shape(s)>, loc=0)

percent point function (inverse of cdf, i.e. percentiles)

generic.isf(q, <shape(s)>, loc=0)

inverse survival function (inverse of sf)

generic.moment(n, <shape(s)>, loc=0)

non-central n-th moment of the distribution. May not work for array arguments.

generic.stats(<shape(s)>, loc=0, moments='mv')

mean('m'), variance('v'), skew('s'), and/or kurtosis('k')

generic.entropy(<shape(s)>, loc=0)

entropy of the RV

generic.expect(func=None, args=(), loc=0, lb=None, ub=None, conditional=False)

Expected value of a function with respect to the distribution, computed by summation over the support.

generic.median(<shape(s)>, loc=0)

Median of the distribution.

generic.mean(<shape(s)>, loc=0)

Mean of the distribution.

generic.std(<shape(s)>, loc=0)

Standard deviation of the distribution.

generic.var(<shape(s)>, loc=0)

Variance of the distribution.

generic.interval(alpha, <shape(s)>, loc=0)

Interval that contains a random realization of this distribution with probability alpha.

generic(<shape(s)>, loc=0)

calling a distribution instance returns a frozen distribution
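
As an illustration of how the methods listed above relate to one another, the following sketch builds a small distribution through the values keyword and checks a few identities. The support points, the probabilities, and the name `toy` are made up for this example; the probabilities are powers of two only so that the floating-point comparisons stay exact.

```python
import numpy as np
from scipy import stats

# Hypothetical four-point distribution (values chosen for illustration only).
xk = np.array([0, 1, 2, 3])
pk = np.array([0.125, 0.125, 0.25, 0.5])
toy = stats.rv_discrete(name="toy", values=(xk, pk))

cdf_vals = toy.cdf(xk)            # P(X <= k) at each support point
sf_vals = toy.sf(xk)              # P(X > k), i.e. 1 - cdf
roundtrip = toy.ppf(cdf_vals)     # ppf inverts cdf on the support points

median = toy.median()             # same as toy.ppf(0.5)
mean, var = toy.stats(moments="mv")
second_moment = toy.expect(lambda x: x ** 2)   # E[X**2], summed over the support
```

Here `var` should agree with `second_moment - mean**2`, which is a quick consistency check between stats and expect.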

Notes

You can construct an arbitrary discrete rv where P{X=xk} = pk by passing to the rv_discrete initialization method (through the values keyword) a tuple of sequences (xk, pk) that describes only those values of X (xk) that occur with nonzero probability (pk).
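
A minimal sketch of this construction, assuming a loaded six-sided die as the distribution (the name `loaded_die` and the probabilities are invented for the example):

```python
import numpy as np
from scipy import stats

# Support points and their probabilities; the probabilities must sum to 1.
xk = np.array([1, 2, 3, 4, 5, 6])
pk = np.array([0.1, 0.1, 0.1, 0.1, 0.1, 0.5])

loaded_die = stats.rv_discrete(name="loaded_die", values=(xk, pk))

pmf_at_six = loaded_die.pmf(6)        # probability of rolling a six
pmf_off_support = loaded_die.pmf(7)   # points not listed in xk get probability 0
cdf_at_three = loaded_die.cdf(3)      # P(X <= 3)
mean_value = loaded_die.mean()        # equals sum(xk * pk)
```

Only the points with nonzero probability need to be listed; any other point is treated as having probability zero.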

To create a new discrete distribution, we would do the following:

class poisson_gen(rv_discrete):
    """Poisson distribution"""
    def _pmf(self, k, mu):
        ...

and create an instance:

poisson = poisson_gen(name="poisson",
                      longname='A Poisson')

The docstring can be created from a template.
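
The subclassing pattern can be made concrete with a runnable sketch. The `_pmf` body below is one possible way to write the Poisson mass function and is not necessarily how SciPy's own `poisson` is implemented; the names `poisson_sketch_gen` and `poisson_sketch` are hypothetical.

```python
import numpy as np
from scipy import stats
from scipy.special import gammaln


class poisson_sketch_gen(stats.rv_discrete):
    """Poisson distribution (illustrative re-implementation)."""

    def _pmf(self, k, mu):
        # mu**k * exp(-mu) / k!, evaluated in log space for numerical stability
        return np.exp(k * np.log(mu) - mu - gammaln(k + 1))


# The default support a=0, b=inf matches the Poisson distribution.
poisson_sketch = poisson_sketch_gen(name="poisson_sketch")

p3 = poisson_sketch.pmf(3, 2.0)   # P(X = 3) for mu = 2
c3 = poisson_sketch.cdf(3, 2.0)   # generic cdf, derived from _pmf by summation
```

Only `_pmf` is overridden here; the remaining public methods fall back on generic implementations built on top of it, which can be slower than a hand-written `_cdf` or `_ppf`.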

Alternatively, the object may be called (as a function) to fix the shape and location parameters, returning a "frozen" discrete RV object:

myrv = generic(<shape(s)>, loc=0)
    - frozen RV object with the same methods but holding the given
      shape and location fixed.
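
For a concrete case, the sketch below uses `scipy.stats.poisson` as a stand-in for `generic`; the parameter values are arbitrary.

```python
from scipy import stats

# Freeze the shape parameter mu=3.0 and the location loc=1.
rv = stats.poisson(3.0, loc=1)

# The frozen object no longer needs the parameters to be repeated.
frozen_pmf = rv.pmf(4)
direct_pmf = stats.poisson.pmf(4, 3.0, loc=1)

frozen_mean = rv.mean()   # mu + loc
```

The frozen and the direct call evaluate the same function; freezing is mainly a convenience when the same parameters are used repeatedly.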

A note on shapes: subclasses need not specify them explicitly. In this case, the shapes will be automatically deduced from the signatures of the overridden methods. If, for some reason, you prefer to avoid relying on introspection, you can specify shapes explicitly as an argument to the instance constructor.
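
Both ways of specifying shapes can be compared side by side. This is a sketch under the assumption of a simple geometric distribution on 1, 2, 3, ...; the class name `geom_sketch_gen` is hypothetical.

```python
from scipy import stats


class geom_sketch_gen(stats.rv_discrete):
    def _pmf(self, k, p):
        # P(X = k) = p * (1 - p)**(k - 1) for k = 1, 2, ...
        return p * (1 - p) ** (k - 1)


# Shapes deduced from the signature of _pmf (everything after k).
inferred = geom_sketch_gen(a=1, name="geom_sketch")

# Shapes given explicitly, bypassing introspection.
explicit = geom_sketch_gen(a=1, name="geom_sketch", shapes="p")

inferred_shapes = inferred.shapes
explicit_shapes = explicit.shapes
pmf_inferred = inferred.pmf(2, 0.5)
pmf_explicit = explicit.pmf(2, 0.5)
```

In both cases the instance ends up with a single shape parameter named p, so the two objects behave identically.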

Examples

Custom made discrete distribution:

>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>> from scipy import stats
>>> xk = np.arange(7)
>>> pk = (0.1, 0.2, 0.3, 0.1, 0.1, 0.1, 0.1)
>>> custm = stats.rv_discrete(name='custm', values=(xk, pk))
>>> h = plt.plot(xk, custm.pmf(xk))

Random number generation:

>>> R = custm.rvs(size=100)

Display frozen pmf:

>>> numargs = generic.numargs
>>> [ <shape(s)> ] = ['Replace with reasonable value', ]*numargs
>>> rv = generic(<shape(s)>)
>>> x = np.arange(0, min(rv.dist.b, 3)+1)
>>> h = plt.plot(x, rv.pmf(x))

Here, rv.dist.b is the right endpoint of the support of rv.dist.

Check accuracy of cdf and ppf:

>>> prb = generic.cdf(x, <shape(s)>)
>>> h = plt.semilogy(np.abs(x-generic.ppf(prb, <shape(s)>))+1e-20)
__init__(a=0, b=inf, name=None, badvalue=None, moment_tol=1e-08, values=None, inc=1, longname=None, shapes=None, extradoc=None)

Methods

__init__([a, b, name, badvalue, moment_tol, ...])
cdf(k, *args, **kwds) Cumulative distribution function of the given RV.
entropy(*args, **kwds) Entropy of the RV.
expect([func, args, loc, lb, ub, conditional]) Calculate expected value of a function with respect to the distribution
freeze(*args, **kwds) Freeze the distribution for the given arguments.
interval(alpha, *args, **kwds) Confidence interval with equal areas around the median.
isf(q, *args, **kwds) Inverse survival function (inverse of sf) at q of the given RV.
logcdf(k, *args, **kwds) Log of the cumulative distribution function at k of the given RV
logpmf(k, *args, **kwds) Log of the probability mass function at k of the given RV.
logsf(k, *args, **kwds) Log of the survival function of the given RV.
mean(*args, **kwds) Mean of the distribution
median(*args, **kwds) Median of the distribution.
moment(n, *args, **kwds) n'th order non-central moment of distribution.
pmf(k, *args, **kwds) Probability mass function at k of the given RV.
ppf(q, *args, **kwds) Percent point function (inverse of cdf) at q of the given RV
rvs(*args, **kwargs) Random variates of given type.
sf(k, *args, **kwds) Survival function (1-cdf) at k of the given RV.
stats(*args, **kwds) Some statistics of the given RV
std(*args, **kwds) Standard deviation of the distribution.
var(*args, **kwds) Variance of the distribution