histogram

xarray_einstats.numba.histogram(da, dims, bins=None, density=False, **kwargs)

Numbify numpy.histogram to support vectorized histograms.

Parameters:
da : xarray.DataArray

Data to be binned.

dims : str or list of str

Dimensions that should be reduced by binning.

bins : array_like, int or str, optional

Passed to numpy.histogram_bin_edges. If None (the default), histogram_bin_edges is called without additional arguments. Bin edges are shared by all generated histograms.

density : bool, optional

If False, the result will contain the number of samples in each bin. If True, the result is the value of the probability density function at the bin, normalized such that the integral over the range is 1. Note that the sum of the histogram values will not be equal to 1 unless bins of unity width are chosen; it is not a probability mass function.

kwargs : dict, optional

Passed to xarray.apply_ufunc.
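
The difference between counts and density can be checked with plain NumPy. The sketch below assumes the density option follows numpy.histogram semantics, as the description above suggests; it calls numpy.histogram directly rather than this function, and the sample data is arbitrary.

```python
import numpy as np

# Arbitrary sample data, used only for illustration.
rng = np.random.default_rng(0)
samples = rng.normal(size=1000)

# Same bins, once as raw counts and once as a density.
counts, edges = np.histogram(samples, bins=20)
density, _ = np.histogram(samples, bins=20, density=True)
widths = np.diff(edges)

# The density integrates to 1 over the binned range...
total_area = float(np.sum(density * widths))
# ...but its plain sum is generally not 1 unless every bin has width 1.
plain_sum = float(np.sum(density))

print(total_area)
print(plain_sum)
```

Multiplying each density value by its bin width recovers the fraction of samples in that bin.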

Returns:
xarray.DataArray

Returns a DataArray with the histogram results. The dimensions provided in dims are reduced into a bin dimension (without coordinates). The bin edges are returned as coordinate values indexed by the bin dimension: left bin edges are stored as left_edges and right bin edges as right_edges.
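
Because the edges travel with the result as coordinates, quantities such as bin widths or bin centers can be derived from the returned object alone. The sketch below does not call histogram; it builds a small DataArray by hand with the layout described above (a bin dimension plus left_edges and right_edges coordinates) and made-up counts, so treat it as an illustration of the layout rather than real output.

```python
import numpy as np
import xarray as xr

# Hand-built stand-in for a histogram result; the counts are made up.
edges = np.linspace(0.0, 5.0, 11)
counts = np.array(
    [
        [4, 9, 12, 8, 4, 2, 1, 0, 0, 0],
        [2, 6, 10, 11, 6, 3, 1, 1, 0, 0],
    ],
    dtype=float,
)
hist = xr.DataArray(
    counts,
    dims=("match", "bin"),
    coords={
        "left_edges": ("bin", edges[:-1]),
        "right_edges": ("bin", edges[1:]),
    },
)

# Bin widths and centers come straight from the edge coordinates.
widths = hist["right_edges"] - hist["left_edges"]
centers = (hist["left_edges"] + hist["right_edges"]) / 2

# Convert counts to a density per match, normalizing over the bin dimension.
density = hist / (hist.sum("bin") * widths)
print(centers.values)
```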

See also

xhistogram.xarray.histogram

Alternative implementation (with some different features) of xarray aware histogram.

Examples

Use histogram to compute multiple histograms in a vectorized fashion, binning along both the chain and draw dimensions but not along match. Consequently, histogram generates one independent histogram per match:

from xarray_einstats import tutorial, numba
ds = tutorial.generate_mcmc_like_dataset(3)
numba.histogram(ds["score"], dims=("chain", "draw"))
<xarray.DataArray 'score' (match: 12, bin: 10)>
13.0 0.0 17.0 0.0 7.0 0.0 2.0 0.0 1.0 ... 0.0 17.0 0.0 10.0 0.0 1.0 0.0 0.0 0.0
Coordinates:
    left_edges   (bin) float64 0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5
    right_edges  (bin) float64 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0
Dimensions without coordinates: match, bin

Note how the return is a single DataArray, not an array with the histogram and another with the bin edges. That is because the bin edges are included as coordinate values.
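
For intuition, the same kind of result can be written as an explicit NumPy loop. This is only a sketch of the idea, not the library's implementation: the score array is a hypothetical stand-in with an assumed (chain, draw, match) layout, and computing the shared edges from the whole array is an assumption based on the bins description.

```python
import numpy as np

# Hypothetical stand-in data with shape (chain, draw, match).
rng = np.random.default_rng(0)
score = rng.poisson(1.5, size=(4, 10, 12)).astype(float)

# One set of bin edges shared by every histogram (assumed to be
# computed from the full, flattened array).
edges = np.histogram_bin_edges(score, bins=10)

# Move match to the front and flatten chain and draw together,
# giving one row of samples per match.
per_match = np.moveaxis(score, -1, 0).reshape(score.shape[-1], -1)

# One histogram per match, all using the shared edges.
hists = np.stack([np.histogram(row, bins=edges)[0] for row in per_match])

left_edges, right_edges = edges[:-1], edges[1:]
print(hists.shape)
```

The loop returns the counts and the edges as separate arrays, which is exactly the bookkeeping that the single DataArray with left_edges and right_edges coordinates avoids.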