Getting started

Welcome to `xarray-einstats`!

xarray-einstats is an open source Python library that is part of the ArviZ project. It acts as a bridge between the xarray library for labelled arrays and libraries for raw arrays such as NumPy or SciPy.

Xarray has “Compatibility with the broader ecosystem” as one of its main goals, which is what allows xarray-einstats to play this bridge role with minimal code and duplication.

Overview

xarray-einstats provides wrappers for:

Most of the functions in numpy.linalg
A subset of scipy.stats
rearrange and reduce from einops

These wrappers have the same names and functionality as the original functions. The difference in behaviour is that the wrappers make no assumptions about the meaning of a dimension based on its position, nor do they have arguments like axis or axes. Instead, they have a dims argument that takes dimension names rather than integers indicating the positions of the dimensions on which to act.
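
To illustrate the difference between positional axis arguments and name-based dims, here is a minimal sketch that uses only NumPy and xarray. The helper `reduce_over_dims` is hypothetical and is not the library's implementation; it only shows how a reduction can be addressed by dimension name, so that the result does not depend on the order of the axes.

```python
import numpy as np
import xarray as xr


def reduce_over_dims(func, da, dims):
    """Hypothetical helper: run an axis-based NumPy reduction over named dims."""
    if isinstance(dims, str):
        dims = [dims]
    # Collapse the requested dimensions into a single temporary one.
    stacked = da.stack(_reduced=list(dims))
    # apply_ufunc moves the core dimension to the last axis, so axis=-1 is safe.
    return xr.apply_ufunc(
        func, stacked, input_core_dims=[["_reduced"]], kwargs={"axis": -1}
    )


values = np.arange(24.0).reshape(2, 3, 4)
da = xr.DataArray(values, dims=("batch", "row", "col"))

# NumPy needs positions: axes 0 and 2 happen to be "batch" and "col".
by_position = values.mean(axis=(0, 2))

# With names, the same call works whatever the axis order is.
by_name = reduce_over_dims(np.mean, da, dims=["batch", "col"])
by_name_transposed = reduce_over_dims(
    np.mean, da.transpose("col", "row", "batch"), dims=["batch", "col"]
)
```

In this sketch `by_name` and `by_name_transposed` both keep only the `row` dimension and hold the same values as `by_position`, even though the second input has its axes in a different order.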

It also provides a handful of re-implemented functions:

xarray_einstats.numba.histogram
xarray_einstats.stats.multivariate_normal

These are partial reimplementations, needed because the original functions don’t yet support multidimensional and/or batched computations. Each shares its name with a function in NumPy or SciPy but implements only a subset of its features. The goal is for these to eventually become wrappers too.

Using `xarray-einstats`

DataArray inputs

Functions in xarray-einstats are designed to work on DataArray objects.

Let’s load some example data:

from xarray_einstats import linalg, stats, tutorial

da = tutorial.generate_matrices_dataarray(4)
da

<xarray.DataArray (batch: 10, experiment: 3, dim: 4, dim2: 4)>
3.799 0.4308 3.24 0.1412 0.9402 0.7951 ... 0.6156 1.124 0.8559 2.108 0.7637
Dimensions without coordinates: batch, experiment, dim, dim2

and show an example:

stats.skew(da, dims=["batch", "dim2"])

<xarray.DataArray (experiment: 3, dim: 4)>
1.256 1.432 0.9728 1.762 1.612 1.188 1.033 2.388 2.196 1.455 1.631 1.373
Dimensions without coordinates: experiment, dim

xarray-einstats uses dims as the argument name throughout the codebase, as an alternative to both axis and axes and also to the (..., M, M) convention used by NumPy.

The use of dims follows dot, rather than the singular dim argument used, for example, in mean. Either a single dimension or several are valid inputs, and using dims emphasizes that operations and reductions can be performed over multiple dimensions at once. Moreover, in linear algebra functions dims is often restricted to a 2-element list, since it indicates which dimensions define the matrices; all other dimensions are interpreted as batch dimensions.

That means that the two calls below are equivalent even though the dimension order of the inputs differs, because their dimension names are the same. Thus,

linalg.det(da, dims=["dim", "dim2"])

<xarray.DataArray (batch: 10, experiment: 3)>
23.55 2.033 0.3923 -7.374 0.06645 ... 1.804 -0.1599 8.875 -0.04935 -8.428
Dimensions without coordinates: batch, experiment

returns the same as:

linalg.det(da.transpose("dim2", "experiment", "dim", "batch"), dims=["dim", "dim2"])

<xarray.DataArray (experiment: 3, batch: 10)>
23.55 -7.374 -5.617 -12.29 1.77 -0.6289 ... -11.07 -0.5096 -28.77 -0.1599 -8.428
Dimensions without coordinates: experiment, batch

Important

In xarray_einstats only the dimension names matter, not their order.
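
The same idea can be reproduced with plain xarray and NumPy. The sketch below is an illustration under stated assumptions, not the library's code: `det_by_name` is a hypothetical helper that passes the matrix dimensions to `xr.apply_ufunc` as core dimensions, which moves them to the last two axes before calling `np.linalg.det`.

```python
import numpy as np
import xarray as xr


def det_by_name(da, dims):
    """Hypothetical helper: determinant over two named matrix dimensions."""
    # Core dims are moved to the last two axes, which is what
    # np.linalg.det expects under its (..., M, M) convention.
    return xr.apply_ufunc(np.linalg.det, da, input_core_dims=[list(dims)])


rng = np.random.default_rng(0)
da = xr.DataArray(
    rng.normal(size=(5, 3, 4, 4)), dims=("batch", "experiment", "dim", "dim2")
)

det_a = det_by_name(da, dims=["dim", "dim2"])
det_b = det_by_name(
    da.transpose("dim2", "experiment", "dim", "batch"), dims=["dim", "dim2"]
)
# The batch dimensions come out in a different order, but the values match.
same_values = np.allclose(
    det_a.values, det_b.transpose("batch", "experiment").values
)
```

Here `det_a` has dimensions (batch, experiment) and `det_b` has (experiment, batch), mirroring the two outputs above; once aligned by name they hold the same determinants.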

Dataset and GroupBy inputs

While the DataArray is the base xarray object, other xarray objects are also key when using the library. These objects, such as Dataset, are implemented as collections of DataArray objects, and all include a .map method that applies the same function to each of their child DataArrays.

ds = tutorial.generate_mcmc_like_dataset(9438)
ds

<xarray.Dataset>
Dimensions:  (plot_dim: 20, chain: 4, draw: 10, team: 6, match: 12)
Coordinates:
  * team     (team) <U1 'a' 'b' 'c' 'd' 'e' 'f'
  * chain    (chain) int64 0 1 2 3
  * draw     (draw) int64 0 1 2 3 4 5 6 7 8 9
Dimensions without coordinates: plot_dim, match
Data variables:
    x_plot   (plot_dim) float64 0.0 0.5263 1.053 1.579 ... 8.947 9.474 10.0
    mu       (chain, draw, team) float64 0.2691 0.1617 0.4371 ... 0.4673 1.844
    sigma    (chain, draw) float64 1.939 1.435 0.5109 ... 0.594 1.54 1.257
    score    (chain, draw, match) int64 0 2 3 0 0 0 0 0 2 ... 1 0 1 1 1 2 4 0 2

We can use .map to apply the same function to all 4 child DataArrays in ds, but this will not always be possible. When using .map, the function provided is applied to every child DataArray with the same **kwargs.

If we try doing:

ds.map(stats.circmean, dims=("chain", "draw"))

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[6], line 1
----> 1 ds.map(stats.circmean, dims=("chain", "draw"))

File ~/checkouts/readthedocs.org/user_builds/xarray-einstats/envs/latest/lib/python3.10/site-packages/xarray/core/dataset.py:6931, in Dataset.map(self, func, keep_attrs, args, **kwargs)
if keep_attrs is None:
   keep_attrs = _get_keep_attrs(default=False)
-> 6931 variables = {
   k: maybe_wrap_array(v, func(v, *args, **kwargs))
   for k, v in self.data_vars.items()
}
if keep_attrs:
   for k, v in variables.items():

File ~/checkouts/readthedocs.org/user_builds/xarray-einstats/envs/latest/lib/python3.10/site-packages/xarray/core/dataset.py:6932, in <dictcomp>(.0)
if keep_attrs is None:
   keep_attrs = _get_keep_attrs(default=False)
variables = {
-> 6932     k: maybe_wrap_array(v, func(v, *args, **kwargs))
   for k, v in self.data_vars.items()
}
if keep_attrs:
   for k, v in variables.items():

File ~/checkouts/readthedocs.org/user_builds/xarray-einstats/envs/latest/lib/python3.10/site-packages/xarray_einstats/stats.py:498, in circmean(da, dims, high, low, nan_policy, **kwargs)
if nan_policy is not None:
   circmean_kwargs["nan_policy"] = nan_policy
--> 498 return _apply_reduce_func(stats.circmean, da, dims, kwargs, circmean_kwargs)

File ~/checkouts/readthedocs.org/user_builds/xarray-einstats/envs/latest/lib/python3.10/site-packages/xarray_einstats/stats.py:445, in _apply_reduce_func(func, da, dims, kwargs, func_kwargs)
else:
   core_dims = [dims]
--> 445 out_da = xr.apply_ufunc(
   func, da, input_core_dims=[core_dims], output_core_dims=[[]], kwargs=func_kwargs, **kwargs
)
return out_da

File ~/checkouts/readthedocs.org/user_builds/xarray-einstats/envs/latest/lib/python3.10/site-packages/xarray/core/computation.py:1267, in apply_ufunc(func, input_core_dims, output_core_dims, exclude_dims, vectorize, join, dataset_join, dataset_fill_value, keep_attrs, kwargs, dask, output_dtypes, output_sizes, meta, dask_gufunc_kwargs, on_missing_core_dim, *args)
# feed DataArray apply_variable_ufunc through apply_dataarray_vfunc
elif any(isinstance(a, DataArray) for a in args):
-> 1267     return apply_dataarray_vfunc(
       variables_vfunc,
       *args,
       signature=signature,
       join=join,
       exclude_dims=exclude_dims,
       keep_attrs=keep_attrs,
   )
# feed Variables directly through apply_variable_ufunc
elif any(isinstance(a, Variable) for a in args):

File ~/checkouts/readthedocs.org/user_builds/xarray-einstats/envs/latest/lib/python3.10/site-packages/xarray/core/computation.py:315, in apply_dataarray_vfunc(func, signature, join, exclude_dims, keep_attrs, *args)
result_coords, result_indexes = build_output_coords_and_indexes(
   args, signature, exclude_dims, combine_attrs=keep_attrs
)
data_vars = [getattr(a, "variable", a) for a in args]
--> 315 result_var = func(*data_vars)
out: tuple[DataArray, ...] | DataArray
if signature.num_outputs > 1:

File ~/checkouts/readthedocs.org/user_builds/xarray-einstats/envs/latest/lib/python3.10/site-packages/xarray/core/computation.py:733, in apply_variable_ufunc(func, signature, exclude_dims, dask, output_dtypes, vectorize, keep_attrs, dask_gufunc_kwargs, *args)
broadcast_dims = tuple(
   dim for dim in dim_sizes if dim not in signature.all_core_dims
)
output_dims = [broadcast_dims + out for out in signature.output_core_dims]
--> 733 input_data = [
   broadcast_compat_data(arg, broadcast_dims, core_dims)
   if isinstance(arg, Variable)
   else arg
   for arg, core_dims in zip(args, signature.input_core_dims)
]
if any(is_chunked_array(array) for array in input_data):
   if dask == "forbidden":

File ~/checkouts/readthedocs.org/user_builds/xarray-einstats/envs/latest/lib/python3.10/site-packages/xarray/core/computation.py:734, in <listcomp>(.0)
broadcast_dims = tuple(
   dim for dim in dim_sizes if dim not in signature.all_core_dims
)
output_dims = [broadcast_dims + out for out in signature.output_core_dims]
input_data = [
--> 734     broadcast_compat_data(arg, broadcast_dims, core_dims)
   if isinstance(arg, Variable)
   else arg
   for arg, core_dims in zip(args, signature.input_core_dims)
]
if any(is_chunked_array(array) for array in input_data):
   if dask == "forbidden":

File ~/checkouts/readthedocs.org/user_builds/xarray-einstats/envs/latest/lib/python3.10/site-packages/xarray/core/computation.py:680, in broadcast_compat_data(variable, broadcast_dims, core_dims)
reordered_dims = old_broadcast_dims + core_dims
if reordered_dims != old_dims:
--> 680     order = tuple(old_dims.index(d) for d in reordered_dims)
   data = duck_array_ops.transpose(data, order)
if new_dims != reordered_dims:

File ~/checkouts/readthedocs.org/user_builds/xarray-einstats/envs/latest/lib/python3.10/site-packages/xarray/core/computation.py:680, in <genexpr>(.0)
reordered_dims = old_broadcast_dims + core_dims
if reordered_dims != old_dims:
--> 680     order = tuple(old_dims.index(d) for d in reordered_dims)
   data = duck_array_ops.transpose(data, order)
if new_dims != reordered_dims:

ValueError: tuple.index(x): x not in tuple

we get an exception, because the chain and draw dimensions are not present in all child DataArrays. Instead, we can apply the function only to the variables that have both the chain and draw dimensions.

ds_samples = ds[["mu", "sigma", "score"]]
ds_samples.map(stats.circmean, dims=("chain", "draw"))

<xarray.Dataset>
Dimensions:  (team: 6, match: 12)
Coordinates:
  * team     (team) <U1 'a' 'b' 'c' 'd' 'e' 'f'
Dimensions without coordinates: match
Data variables:
    mu       (team) float64 0.8221 0.7376 0.6195 0.7485 0.7439 0.7818
    sigma    float64 0.8134
    score    (match) float64 0.7441 0.3923 0.9316 0.6107 ... 0.5814 0.9538 0.94
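
Rather than listing the variable names by hand, the selection can also be derived from the dimensions themselves. This is a minimal sketch on a small stand-in dataset (the variable names only mirror the example above, and `DataArray.mean` stands in for any function that takes dimension names); it is an illustration, not a recommended API.

```python
import numpy as np
import xarray as xr

# Small stand-in dataset; the variable names mirror the example above.
ds = xr.Dataset(
    {
        "x_plot": (("plot_dim",), np.linspace(0, 10, 5)),
        "mu": (("chain", "draw", "team"), np.ones((2, 3, 4))),
        "sigma": (("chain", "draw"), np.ones((2, 3))),
    }
)

required = {"chain", "draw"}
# Keep only the variables that contain every dimension to reduce.
names = [name for name, var in ds.data_vars.items() if required <= set(var.dims)]

# Any function taking dimension names would do; DataArray.mean is used here
# only so the sketch runs without extra dependencies.
reduced = ds[names].map(lambda da: da.mean(dim=["chain", "draw"]))
```

With this filter `x_plot` is left out, so the mapped function never sees a variable that lacks the chain or draw dimension.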

Attention

In general, you should prefer the .map method over passing non-DataArray objects directly to xarray_einstats functions. .map ensures that no unexpected broadcasting takes place between the child DataArrays. See the examples below.

However, if you use functions that reduce dimensions on non-DataArray inputs whose child DataArrays all have every dimension to reduce, you will not trigger any such broadcasting. This behaviour is included in our test suite to ensure it stays this way.

It is also possible to do

stats.circmean(ds_samples, dims=("chain", "draw"))

<xarray.Dataset>
Dimensions:  (team: 6, match: 12)
Coordinates:
  * team     (team) <U1 'a' 'b' 'c' 'd' 'e' 'f'
Dimensions without coordinates: match
Data variables:
    mu       (team) float64 0.8221 0.7376 0.6195 0.7485 0.7439 0.7818
    sigma    float64 0.8134
    score    (match) float64 0.7441 0.3923 0.9316 0.6107 ... 0.5814 0.9538 0.94

Here, all child DataArrays have both the chain and draw dimensions so, as expected, the result is the same. There are some cases, however, in which not using .map triggers broadcasting operations whose output is generally not what you want.

If we use the .map method, the function is applied to each child DataArray independently of the others:

ds.map(stats.rankdata)

<xarray.Dataset>
Dimensions:  (plot_dim: 20, chain: 4, draw: 10, team: 6, match: 12)
Dimensions without coordinates: plot_dim, chain, draw, team, match
Data variables:
    x_plot   (plot_dim) float64 1.0 2.0 3.0 4.0 5.0 ... 16.0 17.0 18.0 19.0 20.0
    mu       (chain, draw, team) float64 65.0 41.0 89.0 ... 55.0 97.0 205.0
    sigma    (chain, draw) float64 33.0 30.0 15.0 5.0 ... 4.0 18.0 31.0 29.0
    score    (chain, draw, match) float64 105.0 401.0 457.0 ... 105.0 401.0

whereas without the .map method, extra broadcasting can happen:

stats.rankdata(ds)

<xarray.Dataset>
Dimensions:  (plot_dim: 20, chain: 4, draw: 10, team: 6, match: 12)
Dimensions without coordinates: plot_dim, chain, draw, team, match
Data variables:
    x_plot   (plot_dim, chain, draw, team, match) float64 1.44e+03 ... 5.616e+04
    mu       (plot_dim, chain, draw, team, match) float64 1.548e+04 ... 4.908...
    sigma    (plot_dim, chain, draw, team, match) float64 4.68e+04 ... 4.104e+04
    score    (plot_dim, chain, draw, team, match) float64 1.254e+04 ... 4.806...
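
The sketch below shows one way this kind of broadcasting can arise in plain xarray. It is an assumption made for illustration, not a description of the internals of stats.rankdata: stacking dimensions on a whole Dataset expands any variable that has some, but not all, of the stacked dimensions.

```python
import numpy as np
import xarray as xr

ds = xr.Dataset(
    {
        "mu": (("chain", "draw", "team"), np.zeros((2, 3, 4))),
        "sigma": (("chain", "draw"), np.zeros((2, 3))),
    }
)

# Handled one by one, each child only ever sees its own dimensions.
per_variable = {name: var.size for name, var in ds.data_vars.items()}

# Stacking at the Dataset level broadcasts sigma against team first,
# so it ends up with as many elements as mu.
stacked = ds.stack(sample=["chain", "draw", "team"])
stacked_sizes = {name: var.size for name, var in stacked.data_vars.items()}
```

In this sketch sigma grows from 6 to 24 elements once it is stacked together with mu, which is the kind of size inflation visible in the output above.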


array([[[[[46800.5, 46800.5, 46800.5, ..., 46800.5, 46800.5, 46800.5],
          [46800.5, 46800.5, 46800.5, ..., 46800.5, 46800.5, 46800.5],
          [46800.5, 46800.5, 46800.5, ..., 46800.5, 46800.5, 46800.5],
          [46800.5, 46800.5, 46800.5, ..., 46800.5, 46800.5, 46800.5],
          [46800.5, 46800.5, 46800.5, ..., 46800.5, 46800.5, 46800.5],
          [46800.5, 46800.5, 46800.5, ..., 46800.5, 46800.5, 46800.5]],

         [[42480.5, 42480.5, 42480.5, ..., 42480.5, 42480.5, 42480.5],
          [42480.5, 42480.5, 42480.5, ..., 42480.5, 42480.5, 42480.5],
          [42480.5, 42480.5, 42480.5, ..., 42480.5, 42480.5, 42480.5],
          [42480.5, 42480.5, 42480.5, ..., 42480.5, 42480.5, 42480.5],
          [42480.5, 42480.5, 42480.5, ..., 42480.5, 42480.5, 42480.5],
          [42480.5, 42480.5, 42480.5, ..., 42480.5, 42480.5, 42480.5]],

         [[20880.5, 20880.5, 20880.5, ..., 20880.5, 20880.5, 20880.5],
          [20880.5, 20880.5, 20880.5, ..., 20880.5, 20880.5, 20880.5],
          [20880.5, 20880.5, 20880.5, ..., 20880.5, 20880.5, 20880.5],
          [20880.5, 20880.5, 20880.5, ..., 20880.5, 20880.5, 20880.5],
          [20880.5, 20880.5, 20880.5, ..., 20880.5, 20880.5, 20880.5],
          [20880.5, 20880.5, 20880.5, ..., 20880.5, 20880.5, 20880.5]],
...
         [[25200.5, 25200.5, 25200.5, ..., 25200.5, 25200.5, 25200.5],
          [25200.5, 25200.5, 25200.5, ..., 25200.5, 25200.5, 25200.5],
          [25200.5, 25200.5, 25200.5, ..., 25200.5, 25200.5, 25200.5],
          [25200.5, 25200.5, 25200.5, ..., 25200.5, 25200.5, 25200.5],
          [25200.5, 25200.5, 25200.5, ..., 25200.5, 25200.5, 25200.5],
          [25200.5, 25200.5, 25200.5, ..., 25200.5, 25200.5, 25200.5]],

         [[43920.5, 43920.5, 43920.5, ..., 43920.5, 43920.5, 43920.5],
          [43920.5, 43920.5, 43920.5, ..., 43920.5, 43920.5, 43920.5],
          [43920.5, 43920.5, 43920.5, ..., 43920.5, 43920.5, 43920.5],
          [43920.5, 43920.5, 43920.5, ..., 43920.5, 43920.5, 43920.5],
          [43920.5, 43920.5, 43920.5, ..., 43920.5, 43920.5, 43920.5],
          [43920.5, 43920.5, 43920.5, ..., 43920.5, 43920.5, 43920.5]],

         [[41040.5, 41040.5, 41040.5, ..., 41040.5, 41040.5, 41040.5],
          [41040.5, 41040.5, 41040.5, ..., 41040.5, 41040.5, 41040.5],
          [41040.5, 41040.5, 41040.5, ..., 41040.5, 41040.5, 41040.5],
          [41040.5, 41040.5, 41040.5, ..., 41040.5, 41040.5, 41040.5],
          [41040.5, 41040.5, 41040.5, ..., 41040.5, 41040.5, 41040.5],
          [41040.5, 41040.5, 41040.5, ..., 41040.5, 41040.5, 41040.5]]]]])

score

(plot_dim, chain, draw, team, match)

float64

1.254e+04 4.806e+04 ... 4.806e+04

array([[[[[12540.5, 48060.5, 54780.5, ..., 12540.5, 33900.5, 12540.5],
          [12540.5, 48060.5, 54780.5, ..., 12540.5, 33900.5, 12540.5],
          [12540.5, 48060.5, 54780.5, ..., 12540.5, 33900.5, 12540.5],
          [12540.5, 48060.5, 54780.5, ..., 12540.5, 33900.5, 12540.5],
          [12540.5, 48060.5, 54780.5, ..., 12540.5, 33900.5, 12540.5],
          [12540.5, 48060.5, 54780.5, ..., 12540.5, 33900.5, 12540.5]],

         [[33900.5, 33900.5, 12540.5, ..., 33900.5, 33900.5, 56880.5],
          [33900.5, 33900.5, 12540.5, ..., 33900.5, 33900.5, 56880.5],
          [33900.5, 33900.5, 12540.5, ..., 33900.5, 33900.5, 56880.5],
          [33900.5, 33900.5, 12540.5, ..., 33900.5, 33900.5, 56880.5],
          [33900.5, 33900.5, 12540.5, ..., 33900.5, 33900.5, 56880.5],
          [33900.5, 33900.5, 12540.5, ..., 33900.5, 33900.5, 56880.5]],

         [[33900.5, 33900.5, 12540.5, ..., 33900.5, 33900.5, 33900.5],
          [33900.5, 33900.5, 12540.5, ..., 33900.5, 33900.5, 33900.5],
          [33900.5, 33900.5, 12540.5, ..., 33900.5, 33900.5, 33900.5],
          [33900.5, 33900.5, 12540.5, ..., 33900.5, 33900.5, 33900.5],
          [33900.5, 33900.5, 12540.5, ..., 33900.5, 33900.5, 33900.5],
          [33900.5, 33900.5, 12540.5, ..., 33900.5, 33900.5, 33900.5]],
...
         [[12540.5, 12540.5, 12540.5, ..., 12540.5, 33900.5, 48060.5],
          [12540.5, 12540.5, 12540.5, ..., 12540.5, 33900.5, 48060.5],
          [12540.5, 12540.5, 12540.5, ..., 12540.5, 33900.5, 48060.5],
          [12540.5, 12540.5, 12540.5, ..., 12540.5, 33900.5, 48060.5],
          [12540.5, 12540.5, 12540.5, ..., 12540.5, 33900.5, 48060.5],
          [12540.5, 12540.5, 12540.5, ..., 12540.5, 33900.5, 48060.5]],

         [[12540.5, 12540.5, 12540.5, ..., 12540.5, 33900.5, 48060.5],
          [12540.5, 12540.5, 12540.5, ..., 12540.5, 33900.5, 48060.5],
          [12540.5, 12540.5, 12540.5, ..., 12540.5, 33900.5, 48060.5],
          [12540.5, 12540.5, 12540.5, ..., 12540.5, 33900.5, 48060.5],
          [12540.5, 12540.5, 12540.5, ..., 12540.5, 33900.5, 48060.5],
          [12540.5, 12540.5, 12540.5, ..., 12540.5, 33900.5, 48060.5]],

         [[33900.5, 12540.5, 12540.5, ..., 56880.5, 12540.5, 48060.5],
          [33900.5, 12540.5, 12540.5, ..., 56880.5, 12540.5, 48060.5],
          [33900.5, 12540.5, 12540.5, ..., 56880.5, 12540.5, 48060.5],
          [33900.5, 12540.5, 12540.5, ..., 56880.5, 12540.5, 48060.5],
          [33900.5, 12540.5, 12540.5, ..., 56880.5, 12540.5, 48060.5],
          [33900.5, 12540.5, 12540.5, ..., 56880.5, 12540.5, 48060.5]]]]])

Indexes: (0)
Attributes: (0)

The behaviour on DataArrayGroupBy objects, for example, is very similar to what we have shown for Datasets:

da = ds["mu"].assign_coords(team=["a", "b", "b", "a", "c", "b"])
da

<xarray.DataArray 'mu' (chain: 4, draw: 10, team: 6)>
0.2691 0.1617 0.4371 0.4885 0.1836 2.149 ... 2.037 0.09032 0.2221 0.4673 1.844
Coordinates:
  * chain    (chain) int64 0 1 2 3
  * draw     (draw) int64 0 1 2 3 4 5 6 7 8 9
  * team     (team) <U1 'a' 'b' 'b' 'a' 'c' 'b'

array([[[2.69056877e-01, 1.61668482e-01, 4.37076245e-01, 4.88462056e-01,
         1.83607259e-01, 2.14929287e+00],
        [1.29624769e+00, 1.70485755e-01, 1.38613151e+00, 1.01921148e+00,
         2.89694957e-01, 2.48617359e-02],
        [6.73427333e-01, 9.46714172e-01, 7.36787238e-02, 1.09855063e+00,
         5.15092560e-01, 2.83864438e+00],
        [2.41990347e-01, 1.35565557e+00, 9.16673567e-02, 1.18105639e+00,
         2.79089061e-02, 1.94730217e-01],
        [4.52275630e-01, 9.12053418e-01, 2.52859833e-02, 4.33794112e-01,
         2.66568605e+00, 9.14395433e-02],
        [2.78496514e+00, 6.20305318e-01, 1.69758655e-01, 9.47409996e-02,
         1.68864061e-01, 5.51798341e-01],
        [1.69640367e+00, 5.08487198e-01, 5.62196221e-02, 6.68795534e-01,
         1.44256318e-01, 5.88747135e-01],
        [1.04914653e+00, 1.46754748e+00, 5.57365643e-01, 2.40870388e-01,
         1.83910661e+00, 8.59339439e-01],
        [2.42143301e-01, 1.89020741e-01, 1.02458611e-01, 4.16245545e-01,
         4.93556381e-01, 2.63486688e+00],
        [5.01173448e-01, 4.37304515e-01, 6.18764350e-01, 2.22065028e+00,
         1.46706875e+00, 3.22272355e-01]],
...
       [[1.07599614e+00, 3.69886350e-01, 2.72748601e-01, 2.43953936e+00,
         3.46216424e-01, 5.28820714e-01],
        [1.20773413e-01, 7.85535681e-01, 1.68790868e+00, 6.19088424e+00,
         2.63122594e-01, 4.79932328e+00],
        [1.87551311e+00, 8.86396999e-02, 2.12173963e+00, 4.63148585e-01,
         1.43172907e+00, 3.86750667e-01],
        [3.43250781e-02, 3.46138762e+00, 3.10064389e-01, 6.84710112e-01,
         2.57258901e-01, 2.04497762e-01],
        [6.76815757e-01, 1.96472446e+00, 2.51044085e-01, 3.99684270e-01,
         7.43170711e-01, 1.44297307e+00],
        [1.59314776e-01, 7.56079345e-01, 4.81973284e-01, 3.00770806e-01,
         2.90218388e-01, 2.88869462e+00],
        [5.34800924e-01, 1.08179942e+00, 1.81874956e+00, 1.09219518e+00,
         9.87210790e-02, 7.32145097e-01],
        [4.79654858e-01, 1.42189970e-01, 2.11525757e-01, 1.10643383e+00,
         1.97384347e+00, 7.09473650e-01],
        [1.00792902e+00, 5.78436486e-02, 4.70647603e+00, 1.59408169e-01,
         7.20011514e-01, 1.01756898e+00],
        [2.78376785e-01, 2.03673267e+00, 9.03159293e-02, 2.22115565e-01,
         4.67319362e-01, 1.84429285e+00]]])

When we apply a “group by” operation over the team dimension, we get a DataArrayGroupBy with 3 groups:

gb = da.groupby("team")
gb

DataArrayGroupBy, grouped over 'team'
3 groups with labels 'a', 'b', 'c'.

On this object we can use .map to apply a function from xarray-einstats to each group independently:

gb.map(stats.median_abs_deviation, dims=["draw", "team"])

<xarray.DataArray 'mu' (chain: 4, team: 3)>
0.3436 0.3758 0.2351 0.5221 0.5937 0.4158 ... 0.4314 0.212 0.3479 0.5708 0.2288
Coordinates:
  * chain    (chain) int64 0 1 2 3
  * team     (team) object 'a' 'b' 'c'

array([[0.34355412, 0.37583287, 0.23506548],
       [0.52214385, 0.59374561, 0.41577423],
       [0.45529942, 0.4314101 , 0.21200485],
       [0.34786642, 0.57076836, 0.2287779 ]])

As expected, the operation has been performed group-wise, yielding a different result from either of the following calls:

stats.median_abs_deviation(da, dims=["draw", "team"])

<xarray.DataArray 'mu' (chain: 4)>
0.3444 0.5968 0.4553 0.4069
Coordinates:
  * chain    (chain) int64 0 1 2 3

array([0.34440251, 0.59679036, 0.45529942, 0.40694066])

stats.median_abs_deviation(da, dims="draw")

<xarray.DataArray 'mu' (chain: 4, team: 6)>
0.3452 0.3788 0.09536 0.3892 0.2351 ... 0.6554 0.2451 0.3832 0.2288 0.5281
Coordinates:
  * chain    (chain) int64 0 1 2 3
  * team     (team) <U1 'a' 'b' 'b' 'a' 'c' 'b'

array([[0.34523357, 0.37884672, 0.09535583, 0.38917055, 0.23506548,
        0.42718786],
       [0.90764902, 0.49198442, 0.87907897, 0.3558338 , 0.41577423,
        0.44923513],
       [0.36423171, 0.39081023, 0.15527351, 0.4799553 , 0.21200485,
        0.26347783],
       [0.3671838 , 0.65539268, 0.24509799, 0.38316748, 0.2287779 ,
        0.5281112 ]])

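To make the difference between the three calls more concrete, here is a minimal sketch that uses only NumPy and xarray. The helper `mad` is a hypothetical stand-in for stats.median_abs_deviation (assumed here to be the unscaled median absolute deviation), and the data are random values with the same dimension names and team labels as above, so the numbers will not match the outputs shown in this page.

```python
import numpy as np
import xarray as xr


def mad(da, dims):
    """Unscaled median absolute deviation over the named dimensions.

    Hypothetical helper for illustration, not the xarray-einstats implementation.
    """
    center = da.median(dim=dims)
    return abs(da - center).median(dim=dims)


# Stand-in for ds["mu"]: same dimension names and team labels, random values.
rng = np.random.default_rng(0)
labels = np.array(["a", "b", "b", "a", "c", "b"])
da = xr.DataArray(
    rng.exponential(size=(4, 10, 6)),
    dims=["chain", "draw", "team"],
    coords={"team": labels},
)

# Group-wise reduction: every group keeps a "team" dimension
# (of length 2, 3 and 1 here), which is reduced together with "draw".
grouped = da.groupby("team").map(mad, dims=["draw", "team"])

# The same computation written out by hand, one label at a time.
pieces = [
    mad(da.isel(team=np.flatnonzero(labels == label)), dims=["draw", "team"])
    for label in ["a", "b", "c"]
]
manual = xr.concat(pieces, dim="team").assign_coords(team=["a", "b", "c"])

# Reducing over "draw" only keeps the six team entries separate.
per_team = mad(da, dims="draw")
```

Under these assumptions, `grouped` and `manual` hold the same values, and the `c` column of `grouped` equals the fifth team column of `per_team`, because group `c` has a single member. The same pattern is visible in the outputs above: the `c` column of the group-wise result matches the fifth column of the result reduced over draw only.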
