spatialstats.polyspectra

To use GPU routines, run

>>> import spatialstats as ss
>>> ss.config.gpu = True

Bispectrum

Implementation using Numba acceleration.

bispectrum(*u, ntheta=None, kmin=None, kmax=None, diagnostics=True, error=False, nsamples=None, sample_thresh=None, compute_fft=True, exclude_upper=False, use_pyfftw=False, bench=False, progress=False, **kwargs)

Estimate the bispectrum \(B(k_1, k_2, \theta)\) and bicoherence index \(b(k_1, k_2, \theta)\) of a real scalar or vector field \(u\).

Assuming statistical homogeneity, the bispectrum \(B(\mathbf{k}_1, \mathbf{k}_2, \mathbf{k}_3)\) is defined as the 3-point correlation function in Fourier space with \(\mathbf{k}_1 + \mathbf{k}_2 + \mathbf{k}_3 = 0\). For a real \(u\)

\[B(\mathbf{k}_1, \mathbf{k}_2, - \mathbf{k}_1 - \mathbf{k}_2) = \langle\tilde{u}(\mathbf{k}_1)\tilde{u}(\mathbf{k}_2) \tilde{u}^{*}(\mathbf{k}_1+\mathbf{k}_2)\rangle,\]

where \(\tilde{u}\) is the Fourier transform of \(u\) and \(\langle\cdot\rangle\) is some ensemble average. We compute our bispectrum as

\[B(k_1, k_2, \theta) = \frac{1}{V_1 V_2} \int\int_{\Omega} d^D \mathbf{k}_1 d^D \mathbf{k}_2 \ \tilde{u}(\mathbf{k}_1)\tilde{u}(\mathbf{k}_2) \tilde{u}^{*}(\mathbf{k}_1 + \mathbf{k}_2),\]

and the bicoherence as

\[b(k_1, k_2, \theta) = \frac{ |\int\int_{\Omega} d^D \mathbf{k}_1 d^D \mathbf{k}_2 \ \tilde{u}(\mathbf{k}_1)\tilde{u}(\mathbf{k}_2) \tilde{u}^{*}(\mathbf{k}_1 + \mathbf{k}_2)|}{ \int\int_{\Omega} d^D \mathbf{k}_1 d^D \mathbf{k}_2 \ |\tilde{u}(\mathbf{k}_1)\tilde{u}(\mathbf{k}_2) \tilde{u}^{*}(\mathbf{k}_1 + \mathbf{k}_2)|}.\]

\(\Omega\) is the set of all unique (\(\mathbf{k}_1\), \(\mathbf{k}_2\)) pairs such that \(|\mathbf{k}_1| \in [k_1, k_1+1)\), \(|\mathbf{k}_2| \in [k_2, k_2+1)\), and \(\arccos{(\hat{\mathbf{k}}_1 \cdot \hat{\mathbf{k}}_2)} \in [\theta, \theta+\Delta \theta)\). By “unique” pairs, we mean (\(\mathbf{k}_1\), \(\mathbf{k}_2\)) but not the complex conjugate evaluations for (\(-\mathbf{k}_1\), \(-\mathbf{k}_2\)). Otherwise, \(B\) would be a real function.

If the data is also statistically isotropic, then we can say that the bispectrum is only a function of scalar wavenumber, \(B = B(k_1, k_2, k_3)\). In this case, \(B\) accounts for all degrees of freedom of the bispectrum. Use this implementation’s variance estimates on the average over \(\Omega\) to test this assumption.

To calculate \(B\), we take the average

\[B(k_1, k_2, \theta) = \frac{1}{|\Omega|} \sum\limits_{\Omega} \tilde{u}(\mathbf{k}_1)\tilde{u}(\mathbf{k}_2) \tilde{u}^{*}(\mathbf{k}_1 + \mathbf{k}_2),\]

where now \(\tilde{u}\) is an FFT. For 3D fields, the full sum is often too large to compute. Instead, we compute a naive Monte Carlo integration by drawing \(N\) uniform samples \((\mathbf{k}_1^n, \mathbf{k}_2^n)\) from the set \(\Omega\). This defines an unbiased estimator of \(B\),

\[\hat{B}(k_1, k_2, \theta) = \frac{1}{N} \sum\limits_{n = 1}^{N} \tilde{u}(\mathbf{k}_1^n)\tilde{u}(\mathbf{k}_2^n) \tilde{u}^{*}(\mathbf{k}_1^n + \mathbf{k}_2^n).\]

The same procedure is used to compute \(b\). By default, this implementation returns \(B(k_1, k_2)\), the mean bispectrum summed over triangle angle \(\theta\).

To learn more, read here.

Note

One can recover the sum over triangles by multiplying counts * B when nsamples = None. Or, if ntheta = None, evaulate omega * B.

Note

When considering the bispectrum as a function of triangle angle, mesh points may be set to np.nan depending on \(k_1, \ k_2\). For example, \(\theta = 0\) would yield np.nan for all \(k_1 + k_2 > \sqrt{2} \ k_{nyq}\), where \(k_{nyq}\) is the Nyquist frequency. Computing a boolean mask with np.isnan and reductions like np.nansum can be useful.

Note

Computing np.nansum(B*counts, axis=0)/np.sum(counts, axis=0) recovers the bispectrum summed over triangle angles. To recover the corresponding bicoherence, evaulate np.abs(np.nansum(B, axis=0)) / np.nansum(np.abs(B)/b, axis=0).

Parameters
  • u (np.ndarray) – Scalar or vector field. If vector data, pass arguments as u1, u2, ..., un, where ui is the ith vector component. Each ui can be 2D or 3D, and all must have the same ui.shape and ui.dtype. The vector bispectrum will be computed as the sum over bispectra of each component.

  • ntheta (int, optional) – Number of angular bins \(\theta\) between triangles formed by wavevectors \(\mathbf{k_1}, \ \mathbf{k_2}\). If None, sum over all triangle angles. Otherwise, return a bispectrum for each angular bin.

  • kmin (int, optional) – Minimum wavenumber in bispectrum calculation. If None, kmin = 1.

  • kmax (int, optional) – Maximum wavenumber in bispectrum calculation. If None, kmax = max(u.shape)//2

  • diagnostics (bool, optional) – Return the optional sampling and normalization diagnostics, documented below. Set error = True to also return the standard error of the mean.

  • error (bool, optional) – Return standard error of the mean of the Monte-Carlo integration. diagnostics = True must also be set. This will add a bit of time to the calculation.

  • nsamples (int, float or np.ndarray, shape (kmax-kmin+1, kmax-kmin+1), optional) – Number of sample triangles or fraction of total possible triangles. This may be an array that specifies for a given \(k_1, \ k_2\). If None, calculate the exact sum.

  • sample_thresh (int, optional) – When the size of the sample space is greater than this number, start to use sampling instead of exact calculation. If None, switch to exact calculation when nsamples is less than the size of the sample space.

  • compute_fft (bool, optional) – If False, do not take the FFT of the input data. FFTs should not be passed with the zero-frequency component in the center, and it should have the same shape as the real-space data.

  • exclude_upper (bool, optional) – If True, set points where \(k_1 + k_2 > k_{nyq}\) to np.nan. This keyword only applies when summing over angles, e.g. when ntheta is None.

  • use_pyfftw (bool, optional) – If True, use pyfftw instead of np.fft to compute FFTs.

  • bench (bool, optional) – If True, print calculation time.

  • progress (bool, optional) – Print progress bar of calculation.

  • kwargs – Additional keyword arguments passed to np.fft.fftn or pyfftw.builders.fftn.

Returns

  • B (np.ndarray, shape (m, kmax-kmin+1, kmax-kmin+1)) – Bispectrum \(B(k_1, k_2, \theta)\).

  • b (np.ndarray, shape (m, kmax-kmin+1, kmax-kmin+1)) – Bicoherence index \(b(k_1, k_2, \theta)\).

  • kn (np.ndarray, shape (kmax-kmin+1,)) – Left edges of wavenumber bins \(k_1\) or \(k_2\) along axis of bispectrum.

  • theta (np.ndarray, shape (m,), optional) – Left edges of angular bins \(\theta\), ranging from \([0, \ \pi)\).

  • counts (np.ndarray, shape (m, kmax-kmin+1, kmax-kmin+1), optional) – Number of evaluations in the bispectrum sum, \(N\).

  • omega (np.ndarray, shape (kmax-kmin+1, kmax-kmin+1), optional) – Number of possible triangles in the sample space, \(|\Omega|\). This is implemented for if sampling were not restricted by the Nyquist frequency. Note that this is only implemented for ntheta = None.

  • stderr (np.ndarray, shape (m, kmax-kmin+1, kmax-kmin+1), optional) – Standard error of the mean for each bin. This can be an error estimate for the Monte Carlo integration. To convert to the standard deviation, evaluate stderr * np.sqrt(counts).

Implementation using CuPy acceleration.

bispectrum(*u, ntheta=None, kmin=None, kmax=None, diagnostics=True, error=False, nsamples=None, sample_thresh=None, compute_fft=True, exclude_upper=False, double=True, blocksize=128, bench=False, progress=False, **kwargs)

See the documentation for the CPU version.

Parameters
  • u (np.ndarray or cp.ndarray) – Scalar or vector field. If vector data, pass arguments as u1, u2 or u1, u2, u3 where ui is the ith vector component. Each ui should be 2D or 3D (respectively), and must have the same ui.shape and ui.dtype. The vector bispectrum will be computed as the sum over bispectra of each component.

  • ntheta (int, optional) – Number of angular bins \(\theta\) between triangles formed by wavevectors \(\mathbf{k_1}, \ \mathbf{k_2}\). If None, sum over all triangle angles. Otherwise, return a bispectrum for each angular bin.

  • kmin (int, optional) – Minimum wavenumber in bispectrum calculation. If None, kmin = 1.

  • kmax (int, optional) – Maximum wavenumber in bispectrum calculation. If None, kmax = max(u.shape)//2.

  • diagnostics (bool, optional) – Return the optional sampling and normalization diagnostics, documented below. Set error = True to also return the standard error of the mean.

  • error (bool, optional) – Return standard error of the mean of the Monte-Carlo integration. diagnostics = True must also be set. This will add a bit of time to the calculation.

  • nsamples (int, float or np.ndarray, shape (kmax-kmin+1, kmax-kmin+1), optional) – Number of sample triangles or fraction of total possible triangles. This may be an array that specifies for a given \(k_1, \ k_2\). If None, calculate the exact sum.

  • sample_thresh (int, optional) – When the size of the sample space is greater than this number, start to use sampling instead of exact calculation. If None, switch to exact calculation when nsamples is less than the size of the sample space.

  • compute_fft (bool, optional) – If False, do not take the FFT of the input data. FFTs should not be passed with the zero-frequency component in the center.

  • exclude_upper (bool, optional) – If True, set points where \(k_1 + k_2 > k_{nyq}\) to np.nan. This keyword only applies when summing over angles, e.g. when ntheta is None.

  • double (bool, optional) – If False, do calculation in single precision.

  • blocksize (int, optional) – Number of threads per block for GPU kernels. The optimal value will vary depending on hardware.

  • progress (bool, optional) – Print progress bar of calculation.

  • bench (bool, optional) – If True, print calculation time.

  • kwargs – Additional keyword arguments passed to cupyx.scipy.fft.fftn.

Returns

  • B (np.ndarray, shape (m, kmax-kmin+1, kmax-kmin+1)) – Bispectrum \(B(k_1, k_2, \theta)\).

  • b (np.ndarray, shape (m, kmax-kmin+1, kmax-kmin+1)) – Bicoherence index \(b(k_1, k_2, \theta)\).

  • kn (np.ndarray, shape (kmax-kmin+1,)) – Wavenumbers \(k_1\) or \(k_2\) along axis of bispectrum.

  • theta (np.ndarray, shape (m,), optional) – Left edges of angular bins \(\theta\), ranging from \([0, \ \pi)\).

  • counts (np.ndarray, shape (m, kmax-kmin+1, kmax-kmin+1), optional) – Number of evaluations in the bispectrum sum, \(N\).

  • omega (np.ndarray, shape (kmax-kmin+1, kmax-kmin+1), optional) – Number of possible triangles in the sample space, \(|\Omega|\). This is implemented for if sampling were not restricted by the Nyquist frequency. Note that this is only implemented for ntheta = None.

  • stderr (np.ndarray, shape (m, kmax-kmin+1, kmax-kmin+1), optional) – Standard error of the mean for each bin. This can be an error estimate for the Monte Carlo integration. To convert to the standard deviation, evaluate stderr * np.sqrt(counts).

Power spectrum

powerspectrum(*u, average=True, diagnostics=False, kmin=None, kmax=None, npts=None, compute_fft=True, compute_sqr=True, use_pyfftw=False, bench=False, **kwargs)

Returns the isotropically averaged power spectrum \(P(k)\) of a real scalar or vector field \(u\). Assuming statistical homogeneity, the power spectrum is the 2-point correlation function in Fourier space with \(\mathbf{k} + \mathbf{k}' = 0\). We define this isotropically averaged power spectrum as

\[P(k) = \int\limits_{|\mathbf{k}| \in [k, \ k+\Delta k)} d^D \mathbf{k} \ |\tilde{u}(\mathbf{k})|^2,\]

where \(\tilde{u}\) is the FFT of \(u\).

We approximate this integral as

\[P(k) = \frac{V_k}{N_k} \sum\limits_{|\mathbf{k}| \in [k, k+\Delta k)} |\tilde{u}(\mathbf{k})|^2,\]

where \(V_k\) and \(N_k\) are the volume and number of points in the \(k^{th}\) bin.

Parameters
  • u (np.ndarray) – Scalar or vector field. If passing vector data, pass arguments as u1, u2, ..., un where ui is the ith vector component. Each ui can be 1D, 2D, or 3D, and all must have the same ui.shape and ui.dtype.

  • average (bool, optional) – If True, average over values in a given bin and multiply by the bin volume. If False, compute the sum.

  • diagnostics (bool, optional) – Return the standard deviation and number of points in a particular radial bin.

  • kmin (int or float, optional) – Minimum wavenumber in power spectrum bins. If None, kmin = 1.

  • kmax (int or float, optional) – Maximum wavenumber in power spectrum bins. If None, kmax = max(u.shape)//2.

  • npts (int, optional) – Number of modes between kmin and kmax, inclusive. If None, npts = kmax-kmin+1.

  • compute_fft (bool, optional) – If False, do not take the FFT of the input data. FFTs should not be passed with the zero-frequency component in the center.

  • compute_sqr (bool, optional) – If False, sum the real part of the FFT. This can be useful for purely real FFTs, where the sign of the FFT is useful information. If True, take the square as usual.

  • use_pyfftw (bool, optional) – If True, use pyfftw instead of np.fft to compute FFTs.

  • bench (bool, optional) – Print message for time of calculation.

  • kwargs – Additional keyword arguments passed to np.fft.fftn, np.fft.rfftn, pyfftw.builders.fftn, or pyfftw.builders.rfftn.

Returns

  • spectrum (np.ndarray, shape (npts,)) – Radially averaged power spectrum \(P(k)\).

  • kn (np.ndarray, shape (npts,)) – Left edges of radial bins \(k\).

  • counts (np.ndarray, shape (npts,), optional) – Number of points \(N_k\) in each bin.

  • vol (np.ndarray, shape (npts,), optional) – Volume \(V_k\) of each bin.

  • stdev (np.ndarray, shape (npts,), optional) – Standard deviation multiplied with \(V_k\) in each bin.

Implementation using CuPy acceleration.

powerspectrum(*u, average=True, diagnostics=False, kmin=None, kmax=None, npts=None, compute_fft=True, compute_sqr=True, double=True, bench=False, **kwargs)

See the documentation for the CPU version.

Parameters
  • u (np.ndarray) – Scalar or vector field. If vector data, pass arguments as u1, u2, ..., un where ui is the ith vector component. Each ui can be 1D, 2D, or 3D, and all must have the same ui.shape and ui.dtype.

  • average (bool, optional) – If True, average over values in a given bin and multiply by the bin volume. If False, compute the sum.

  • diagnostics (bool, optional) – Return the standard deviation and number of points in a particular radial bin.

  • kmin (int or float, optional) – Minimum wavenumber in power spectrum bins. If None, kmin = 1.

  • kmax (int or float, optional) – Maximum wavenumber in power spectrum bins. If None, kmax = max(u.shape)//2.

  • npts (int, optional) – Number of modes between kmin and kmax, inclusive. If None, npts = kmax-kmin+1.

  • compute_fft (bool, optional) – If False, do not take the FFT of the input data. FFTs should not be passed with the zero-frequency component in the center.

  • compute_sqr (bool, optional) – If False, sum the real part of the FFT. This can be useful for purely real FFTs, where the sign of the FFT is useful information. If True, take the square as usual.

  • double (bool, optional) – If False, calculate FFTs in single precision. Useful for saving memory.

  • bench (bool, optional) – Print message for time of calculation.

  • kwargs – Additional keyword arguments passed to cupyx.scipy.fft.fftn or cupyx.scipy.fft.rfftn.

Returns

  • spectrum (np.ndarray, shape (npts,)) – Radially averaged power spectrum \(P(k)\).

  • kn (np.ndarray, shape (npts,)) – Left edges of radial bins \(k\).

  • counts (np.ndarray, shape (npts,), optional) – Number of points \(N_k\) in each bin.

  • vol (np.ndarray, shape (npts,), optional) – Volume \(V_k\) of each bin.

  • stdev (np.ndarray, shape (npts,), optional) – Standard deviation multiplied with \(V_k\) in each bin.