spatialstats.polyspectra

To use GPU accelerated routines, run

>>> import spatialstats as ss
>>> ss.config.gpu = True

Power spectrum

FFT estimator for power spectrum multipoles with respect to a local orientation vectors or polarity field.

fftpower(field, polarity, poles=0, **kwargs)

See the documentation for spatialstats.polyspectra.nufftpower.

This computes power spectrum multipoles for a scalar, vector, or tensor field \(\mathbf{w}(\mathbf{x})\) with respect to a polarity field \(\mathbf{p}(\mathbf{x})\).

Parameters
  • field (np.ndarray, shape (Nx, Ny, Nz, 3, …, 3)) – Scalar, vector, or tensor field \(\mathbf{w}(\mathbf{x})\).

  • polarity (np.ndarray, shape (Nx, Ny, Nz, 3) or (3,)) – Polarity field \(\mathbf{p}(\mathbf{x})\). Can also pass one direction \(\mathbf{p}\).

  • poles (int or list, optional) – Multipole indices \(\ell\).

  • kwargs – Additional keyword arguments passed to pyfftw.FFTW.

Returns

  • spectra (list of np.ndarray, shape (nk,)) – Multipoles \(\mathcal{P}_{\ell}(k)\).

  • k (np.ndarray, shape (nk,)) – Wavenumbers \(k\), where kmax = int(max(modes)/2) and nk = kmax+1. The zero mode is returned, so k = np.linspace(0, kmax, nk, endpoint=True).

  • Nk (np.ndarray, shape (nk,)) – Number of points in a wavenumber bin \(N_k\) with shell thickness \([k, k+1)\).

  • Vk (np.ndarray, shape (nk,)) – Volume of a wavenumber bin \(V_k\).

nufftpower(positions, orientations, weights, boxsize, modes, poles=0, **kwargs)

Compute power spectrum multipoles with respect to local orientation vectors using non-uniform FFTS in three-dimensions.

The non-uniform FFT transforms a field defined by

\[\mathbf{w}(\mathbf{x}) = \sum_i \mathbf{w}_i \delta(\mathbf{x} - \mathbf{x}_i),\]

with a corresponding polarity field

\[\mathbf{p}(\mathbf{x}) = \sum_i \mathbf{p}_i \delta(\mathbf{x} - \mathbf{x}_i).\]

The estimator for multipoles with respect to \(\mathbf{p}_i\cdot\hat{\mathbf{k}}\) then is

\[\mathcal{P}_{\ell}(k) = \frac{1}{V_k} \sum_{|\mathbf{k}| \in [k, k+1)} \sum_m Y_{\ell m}(\hat{\mathbf{k}}) (\textrm{FFT}[\mathbf{w}(\mathbf{x})] \cdot \textrm{FFT}[\mathbf{w}(\mathbf{x}) Y_{\ell m}(\mathbf{p}(\mathbf{x}))]^*),\]

where \(Y_{\ell m}\) are the real spherical harmonics, \(V_k\) is a bin volume, \(\mathbf{w}_i\) are weights, and \(\mathbf{p}_i\) are orientations at positions \(\mathbf{x}_i\).

The above is defined for a vector field \(\mathbf{w}(\mathbf{x})\), but the implementation is general for scalar and tensor fields.

Parameters
  • positions (np.ndarray, shape (N, 3)) – Position vectors \(\mathbf{x}_i\).

  • orientations (np.ndarray, shape (N, 3)) – Orientation vectors \(\mathbf{p}_i\).

  • weights (np.ndarray, shape (N, 3, …, 3)) – Scalar, vector, or tensor weights \(\mathbf{w}_i\) or field \(\mathbf{w}(\mathbf{x})\). For example, taking \(w_i^{mn} = p_i^m p_i^n\) computes the nematic order power spectrum. If weights = 1., computes the power spectrum of the distribution function \(g(\mathbf{x}) = \sum_i \delta(\mathbf{x} - \mathbf{x}_i)\).

  • boxsize (int or list) – Box size of particle data.

  • modes (int or list) – Number of modes in each dimension in Fourier space.

  • poles (int or list, optional) – Multipole indices \(\ell\).

  • kwargs – Additional keyword arguments passed to finufft.nufft3d1

Returns

  • spectra (list of np.ndarray, shape (nk,)) – Multipoles \(\mathcal{P}_{\ell}(k)\).

  • k (np.ndarray, shape (nk,)) – Wavenumbers \(k\), where kmax = int(max(modes)/2) and nk = kmax+1. The zero mode is returned, so k = np.linspace(0, kmax, nk, endpoint=True).

  • Nk (np.ndarray, shape (nk,)) – Number of points in a wavenumber bin \(N_k\) with shell thickness \([k, k+1)\).

  • Vk (np.ndarray, shape (nk,)) – Volume of a wavenumber bin \(V_k\).

Bispectrum

Implementation using Numba acceleration.

bispectrum(*u, ntheta=None, kmin=None, kmax=None, diagnostics=True, error=False, nsamples=None, sample_thresh=None, compute_fft=True, exclude_upper=False, use_pyfftw=False, bench=False, progress=False, **kwargs)

Estimate the bispectrum \(B(k_1, k_2, \cos\theta)\) and bicoherence index \(b(k_1, k_2, \cos\theta)\) of a real scalar field \(u\) or vector field \(\mathbf{u}\).

Assuming statistical homogeneity, we define the bispectrum \(B(\mathbf{k}_1, \mathbf{k}_2, \mathbf{k}_3)\) as the 3-point correlation function in Fourier space with \(\mathbf{k}_1 + \mathbf{k}_2 + \mathbf{k}_3 = 0\). For a real \(u\)

\[B(\mathbf{k}_1, \mathbf{k}_2, - \mathbf{k}_1 - \mathbf{k}_2) = \langle\tilde{u}(\mathbf{k}_1)\tilde{u}(\mathbf{k}_2) \tilde{u}^{*}(\mathbf{k}_1+\mathbf{k}_2)\rangle,\]

where \(\tilde{u}\) is the Fourier transform of \(u\) and \(\langle\cdot\rangle\) is some choice of averaging. For a vector field, the bispectrum is a 3-tensor

\[B_{ijk}(\mathbf{k}_1, \mathbf{k}_2, - \mathbf{k}_1 - \mathbf{k}_2) = \langle\tilde{u}_i(\mathbf{k}_1)\tilde{u}_j(\mathbf{k}_2) \tilde{u}_k^{*}(\mathbf{k}_1+\mathbf{k}_2)\rangle,\]

where we have a bispectrum for each combination of vector components. Proceeding for the case of a scalar field, if the data is also statistically isotropic, then we can say that \(B = B(k_1, k_2, \hat{\mathbf{k}}_1 \cdot \hat{\mathbf{k}}_2)\).

With \(\hat{\mathbf{k}}_1 \cdot \hat{\mathbf{k}}_2 \equiv \cos\theta\), we compute the bispectrum as

\[B(k_1, k_2, \cos\theta) = \frac{1}{V_1 V_2} \int\int_{\Omega} d^D \mathbf{k}_1 d^D \mathbf{k}_2 \ \tilde{u}(\mathbf{k}_1)\tilde{u}(\mathbf{k}_2) \tilde{u}^{*}(\mathbf{k}_1 + \mathbf{k}_2),\]

and the bicoherence as

\[b(k_1, k_2, \cos\theta) = \frac{ |\int\int_{\Omega} d^D \mathbf{k}_1 d^D \mathbf{k}_2 \ \tilde{u}(\mathbf{k}_1)\tilde{u}(\mathbf{k}_2) \tilde{u}^{*}(\mathbf{k}_1 + \mathbf{k}_2)|}{ \int\int_{\Omega} d^D \mathbf{k}_1 d^D \mathbf{k}_2 \ |\tilde{u}(\mathbf{k}_1)\tilde{u}(\mathbf{k}_2) \tilde{u}^{*}(\mathbf{k}_1 + \mathbf{k}_2)|}.\]

\(\Omega\) is the set of (\(\mathbf{k}_1\), \(\mathbf{k}_2\)) pairs such that \(|\mathbf{k}_1| \in [k_1, k_1+1)\), \(|\mathbf{k}_2| \in [k_2, k_2+1)\), and \(\hat{\mathbf{k}}_1 \cdot \hat{\mathbf{k}}_2 \in [\cos\theta, \cos\theta+\Delta \cos\theta)\). We only consider \((\mathbf{k}_2)_z > 0\) to include (\(\mathbf{k}_1\), \(\mathbf{k}_2\)) contributions but not their complex conjugates from (\(-\mathbf{k}_1\), \(-\mathbf{k}_2\)). The bispectrum defined over \(\Omega\) is real so this improves performance.

To estimate \(B\), we take the average

\[B(k_1, k_2, \cos\theta) = \frac{1}{|\Omega|} \sum\limits_{\Omega} \tilde{u}(\mathbf{k}_1)\tilde{u}(\mathbf{k}_2) \tilde{u}^{*}(\mathbf{k}_1 + \mathbf{k}_2),\]

where now \(\tilde{u}\) is an FFT. For 3D fields, the full sum is often too large to compute. Instead, we compute a naive Monte Carlo integration by drawing \(N\) uniform samples \((\mathbf{k}_1^n, \mathbf{k}_2^n)\) from the set \(\Omega\). This defines an unbiased estimator of \(B\),

\[\hat{B}(k_1, k_2, \cos\theta) = \frac{1}{N} \sum\limits_{n = 1}^{N} \tilde{u}(\mathbf{k}_1^n)\tilde{u}(\mathbf{k}_2^n) \tilde{u}^{*}(\mathbf{k}_1^n + \mathbf{k}_2^n).\]

The same procedure is used to compute \(b\). By default, this implementation returns \(B(k_1, k_2)\), the bispectrum summed over \(\cos\theta\).

Note

When considering the bispectrum as a function of triangle angle, mesh points may be set to np.nan depending on \(k_1, \ k_2\). For example, \(\theta = 0\) would yield np.nan for all \(k_1 + k_2 > \sqrt{2} \ k_{nyq}\), where \(k_{nyq}\) is the Nyquist frequency. Computing a boolean mask with np.isnan and reductions like np.nansum can be useful.

Parameters
  • u (np.ndarray) – Scalar field or vector components to correlate. If vector data, pass arguments as u1, u2, u3, where ui is a vector component. Each ui can be 2D or 3D, and all must have the same ui.shape and ui.dtype.

  • ntheta (int, optional) – Number of angular bins \(\cos\theta\) between triangles formed by wavevectors \(\mathbf{k_1}, \ \mathbf{k_2}\). If None, sum over all triangle angles. Otherwise, return a bispectrum for each angular bin.

  • kmin (int, optional) – Minimum wavenumber in bispectrum calculation. If None, kmin = 1.

  • kmax (int, optional) – Maximum wavenumber in bispectrum calculation. If None, kmax = max(u.shape)//2

  • diagnostics (bool, optional) – Return the optional sampling and normalization diagnostics, documented below. Set error = True to also return the standard error of the mean.

  • error (bool, optional) – Return standard error of the mean of the Monte-Carlo integration. diagnostics = True must also be set. This will add a bit of time to the calculation.

  • nsamples (int, float or np.ndarray, shape (kmax-kmin+1, kmax-kmin+1), optional) – Number of sample triangles or fraction of total possible triangles. This may be an array that specifies for a given \(k_1, \ k_2\). If None, calculate the exact sum. Note that the computation is parallelized over this value, so for sufficiently large nsamples there can be memory errors.

  • sample_thresh (int, optional) – When the size of the sample space is greater than this number, start to use sampling instead of exact calculation. If None, switch to exact calculation when nsamples is less than the size of the sample space.

  • compute_fft (bool, optional) – If False, do not take the FFT of the input data. FFTs should not be passed with the zero-frequency component in the center, and it should have the same shape as the real-space data.

  • exclude_upper (bool, optional) – If True, set points where \(k_1 + k_2 > k_{nyq}\) to np.nan. This keyword only applies when summing over angles, e.g. when ntheta is None.

  • use_pyfftw (bool, optional) – If True, use pyfftw instead of np.fft to compute FFTs.

  • bench (bool, optional) – If True, print calculation time.

  • progress (bool, optional) – Print progress bar of calculation.

  • kwargs – Additional keyword arguments passed to np.fft.fftn or pyfftw.builders.fftn.

Returns

  • B (np.ndarray, shape (m, kmax-kmin+1, kmax-kmin+1)) – Bispectrum \(B(k_1, k_2, \cos\theta)\).

  • b (np.ndarray, shape (m, kmax-kmin+1, kmax-kmin+1)) – Bicoherence index \(b(k_1, k_2, \cos\theta)\).

  • kn (np.ndarray, shape (kmax-kmin+1,)) – Left edges of wavenumber bins \(k_1\) or \(k_2\) along axis of bispectrum.

  • theta (np.ndarray, shape (m,), optional) – Left edges of angular bins \(\cos\theta\), ranging from \([-1, \ 1)\).

  • counts (np.ndarray, shape (m, kmax-kmin+1, kmax-kmin+1), optional) – Number of evaluations in the bispectrum sum, \(N\).

  • omega (np.ndarray, shape (kmax-kmin+1, kmax-kmin+1), optional) – Number of possible triangles in the sample space, \(|\Omega|\). This is implemented for if sampling were not restricted by the Nyquist frequency. Note that this is only implemented for ntheta = None.

  • stderr (np.ndarray, shape (m, kmax-kmin+1, kmax-kmin+1), optional) – Standard error of the mean for each bin. This can be an error estimate for the Monte Carlo integration. To convert to the standard deviation, evaluate stderr * np.sqrt(counts).

Implementation using CuPy acceleration.

bispectrum(*u, ntheta=None, kmin=None, kmax=None, diagnostics=True, error=False, nsamples=None, sample_thresh=None, compute_fft=True, exclude_upper=False, double=True, blocksize=128, bench=False, progress=False, **kwargs)

See the documentation for the CPU version.

This implementation is written to minimize memory requirements on a single GPU. Set double = False to save memory.

Parameters
  • u (np.ndarray or cp.ndarray) – Scalar field or vector components to correlate. If vector data, pass arguments as u1, u2, u3, where ui is a vector component. Each ui can be 2D or 3D, and all must have the same ui.shape and ui.dtype.

  • ntheta (int, optional) – Number of angular bins \(\cos\theta\) between triangles formed by wavevectors \(\mathbf{k_1}, \ \mathbf{k_2}\). If None, sum over all triangle angles. Otherwise, return a bispectrum for each angular bin.

  • kmin (int, optional) – Minimum wavenumber in bispectrum calculation. If None, kmin = 1.

  • kmax (int, optional) – Maximum wavenumber in bispectrum calculation. If None, kmax = max(u.shape)//2.

  • diagnostics (bool, optional) – Return the optional sampling and normalization diagnostics, documented below. Set error = True to also return the standard error of the mean.

  • error (bool, optional) – Return standard error of the mean of the Monte-Carlo integration. diagnostics = True must also be set. This will add a bit of time to the calculation.

  • nsamples (int, float or np.ndarray, shape (kmax-kmin+1, kmax-kmin+1), optional) – Number of sample triangles or fraction of total possible triangles. This may be an array that specifies for a given \(k_1, \ k_2\). If None, calculate the exact sum.

  • sample_thresh (int, optional) – When the size of the sample space is greater than this number, start to use sampling instead of exact calculation. If None, switch to exact calculation when nsamples is less than the size of the sample space.

  • compute_fft (bool, optional) – If False, do not take the FFT of the input data. FFTs should not be passed with the zero-frequency component in the center.

  • exclude_upper (bool, optional) – If True, set points where \(k_1 + k_2 > k_{nyq}\) to np.nan. This keyword only applies when summing over angles, e.g. when ntheta is None.

  • double (bool, optional) – If False, do calculation in single precision.

  • blocksize (int, optional) – Number of threads per block for GPU kernels. The optimal value will vary depending on hardware.

  • progress (bool, optional) – Print progress bar of calculation.

  • bench (bool, optional) – If True, print calculation time.

  • kwargs – Additional keyword arguments passed to cupyx.scipy.fft.fftn.

Returns

  • B (np.ndarray, shape (m, kmax-kmin+1, kmax-kmin+1)) – Bispectrum \(B(k_1, k_2, \cos\theta)\).

  • b (np.ndarray, shape (m, kmax-kmin+1, kmax-kmin+1)) – Bicoherence index \(b(k_1, k_2, \cos\theta)\).

  • kn (np.ndarray, shape (kmax-kmin+1,)) – Wavenumbers \(k_1\) or \(k_2\) along axis of bispectrum.

  • theta (np.ndarray, shape (m,), optional) – Left edges of angular bins \(\cos\theta\), ranging from \([-1, \ 1)\).

  • counts (np.ndarray, shape (m, kmax-kmin+1, kmax-kmin+1), optional) – Number of evaluations in the bispectrum sum, \(N\).

  • omega (np.ndarray, shape (kmax-kmin+1, kmax-kmin+1), optional) – Number of possible triangles in the sample space, \(|\Omega|\). This is implemented for if sampling were not restricted by the Nyquist frequency. Note that this is only implemented for ntheta = None.

  • stderr (np.ndarray, shape (m, kmax-kmin+1, kmax-kmin+1), optional) – Standard error of the mean for each bin. This can be an error estimate for the Monte Carlo integration. To convert to the standard deviation, evaluate stderr * np.sqrt(counts).

Power spectrum (legacy)

powerspectrum(*u, average=True, diagnostics=False, kmin=None, kmax=None, npts=None, compute_fft=True, compute_sqr=True, use_pyfftw=False, bench=False, **kwargs)

Returns the isotropically averaged power spectrum \(P(k)\) of a real scalar or vector field \(u\). Assuming statistical homogeneity, the power spectrum is the 2-point correlation function in Fourier space with \(\mathbf{k} + \mathbf{k}' = 0\). We define this isotropically averaged power spectrum as

\[P(k) = \int\limits_{|\mathbf{k}| \in [k, \ k+\Delta k)} d^D \mathbf{k} \ |\tilde{u}(\mathbf{k})|^2,\]

where \(\tilde{u}\) is the FFT of \(u\).

We approximate this integral as

\[P(k) = \frac{V_k}{N_k} \sum\limits_{|\mathbf{k}| \in [k, k+\Delta k)} |\tilde{u}(\mathbf{k})|^2,\]

where \(V_k\) and \(N_k\) are the volume and number of points in the \(k^{th}\) bin.

Parameters
  • u (np.ndarray) – Scalar or vector field. If passing vector data, pass arguments as u1, u2, ..., un where ui is the ith vector component. Each ui can be 1D, 2D, or 3D, and all must have the same ui.shape and ui.dtype.

  • average (bool, optional) – If True, average over values in a given bin and multiply by the bin volume. If False, compute the sum.

  • diagnostics (bool, optional) – Return the standard deviation and number of points in a particular radial bin.

  • kmin (int or float, optional) – Minimum wavenumber in power spectrum bins. If None, kmin = 1.

  • kmax (int or float, optional) – Maximum wavenumber in power spectrum bins. If None, kmax = max(u.shape)//2.

  • npts (int, optional) – Number of modes between kmin and kmax, inclusive. If None, npts = kmax-kmin+1.

  • compute_fft (bool, optional) – If False, do not take the FFT of the input data. FFTs should not be passed with the zero-frequency component in the center.

  • compute_sqr (bool, optional) – If False, sum the real part of the FFT. This can be useful for purely real FFTs, where the sign of the FFT is useful information. If True, take the square as usual.

  • use_pyfftw (bool, optional) – If True, use pyfftw instead of np.fft to compute FFTs.

  • bench (bool, optional) – Print message for time of calculation.

  • kwargs – Additional keyword arguments passed to np.fft.fftn, np.fft.rfftn, pyfftw.builders.fftn, or pyfftw.builders.rfftn.

Returns

  • spectrum (np.ndarray, shape (npts,)) – Radially averaged power spectrum \(P(k)\).

  • kn (np.ndarray, shape (npts,)) – Left edges of radial bins \(k\).

  • counts (np.ndarray, shape (npts,), optional) – Number of points \(N_k\) in each bin.

  • vol (np.ndarray, shape (npts,), optional) – Volume \(V_k\) of each bin.

  • stdev (np.ndarray, shape (npts,), optional) – Standard deviation multiplied with \(V_k\) in each bin.

Implementation using CuPy acceleration.

powerspectrum(*u, average=True, diagnostics=False, kmin=None, kmax=None, npts=None, compute_fft=True, compute_sqr=True, double=True, bench=False, **kwargs)

See the documentation for the CPU version.

Parameters
  • u (np.ndarray) – Scalar or vector field. If vector data, pass arguments as u1, u2, ..., un where ui is the ith vector component. Each ui can be 1D, 2D, or 3D, and all must have the same ui.shape and ui.dtype.

  • average (bool, optional) – If True, average over values in a given bin and multiply by the bin volume. If False, compute the sum.

  • diagnostics (bool, optional) – Return the standard deviation and number of points in a particular radial bin.

  • kmin (int or float, optional) – Minimum wavenumber in power spectrum bins. If None, kmin = 1.

  • kmax (int or float, optional) – Maximum wavenumber in power spectrum bins. If None, kmax = max(u.shape)//2.

  • npts (int, optional) – Number of modes between kmin and kmax, inclusive. If None, npts = kmax-kmin+1.

  • compute_fft (bool, optional) – If False, do not take the FFT of the input data. FFTs should not be passed with the zero-frequency component in the center.

  • compute_sqr (bool, optional) – If False, sum the real part of the FFT. This can be useful for purely real FFTs, where the sign of the FFT is useful information. If True, take the square as usual.

  • double (bool, optional) – If False, calculate FFTs in single precision. Useful for saving memory.

  • bench (bool, optional) – Print message for time of calculation.

  • kwargs – Additional keyword arguments passed to cupyx.scipy.fft.fftn or cupyx.scipy.fft.rfftn.

Returns

  • spectrum (np.ndarray, shape (npts,)) – Radially averaged power spectrum \(P(k)\).

  • kn (np.ndarray, shape (npts,)) – Left edges of radial bins \(k\).

  • counts (np.ndarray, shape (npts,), optional) – Number of points \(N_k\) in each bin.

  • vol (np.ndarray, shape (npts,), optional) – Volume \(V_k\) of each bin.

  • stdev (np.ndarray, shape (npts,), optional) – Standard deviation multiplied with \(V_k\) in each bin.