pyrolite.comp.codata

pyrolite.comp.codata.close(X: ndarray, sumf=<function sum>)[source]

Closure operator for compositional data.

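Parameters:
  • X (numpy.ndarray) – Array to close.

  • sumf (callable) – Function used to compute the totals for closure (a sum by default).

Returns: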

Closed array.

Return type:

numpy.ndarray

Notes

Checks for non-positive entries and replaces zeros with NaN values.
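As a rough illustration of what closure means, the sketch below rescales each row of an array so that it sums to 1. The name close_sketch is hypothetical and this is not pyrolite's implementation; in particular it does not replace zeros with NaN as described in the note above.

```python
import numpy as np


def close_sketch(X, sumf=np.sum):
    """Hypothetical stand-in for a closure operator: rows are scaled to sum to 1."""
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        return X / sumf(X)
    # Divide every row by its own total.
    return X / sumf(X, axis=1)[:, np.newaxis]


X = np.array([[2.0, 6.0, 12.0], [1.0, 1.0, 2.0]])
closed = close_sketch(X)
```

Each row of closed carries the same ratios between components as the corresponding row of X; only the total changes.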

pyrolite.comp.codata.renormalise(df: DataFrame, components: list = [], scale=100.0)[source]

Renormalises compositional data to ensure closure.

Parameters:
  • df (pandas.DataFrame) – Dataframe to renormalise.

  • components (list) – Optional subcomposition to renormalise (to 100 by default). Useful where compositional and non-compositional data are stored in the same dataframe.

  • scale (float, 100.) – Closure parameter. Typically either 100 or 1.

Returns:

Renormalised dataframe.

Return type:

pandas.DataFrame

pyrolite.comp.codata.ALR(X: ndarray, ind: int = -1, null_col=False)[source]

Additive Log Ratio transformation.

Parameters:
  • X (numpy.ndarray) – Array on which to perform the transformation, of shape (N, D).

  • ind (int) – Index of column used as denominator.

  • null_col (bool) – Whether to keep the redundant column.

Returns:

ALR-transformed array, of shape (N, D-1).

Return type:

numpy.ndarray

pyrolite.comp.codata.inverse_ALR(Y: ndarray, ind=-1, null_col=False)[source]

Inverse Additive Log Ratio transformation.

Parameters:
  • Y (numpy.ndarray) – Array on which to perform the inverse transformation, of shape (N, D-1).

  • ind (int) – Index of column used as denominator.

  • null_col (bool, False) – Whether the array contains an extra redundant column (i.e. shape is (N, D)).

Returns:

Inverse-ALR transformed array, of shape (N, D).

Return type:

numpy.ndarray
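The additive log-ratio and its inverse can be sketched in plain NumPy as below. The function names alr_sketch and inverse_alr_sketch are hypothetical, the null_col option is not modelled, and the code is only meant to show the shapes and the round trip, not to reproduce pyrolite's implementation.

```python
import numpy as np


def alr_sketch(X, ind=-1):
    """Log of every component divided by the denominator column `ind`."""
    X = np.asarray(X, dtype=float)
    ind = ind % X.shape[1]  # turn a negative index into a positive one
    keep = [i for i in range(X.shape[1]) if i != ind]
    return np.log(X[:, keep] / X[:, [ind]])


def inverse_alr_sketch(Y, ind=-1):
    """Re-insert the denominator (log-ratio 0), exponentiate and close to 1."""
    Y = np.asarray(Y, dtype=float)
    ind = ind % (Y.shape[1] + 1)
    full = np.insert(Y, ind, 0.0, axis=1)  # log(x_ind / x_ind) = 0
    X = np.exp(full)
    return X / X.sum(axis=1, keepdims=True)


X = np.array([[0.1, 0.3, 0.6], [0.25, 0.25, 0.5]])
Y = alr_sketch(X)             # shape (N, D-1)
X_back = inverse_alr_sketch(Y)  # shape (N, D), rows sum to 1
```

Because the inverse closes its result to 1, the round trip only recovers the input exactly when the input rows already sum to 1.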

pyrolite.comp.codata.CLR(X: ndarray)[source]

Centred Log Ratio transformation.

Parameters:

X (numpy.ndarray) – 2D array on which to perform the transformation, of shape (N, D).

Returns:

CLR-transformed array, of shape (N, D).

Return type:

numpy.ndarray

pyrolite.comp.codata.inverse_CLR(Y: ndarray)[source]

Inverse Centred Log Ratio transformation.

Parameters:

Y (numpy.ndarray) – Array on which to perform the inverse transformation, of shape (N, D).

Returns:

Inverse-CLR transformed array, of shape (N, D).

Return type:

numpy.ndarray
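A minimal sketch of the centred log-ratio pair follows, assuming the usual definition (log of each component minus the row mean of the logs). The names clr_sketch and inverse_clr_sketch are hypothetical and are not part of pyrolite.

```python
import numpy as np


def clr_sketch(X):
    """Subtract the row-wise mean of the logs (the log of the geometric mean)."""
    logX = np.log(np.asarray(X, dtype=float))
    return logX - logX.mean(axis=1, keepdims=True)


def inverse_clr_sketch(Y):
    """Exponentiate and close each row to 1."""
    X = np.exp(np.asarray(Y, dtype=float))
    return X / X.sum(axis=1, keepdims=True)


X = np.array([[0.1, 0.3, 0.6], [0.25, 0.25, 0.5]])
Y = clr_sketch(X)             # shape (N, D); each row sums to zero
X_back = inverse_clr_sketch(Y)
```

The zero row sums are the reason the CLR output keeps D columns while carrying only D-1 degrees of freedom.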

pyrolite.comp.codata.ILR(X: ndarray, psi=None, **kwargs)[source]

Isometric Log Ratio transformation.

Parameters:
  • X (numpy.ndarray) – Array on which to perform the transformation, of shape (N, D).

  • psi (numpy.ndarray) – Array or matrix representing the ILR basis; optionally specified.

Returns:

ILR-transformed array, of shape (N, D-1).

Return type:

numpy.ndarray

pyrolite.comp.codata.inverse_ILR(Y: ndarray, X: ndarray = None, psi=None, **kwargs)[source]

Inverse Isometric Log Ratio transformation.

Parameters:
  • Y (numpy.ndarray) – Array on which to perform the inverse transformation, of shape (N, D-1).

  • X (numpy.ndarray, None) – Optional specification for an array from which to derive the orthonormal basis, with shape (N, D).

  • psi (numpy.ndarray) – Array or matrix representing the ILR basis; optionally specified.

Returns:

Inverse-ILR transformed array, of shape (N, D).

Return type:

numpy.ndarray
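The role of the psi basis can be illustrated with a small NumPy sketch. The Helmert-type basis built by helmert_basis is an assumption chosen for the example and is not necessarily the default basis pyrolite derives; all function names here are hypothetical.

```python
import numpy as np


def helmert_basis(D):
    """One possible orthonormal basis of shape (D-1, D); rows sum to zero."""
    psi = np.zeros((D - 1, D))
    for i in range(1, D):
        psi[i - 1, :i] = 1.0 / i
        psi[i - 1, i] = -1.0
        psi[i - 1] *= np.sqrt(i / (i + 1))
    return psi


def ilr_sketch(X, psi):
    """Project the centred log-ratios onto the rows of psi."""
    logX = np.log(np.asarray(X, dtype=float))
    clr = logX - logX.mean(axis=1, keepdims=True)
    return clr @ psi.T


def inverse_ilr_sketch(Y, psi):
    """Map coordinates back to centred log-ratios, exponentiate and close to 1."""
    X = np.exp(np.asarray(Y, dtype=float) @ psi)
    return X / X.sum(axis=1, keepdims=True)


X = np.array([[0.1, 0.3, 0.6], [0.25, 0.25, 0.5]])
psi = helmert_basis(X.shape[1])
Y = ilr_sketch(X, psi)             # shape (N, D-1)
X_back = inverse_ilr_sketch(Y, psi)  # shape (N, D)
```

The same psi must be used for the forward and inverse steps; a different orthonormal basis gives different coordinates for the same composition.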

pyrolite.comp.codata.logratiomean(df, transform=<function CLR>)[source]

Take a mean of log-ratios along the index of a dataframe.

Parameters:
  • df (pandas.DataFrame) – Dataframe from which to compute a mean along the index.

  • transform (callable) – Log transform to use.

  • inverse_transform (callable) – Inverse of log transform.

Returns:

Mean values as a pandas series.

Return type:

pandas.Series
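The idea of a log-ratio mean can be sketched on a plain array, assuming the mean is taken in CLR space and then mapped back to the simplex. This uses NumPy only, the variable names are illustrative, and it does not call pyrolite.

```python
import numpy as np

X = np.array([[0.1, 0.3, 0.6], [0.25, 0.25, 0.5], [0.2, 0.5, 0.3]])

# Forward transform (CLR), then average over samples (rows).
logX = np.log(X)
clr = logX - logX.mean(axis=1, keepdims=True)
mean_clr = clr.mean(axis=0)

# Map the mean back to the simplex (inverse CLR: exponentiate and close to 1).
mean_comp = np.exp(mean_clr) / np.exp(mean_clr).sum()

# For comparison: the column-wise geometric mean, closed to 1.
geo = np.exp(logX.mean(axis=0))
geo_closed = geo / geo.sum()
```

Under these assumptions mean_comp equals geo_closed, which in general differs from the arithmetic mean of the rows.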

pyrolite.comp.codata.get_ALR_labels(df, mode='simple', ind=-1, **kwargs)[source]

Get symbolic labels for ALR coordinates based on dataframe columns.

Parameters:
  • df (pandas.DataFrame) – Dataframe to generate ALR labels for.

  • mode (str) – Mode of label to return (LaTeX, simple).

Returns:

List of ALR coordinates corresponding to dataframe columns.

Return type:

list

Notes

Some variable names are protected in sympy and if used can result in errors. If one of these column names is found, it will be replaced with a title-cased duplicated version of itself (e.g. ‘S’ will be replaced by ‘Ss’).

pyrolite.comp.codata.get_CLR_labels(df, mode='simple', **kwargs)[source]

Get symbolic labels for CLR coordinates based on dataframe columns.

Parameters:
  • df (pandas.DataFrame) – Dataframe to generate CLR labels for.

  • mode (str) – Mode of label to return (LaTeX, simple).

Returns:

List of CLR coordinates corresponding to dataframe columns.

Return type:

list

Notes

Some variable names are protected in sympy and if used can result in errors. If one of these column names is found, it will be replaced with a title-cased duplicated version of itself (e.g. ‘S’ will be replaced by ‘Ss’).

pyrolite.comp.codata.get_ILR_labels(df, mode='latex', **kwargs)[source]

Get symbolic labels for ILR coordinates based on dataframe columns.

Parameters:
  • df (pandas.DataFrame) – Dataframe to generate ILR labels for.

  • mode (str) – Mode of label to return (LaTeX, simple).

Returns:

List of ILR coordinates corresponding to dataframe columns.

Return type:

list

Notes

Some variable names are protected in sympy and if used can result in errors. If one of these column names is found, it will be replaced with a title-cased duplicated version of itself (e.g. ‘S’ will be replaced by ‘Ss’).

pyrolite.comp.codata.boxcox(X: ndarray, lmbda=None, lmbda_search_space=(-1, 5), search_steps=100, return_lmbda=False)[source]

Box-Cox transformation.

Parameters:
  • X (numpy.ndarray) – Array on which to perform the transformation.

  • lmbda (numpy.number, None) – Lambda value used to forward-transform values. If None, it will be calculated using the mean.

  • lmbda_search_space (tuple) – Range tuple (min, max).

  • search_steps (int) – Steps for lambda search range.

  • return_lmbda (bool) – Whether to also return the lambda value.

Returns:

Box-Cox transformed array. If return_lmbda is True, a tuple of the transformed data and the lambda value is returned.

Return type:

numpy.ndarray | tuple (numpy.ndarray, float)

pyrolite.comp.codata.inverse_boxcox(Y: ndarray, lmbda)[source]

Inverse Box-Cox transformation.

Parameters:
  • Y (numpy.ndarray) – Array on which to perform the transformation.

  • lmbda (float) – Lambda value used to forward-transform values.

Returns:

Inverse Box-Cox transformed array.

Return type:

numpy.ndarray
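For a fixed lambda, the Box-Cox transform and its inverse reduce to the short formulas sketched below. The function names are hypothetical, and the lambda search controlled by lmbda_search_space and search_steps is deliberately left out.

```python
import numpy as np


def boxcox_sketch(X, lmbda):
    """(x**lmbda - 1) / lmbda, with the log as the limiting case lmbda = 0."""
    X = np.asarray(X, dtype=float)
    if lmbda == 0:
        return np.log(X)
    return (X ** lmbda - 1.0) / lmbda


def inverse_boxcox_sketch(Y, lmbda):
    """Invert the transform using the same lambda as the forward step."""
    Y = np.asarray(Y, dtype=float)
    if lmbda == 0:
        return np.exp(Y)
    return (lmbda * Y + 1.0) ** (1.0 / lmbda)


X = np.array([1.0, 4.0, 9.0])
Y = boxcox_sketch(X, lmbda=0.5)
X_back = inverse_boxcox_sketch(Y, lmbda=0.5)
```

The inverse is only meaningful with the lambda used for the forward transform, which is why returning it alongside the data can be useful.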

pyrolite.comp.codata.sphere(ys)[source]

Spherical coordinate transformation for compositional data.

Parameters:

ys (numpy.ndarray) – Compositional data to transform (shape (n, D)).

Returns:

θ – Array of angles in radians (\((0, \pi / 2]\))

Return type:

numpy.ndarray

Notes

numpy.arccos() returns angles in the range \([0, \pi]\). This shouldn’t be an issue for this function given that the input values are all positive.

pyrolite.comp.codata.inverse_sphere(θ)[source]

Inverse spherical coordinate transformation to revert back to compositional data in the simplex.

Parameters:

θ (numpy.ndarray) – Angular coordinates to revert.

Returns:

ys – Compositional (simplex) coordinates, normalised to 1.

Return type:

numpy.ndarray
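One way to picture the spherical transform pair: the square roots of a closed composition form a unit vector in the positive orthant, which can be described by D-1 angles. The sketch below uses a standard hyperspherical convention; the ordering of the angles is an assumption and may differ from pyrolite's, and the function names are hypothetical.

```python
import numpy as np


def sphere_sketch(X):
    """Angles (radians) of the unit vectors sqrt(closed X), shape (n, D-1)."""
    X = np.asarray(X, dtype=float)
    u = np.sqrt(X / X.sum(axis=1, keepdims=True))
    # tail[:, k] is the norm of the components from k to the end.
    tail = np.sqrt(np.cumsum((u ** 2)[:, ::-1], axis=1)[:, ::-1])
    return np.arccos(u[:, :-1] / tail[:, :-1])


def inverse_sphere_sketch(theta):
    """Rebuild the unit vectors from the angles and square them."""
    theta = np.asarray(theta, dtype=float)
    n, p = theta.shape
    u = np.ones((n, p + 1))
    u[:, :-1] = np.cos(theta)
    # Component k is multiplied by sin(theta_1) * ... * sin(theta_k).
    u[:, 1:] *= np.cumprod(np.sin(theta), axis=1)
    return u ** 2  # rows sum to 1


X = np.array([[0.1, 0.3, 0.6], [0.25, 0.25, 0.5]])
theta = sphere_sketch(X)
X_back = inverse_sphere_sketch(theta)
```

With strictly positive inputs every angle falls strictly between 0 and π/2, so the arccos range never becomes a concern in this sketch.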

pyrolite.comp.codata.compositional_cosine_distances(arr)[source]

Calculate a distance matrix corresponding to the angles between a number of compositional vectors.

Parameters:

arr (numpy.ndarray) – Array of n-dimensional compositions of shape (n_samples, n).

Returns:

Array of angular distances of shape (n_samples, n_samples).

Return type:

numpy.ndarray
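A generic angular distance matrix can be sketched as follows, assuming the angle is measured directly between the row vectors; whether pyrolite first applies a closure or another transform is not stated here, so treat this as an illustration only. The function name is hypothetical.

```python
import numpy as np


def angular_distances_sketch(arr):
    """Pairwise angles (radians) between the rows of arr."""
    arr = np.asarray(arr, dtype=float)
    unit = arr / np.linalg.norm(arr, axis=1, keepdims=True)
    # Clip to guard against rounding slightly outside [-1, 1].
    cosines = np.clip(unit @ unit.T, -1.0, 1.0)
    return np.arccos(cosines)


arr = np.array([[0.2, 0.3, 0.5], [0.4, 0.6, 1.0], [0.6, 0.3, 0.1]])
dist = angular_distances_sketch(arr)  # shape (n_samples, n_samples)
```

The first two rows differ only by a scale factor, so the angle between them is zero up to rounding; this shows that the measure depends on proportions rather than totals.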

pyrolite.comp.codata.get_transforms(name)[source]

Lookup a transform-inverse transform pair by name.

Parameters:

name (str) – Name of the transform pair (e.g. 'CLR').

Returns:

tfm, inv_tfm – Transform and inverse transform functions.

Return type:

callable