pyrolite.comp.codata

pyrolite.comp.codata.close(X: numpy.ndarray, sumf=<function sum>)[source]

Closure operator for compositional data.

Parameters

X (numpy.ndarray) – Array to close.

sumf (callable, numpy.sum()) – Sum function to use for closure.

Returns

Closed array.

Return type

numpy.ndarray

Notes

Does not check for non-positive entries.
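As a minimal sketch of what closure means, the example below divides each row by its sum so that every row totals 1. It is a NumPy-only illustration rather than pyrolite's source, and the name `close_sketch` is hypothetical.

```python
import numpy as np


def close_sketch(X, sumf=np.sum):
    """Divide each row by its sum so every row totals 1 (hypothetical helper)."""
    X = np.atleast_2d(np.asarray(X, dtype=float))
    # sumf is called with axis=1, so it must accept an `axis` keyword.
    return X / sumf(X, axis=1)[:, np.newaxis]


X = np.array([[2.0, 3.0, 5.0], [1.0, 1.0, 2.0]])
closed = close_sketch(X)
```

Because no check is made for non-positive entries, zeros or negative values pass straight through the division, so it may be worth validating inputs before closing them.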

pyrolite.comp.codata.renormalise(df: DataFrame, components: list = [], scale=100.0)[source]

Renormalises compositional data to ensure closure.

Parameters

df (pandas.DataFrame) – Dataframe to renormalise.

components (list) – Optional subcomposition to renormalise to 100. Useful when compositional and non-compositional data are stored in the same dataframe.

scale (float, 100.) – Closure parameter. Typically either 100 or 1.

Returns

Renormalised dataframe.

Return type

pandas.DataFrame
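The sketch below illustrates the subcomposition use case with plain pandas: only the listed columns are rescaled, and other columns are left untouched. The helper `renormalise_sketch` and the column names are hypothetical, and the behaviour shown is an assumption based on the parameter descriptions above rather than a copy of pyrolite's implementation.

```python
import pandas as pd


def renormalise_sketch(df, components=None, scale=100.0):
    """Rescale the chosen columns so each row sums to `scale` (hypothetical helper)."""
    out = df.copy()
    # Fall back to all columns when no subcomposition is given.
    cols = list(components) if components else list(df.columns)
    totals = out[cols].sum(axis=1)
    out[cols] = out[cols].div(totals, axis=0) * scale
    return out


df = pd.DataFrame(
    {"SiO2": [50.0, 60.0], "MgO": [10.0, 5.0], "Depth_m": [12.0, 30.0]}
)
result = renormalise_sketch(df, components=["SiO2", "MgO"])
```

Here `Depth_m` stands in for a non-compositional column that should not take part in the closure.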

pyrolite.comp.codata.ALR(X: ndarray, ind: int = -1, null_col=False)[source]

Additive Log Ratio transformation.

Parameters

X (numpy.ndarray) – Array on which to perform the transformation, of shape (N, D).

ind (int) – Index of column used as denominator.

null_col (bool) – Whether to keep the redundant column.

Returns

ALR-transformed array, of shape (N, D-1).

Return type

numpy.ndarray
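To make the shape change from (N, D) to (N, D-1) concrete, here is a small NumPy sketch of the additive log-ratio: every remaining column is divided by the denominator column before taking logs. `alr_sketch` is a hypothetical name, and the sketch omits the `null_col` option.

```python
import numpy as np


def alr_sketch(X, ind=-1):
    """log(x_i / x_ind) for every column except the denominator (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    ind = ind % X.shape[1]  # map negative indices onto 0..D-1
    keep = [i for i in range(X.shape[1]) if i != ind]
    return np.log(X[:, keep] / X[:, [ind]])


X = np.array([[0.2, 0.3, 0.5], [0.1, 0.1, 0.8]])
Y = alr_sketch(X)  # last column used as the denominator
```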

pyrolite.comp.codata.inverse_ALR(Y: ndarray, ind=-1, null_col=False)[source]

Inverse Additive Log Ratio transformation.

Parameters

Y (numpy.ndarray) – Array on which to perform the inverse transformation, of shape (N, D-1).

ind (int) – Index of column used as denominator.

null_col (bool, False) – Whether the array contains an extra redundant column (i.e. shape is (N, D)).

Returns

Inverse-ALR transformed array, of shape (N, D).

Return type

numpy.ndarray
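The inverse can be sketched by re-inserting the denominator column as log(1) = 0, exponentiating, and closing each row. This is an illustrative NumPy version under that assumption; `inverse_alr_sketch` is a hypothetical name and the `null_col` option is not covered.

```python
import numpy as np


def inverse_alr_sketch(Y, ind=-1):
    """Map ALR coordinates of shape (N, D-1) back to closed compositions (N, D)."""
    Y = np.asarray(Y, dtype=float)
    D = Y.shape[1] + 1
    ind = ind % D  # position of the denominator in the output
    # The denominator's own log-ratio is log(x_ind / x_ind) = 0.
    full = np.insert(Y, ind, 0.0, axis=1)
    X = np.exp(full)
    return X / X.sum(axis=1, keepdims=True)


X = np.array([[0.2, 0.3, 0.5], [0.1, 0.1, 0.8]])
Y = np.log(X[:, :-1] / X[:, [-1]])  # forward ALR with the last column as denominator
X_back = inverse_alr_sketch(Y)
```

The round trip recovers `X` only up to closure, so the inputs here already sum to 1.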

pyrolite.comp.codata.CLR(X: ndarray)[source]

Centred Log Ratio transformation.

Parameters

X (numpy.ndarray) – 2D array on which to perform the transformation, of shape (N, D).

Returns

CLR-transformed array, of shape (N, D).

Return type

numpy.ndarray
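A compact way to picture the centred log-ratio is as the log of each component minus the row mean of the logs, which is the log of the row's geometric mean. The sketch below shows this in NumPy; `clr_sketch` is a hypothetical name, not pyrolite's function.

```python
import numpy as np


def clr_sketch(X):
    """Subtract each row's mean log from its logs (hypothetical helper)."""
    logX = np.log(np.asarray(X, dtype=float))
    return logX - logX.mean(axis=1, keepdims=True)


X = np.array([[0.2, 0.3, 0.5], [0.1, 0.1, 0.8]])
Y = clr_sketch(X)
row_sums = Y.sum(axis=1)  # CLR coordinates sum to zero in each row
```

The zero-sum constraint is why the output keeps shape (N, D) while carrying only D-1 degrees of freedom.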

pyrolite.comp.codata.inverse_CLR(Y: ndarray)[source]

Inverse Centred Log Ratio transformation.

Parameters

Y (numpy.ndarray) – Array on which to perform the inverse transformation, of shape (N, D).

Returns

Inverse-CLR transformed array, of shape (N, D).

Return type

numpy.ndarray
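Inverting the CLR amounts to exponentiating and closing, since the subtracted geometric-mean term becomes a common row factor that closure removes. The following NumPy round trip is a sketch with hypothetical helper names.

```python
import numpy as np


def clr_sketch(X):
    logX = np.log(np.asarray(X, dtype=float))
    return logX - logX.mean(axis=1, keepdims=True)


def inverse_clr_sketch(Y):
    """Exponentiate CLR coordinates and close each row to sum to 1."""
    X = np.exp(np.asarray(Y, dtype=float))
    return X / X.sum(axis=1, keepdims=True)


X = np.array([[0.2, 0.3, 0.5], [0.1, 0.1, 0.8]])
X_back = inverse_clr_sketch(clr_sketch(X))
```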

pyrolite.comp.codata.ILR(X: ndarray, psi=None, **kwargs)[source]

Isometric Log Ratio transformation.

Parameters

X (numpy.ndarray) – Array on which to perform the transformation, of shape (N, D).

psi (numpy.ndarray) – Array or matrix representing the ILR basis; optionally specified.

Returns

ILR-transformed array, of shape (N, D-1).

Return type

numpy.ndarray

pyrolite.comp.codata.inverse_ILR(Y: ndarray, X: Optional[ndarray] = None, psi=None, **kwargs)[source]

Inverse Isometric Log Ratio transformation.

Parameters

Y (numpy.ndarray) – Array on which to perform the inverse transformation, of shape (N, D-1).

X (numpy.ndarray, None) – Optional specification for an array from which to derive the orthonormal basis, with shape (N, D).

psi (numpy.ndarray) – Array or matrix representing the ILR basis; optionally specified.

Returns

Inverse-ILR transformed array, of shape (N, D).

Return type

numpy.ndarray
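The ILR and its inverse can be sketched as a projection of CLR coordinates onto an orthonormal basis `psi` of shape (D-1, D) and back. The Helmert-style basis built below is one common choice and is an assumption here: pyrolite's default basis may differ, and all function names are hypothetical.

```python
import numpy as np


def helmert_basis(D):
    """Orthonormal (D-1, D) basis whose rows each sum to zero (one possible psi)."""
    psi = np.zeros((D - 1, D))
    for i in range(1, D):
        psi[i - 1, :i] = np.sqrt(1.0 / (i * (i + 1)))
        psi[i - 1, i] = -np.sqrt(i / (i + 1.0))
    return psi


def ilr_sketch(X, psi):
    logX = np.log(np.asarray(X, dtype=float))
    clr = logX - logX.mean(axis=1, keepdims=True)
    return clr @ psi.T  # shape (N, D-1)


def inverse_ilr_sketch(Y, psi):
    clr = np.asarray(Y, dtype=float) @ psi  # back to zero-sum CLR coordinates
    X = np.exp(clr)
    return X / X.sum(axis=1, keepdims=True)


X = np.array([[0.2, 0.3, 0.5], [0.1, 0.1, 0.8]])
psi = helmert_basis(X.shape[1])
Y = ilr_sketch(X, psi)
X_back = inverse_ilr_sketch(Y, psi)
```

The same `psi` has to be used in both directions. With an orthonormal basis, Euclidean distances between rows of `Y` match those between the corresponding CLR rows, which is the sense in which the transform is isometric.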

pyrolite.comp.codata.logratiomean(df, transform=<function CLR>)[source]

Take a mean of log-ratios along the index of a dataframe.

Parameters

df (pandas.DataFrame) – Dataframe from which to compute a mean along the index.

transform (callable) – Log transform to use.

inverse_transform (callable) – Inverse of log transform.

Returns

Mean values as a pandas series.

Return type

pandas.Series
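As an illustration of a log-ratio mean, the sketch below averages CLR coordinates down the index and maps the result back to the simplex. It assumes the mean is back-transformed with the inverse of the chosen transform, as the inverse_transform entry suggests. The helper names and column labels are hypothetical.

```python
import numpy as np
import pandas as pd


def clr(X):
    logX = np.log(X)
    return logX - logX.mean(axis=1, keepdims=True)


def inverse_clr(Y):
    X = np.exp(Y)
    return X / X.sum(axis=1, keepdims=True)


def logratiomean_sketch(df):
    """Mean in CLR space, returned as a closed composition (hypothetical helper)."""
    mean_coords = clr(df.to_numpy(dtype=float)).mean(axis=0, keepdims=True)
    return pd.Series(inverse_clr(mean_coords)[0], index=df.columns)


df = pd.DataFrame({"A": [0.2, 0.1], "B": [0.3, 0.4], "C": [0.5, 0.5]})
centre = logratiomean_sketch(df)
```

With the CLR, this result coincides with the closed column-wise geometric mean, which generally differs from the arithmetic mean of the raw proportions.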

pyrolite.comp.codata.get_ALR_labels(df, mode='simple', ind=-1, **kwargs)[source]

Get symbolic labels for ALR coordinates based on dataframe columns.

Parameters

df (pandas.DataFrame) – Dataframe to generate ALR labels for.

mode (str) – Mode of label to return (LaTeX, simple).

Returns

List of ALR coordinates corresponding to dataframe columns.

Return type

list

Notes

Some variable names are protected in sympy and if used can result in errors. If one of these column names is found, it will be replaced with a title-cased duplicated version of itself (e.g. ‘S’ will be replaced by ‘Ss’).

pyrolite.comp.codata.get_CLR_labels(df, mode='simple', **kwargs)[source]

Get symbolic labels for CLR coordinates based on dataframe columns.

Parameters

df (pandas.DataFrame) – Dataframe to generate CLR labels for.

mode (str) – Mode of label to return (LaTeX, simple).

Returns

List of CLR coordinates corresponding to dataframe columns.

Return type

list

Notes

Some variable names are protected in sympy and if used can result in errors. If one of these column names is found, it will be replaced with a title-cased duplicated version of itself (e.g. ‘S’ will be replaced by ‘Ss’).

pyrolite.comp.codata.get_ILR_labels(df, mode='latex', **kwargs)[source]

Get symbolic labels for ILR coordinates based on dataframe columns.

Parameters

df (pandas.DataFrame) – Dataframe to generate ILR labels for.

mode (str) – Mode of label to return (LaTeX, simple).

Returns

List of ILR coordinates corresponding to dataframe columns.

Return type

list

Notes

Some variable names are protected in sympy and if used can result in errors. If one of these column names is found, it will be replaced with a title-cased duplicated version of itself (e.g. ‘S’ will be replaced by ‘Ss’).

pyrolite.comp.codata.boxcox(X: ndarray, lmbda=None, lmbda_search_space=(-1, 5), search_steps=100, return_lmbda=False)[source]

Box-Cox transformation.

Parameters

X (numpy.ndarray) – Array on which to perform the transformation.

lmbda (numpy.number, None) – Lambda value used to forward-transform values. If None, it will be calculated using the mean.

lmbda_search_space (tuple) – Range tuple (min, max).

search_steps (int) – Steps for lambda search range.

return_lmbda (bool) – Whether to also return the lambda value.

Returns

Box-Cox transformed array. If return_lmbda is true, tuple contains data and lambda value.

Return type

numpy.ndarray | tuple(numpy.ndarray, float)

pyrolite.comp.codata.inverse_boxcox(Y: ndarray, lmbda)[source]

Inverse Box-Cox transformation.

Parameters

Y (numpy.ndarray) – Array on which to perform the transformation.

lmbda (float) – Lambda value used to forward-transform values.

Returns

Inverse Box-Cox transformed array.

Return type

numpy.ndarray
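For a fixed lambda, the Box-Cox transform and its inverse follow the standard textbook formulas, sketched below in NumPy. The lambda search is not reproduced here, the function names are hypothetical, and inputs are assumed to be strictly positive.

```python
import numpy as np


def boxcox_sketch(X, lmbda):
    """(x**lmbda - 1) / lmbda, or log(x) when lmbda is 0."""
    X = np.asarray(X, dtype=float)
    if lmbda == 0:
        return np.log(X)
    return (X ** lmbda - 1.0) / lmbda


def inverse_boxcox_sketch(Y, lmbda):
    """(lmbda * y + 1)**(1 / lmbda), or exp(y) when lmbda is 0."""
    Y = np.asarray(Y, dtype=float)
    if lmbda == 0:
        return np.exp(Y)
    return (lmbda * Y + 1.0) ** (1.0 / lmbda)


X = np.array([1.0, 2.0, 4.0])
Y = boxcox_sketch(X, 0.5)
X_back = inverse_boxcox_sketch(Y, 0.5)
```

The inverse needs the same lambda that was used for the forward transform, which is why returning it alongside the data with `return_lmbda=True` can be convenient.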

pyrolite.comp.codata.sphere(ys)[source]

Spherical coordinate transformation for compositional data.

Parameters

ys (numpy.ndarray) – Compositional data to transform (shape (n, D)).

Returns

θ – Array of angles in radians (\((0, \pi / 2]\))

Return type

numpy.ndarray

Notes

numpy.arccos() returns angles in the range \([0, \pi]\). This shouldn’t be an issue for this function, given that the input values are all positive.

pyrolite.comp.codata.inverse_sphere(θ)[source]

Inverse spherical coordinate transformation to revert back to compositional data in the simplex.

Parameters

θ (numpy.ndarray) – Angular coordinates to revert.

Returns

ys – Compositional (simplex) coordinates, normalised to 1.

Return type

numpy.ndarray
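One way to picture the spherical transform and its inverse: the square roots of a closed composition form a unit vector in the positive orthant, which can be written in hyperspherical angles. The sketch below uses a standard angle convention, which is an assumption. pyrolite's ordering of the angles may differ, and the function names are hypothetical.

```python
import numpy as np


def sphere_sketch(ys):
    """Angles (n, D-1) of the unit vectors sqrt(closed ys)."""
    ys = np.asarray(ys, dtype=float)
    u = np.sqrt(ys / ys.sum(axis=1, keepdims=True))  # rows have unit length
    n, D = u.shape
    theta = np.zeros((n, D - 1))
    sin_prod = np.ones(n)  # running product of sines of earlier angles
    for k in range(D - 1):
        theta[:, k] = np.arccos(np.clip(u[:, k] / sin_prod, -1.0, 1.0))
        sin_prod = sin_prod * np.sin(theta[:, k])
    return theta


def inverse_sphere_sketch(theta):
    """Rebuild the unit vectors from the angles and square them."""
    theta = np.asarray(theta, dtype=float)
    n, p = theta.shape
    u = np.ones((n, p + 1))
    sin_prod = np.ones(n)
    for k in range(p):
        u[:, k] = sin_prod * np.cos(theta[:, k])
        sin_prod = sin_prod * np.sin(theta[:, k])
    u[:, p] = sin_prod
    return u ** 2  # squared unit-vector entries sum to 1


ys = np.array([[0.2, 0.3, 0.5], [0.1, 0.1, 0.8]])
theta = sphere_sketch(ys)
ys_back = inverse_sphere_sketch(theta)
```

For strictly positive inputs, every angle in this convention falls strictly between 0 and \(\pi / 2\).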

pyrolite.comp.codata.compositional_cosine_distances(arr)[source]

Calculate a distance matrix corresponding to the angles between a number of compositional vectors.

Parameters

arr (numpy.ndarray) – Array of n-dimensional compositions of shape (n_samples, n).

Returns

Array of angular distances of shape (n_samples, n_samples).

Return type

numpy.ndarray
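A pairwise angular distance matrix can be sketched as the arccosine of the cosine similarity between rows. The version below takes angles between the raw composition vectors, which is an assumption: pyrolite may transform the data before measuring angles. `angular_distances_sketch` is a hypothetical name.

```python
import numpy as np


def angular_distances_sketch(arr):
    """Pairwise angles (radians) between the rows of `arr`."""
    arr = np.asarray(arr, dtype=float)
    unit = arr / np.linalg.norm(arr, axis=1, keepdims=True)
    # Clip to guard against rounding slightly outside [-1, 1].
    cos = np.clip(unit @ unit.T, -1.0, 1.0)
    return np.arccos(cos)


arr = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.2, 0.3, 0.5]])
dist = angular_distances_sketch(arr)
```

The result is symmetric with a near-zero diagonal, and rows with no shared non-zero components sit \(\pi / 2\) apart.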

pyrolite.comp.codata.get_transforms(name)[source]

Lookup a transform-inverse transform pair by name.

Parameters

name (str) – Name of the transform pair (e.g. 'CLR').

Returns

tfm, inv_tfm – Transform and inverse transform functions.

Return type

callable