Unit Scaling

import numpy as np
import pandas as pd

import pyrolite.geochem

pd.set_option("display.precision", 3)  # smaller outputs

Here we create an example dataframe to work from, containing some synthetic data in the form of oxides and elements, each with different units (wt% and ppm, respectively).

from pyrolite.util.synthetic import normal_frame

df = normal_frame(
    columns=["CaO", "MgO", "SiO2", "FeO", "Ni", "Ti", "La", "Lu"], seed=22
)
df.pyrochem.oxides *= 100  # oxides in wt%
df.pyrochem.elements *= 10000  # elements in ppm

In this case, we might want to transform the Ni and Ti into their standard oxide equivalents NiO and TiO2:

df.pyrochem.convert_chemistry(to=["NiO", "TiO2"]).head(2)
        NiO      TiO2
0  1235.195  2388.220
1  1301.093  2164.623


But here, because Ni and Ti are in ppm, the results are nonsensical, especially when they are combined with the other oxides, which are in wt%:

df.pyrochem.convert_chemistry(
    to=["CaO", "MgO", "SiO2", "FeO", "NiO", "TiO2"]
).head(2)
     CaO    MgO    SiO2    FeO       NiO      TiO2
0  4.887  5.527  33.705  3.236  1235.195  2388.220
1  5.323  6.056  33.733  2.760  1301.093  2164.623

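The element-to-oxide conversion is, in essence, a rescaling by a molar-mass ratio, so the output carries whatever units the input had. As a rough check on the numbers above, here is a minimal sketch of that arithmetic in plain Python. It is not pyrolite's implementation: the helper name `oxide_factor` is hypothetical, the atomic masses are rounded standard values, and the inputs are the first-row Ni and Ti values of the synthetic frame (in ppm), so the results should land close to, but not exactly on, the 1235.195 and 2388.220 shown in the table.

```python
# Rounded standard atomic masses (g/mol); assumed values for illustration.
ATOMIC_MASS = {"Ni": 58.6934, "Ti": 47.867, "O": 15.999}


def oxide_factor(element, n_oxygen, n_cations=1):
    """Mass ratio of an oxide to the cation(s) it contains (hypothetical helper)."""
    cation_mass = n_cations * ATOMIC_MASS[element]
    oxide_mass = cation_mass + n_oxygen * ATOMIC_MASS["O"]
    return oxide_mass / cation_mass


ni_ppm, ti_ppm = 970.613, 1431.363  # first row of the synthetic data, in ppm
nio_ppm = ni_ppm * oxide_factor("Ni", n_oxygen=1)  # Ni -> NiO
tio2_ppm = ti_ppm * oxide_factor("Ti", n_oxygen=2)  # Ti -> TiO2
print(round(nio_ppm, 1), round(tio2_ppm, 1))
```

Because the factor is dimensionless, ppm in gives ppm out, which is why the NiO and TiO2 columns look out of place beside oxides reported in wt%.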

There are multiple ways we could convert the units, but here we’ll first convert the elemental ppm data to wt%, then perform our element-to-oxide conversion. To do this, we’ll use pyrolite’s scale() function, which gives the multiplicative factor between two units:

from pyrolite.util.units import scale

df.pyrochem.elements *= scale("ppm", "wt%")

We can see that this gives us values which are consistent with the other oxides:

df.pyrochem.convert_chemistry(
    to=["CaO", "MgO", "SiO2", "FeO", "NiO", "TiO2"]
).head(2)
     CaO    MgO    SiO2    FeO    NiO   TiO2
0  4.887  5.527  33.705  3.236  0.124  0.239
1  5.323  6.056  33.733  2.760  0.130  0.216

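The factor applied above can be reasoned out directly: 1 wt% is one part in 10^2 and 1 ppm is one part in 10^6, so converting ppm to wt% multiplies by 10^-4. The sketch below spells that out in plain Python. `UNIT_FRACTION` and `scale_factor` are hypothetical names used only for illustration; this is not how pyrolite's scale() is implemented.

```python
import math

# Mass fraction represented by one unit of each scale (illustrative table).
UNIT_FRACTION = {"wt%": 1e-2, "ppm": 1e-6, "ppb": 1e-9}


def scale_factor(in_unit, out_unit):
    """Multiplier taking a value expressed in in_unit to out_unit."""
    return UNIT_FRACTION[in_unit] / UNIT_FRACTION[out_unit]


factor = scale_factor("ppm", "wt%")
nio_wt_pct = 1235.195 * factor  # first-row NiO from the earlier table, ppm -> wt%
print(math.isclose(factor, 1e-4), round(nio_wt_pct, 3))
```

The rescaled NiO value rounds to the 0.124 shown in the table, whether the scaling is applied before or after the element-to-oxide conversion, since both steps are simple multiplications.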

Dealing with Units in Column Names

Often our dataframes will start out with column names which pyrolite doesn’t recognize natively, such as names carrying unit suffixes (handling these is a work in progress and an item on the roadmap). Here we create an example of that and go through some key steps for using this data in pyrolite:

df = normal_frame(
    columns=["CaO", "MgO", "SiO2", "FeO", "Ni", "Ti", "La", "Lu"], seed=22
)
df.pyrochem.oxides *= 100  # oxides in wt%
df.pyrochem.elements *= 10000  # elements in ppm
df = df.rename(
    columns={
        **{c: c + "_wt%" for c in df.pyrochem.oxides},
        **{c: c + "_ppm" for c in df.pyrochem.elements},
    }
)
df.head(2)
   CaO_wt%  MgO_wt%  SiO2_wt%  FeO_wt%    Ni_ppm    Ti_ppm    La_ppm   Lu_ppm
0    4.887    5.527    33.705    3.236   970.613  1431.363  2454.951  407.684
1    5.323    6.056    33.733    2.760  1022.395  1297.351  2543.004  350.085


If you just want to rescale some columns, you can do so without renaming them, e.g. converting all of the ppm columns to wt% (note that the column labels still read _ppm afterwards):

df.loc[:, [c for c in df.columns if "_ppm" in c]] *= scale("ppm", "wt%")
df.head(2)
   CaO_wt%  MgO_wt%  SiO2_wt%  FeO_wt%  Ni_ppm  Ti_ppm  La_ppm  Lu_ppm
0    4.887    5.527    33.705    3.236   0.097   0.143   0.245   0.041
1    5.323    6.056    33.733    2.760   0.102   0.130   0.254   0.035

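One caveat with rescaling in place is that the labels can drift out of step with the values. A possible way to keep them aligned is to rename the columns in the same step as the rescaling. The sketch below uses plain pandas on a two-row stand-in frame named `demo` (a hypothetical name; values copied from the first rows of the example data), with the ppm to wt% multiplier written out as the literal 1e-4 rather than taken from pyrolite.

```python
import pandas as pd

# Stand-in frame with unit-suffixed columns (first rows of the example data).
demo = pd.DataFrame(
    {
        "SiO2_wt%": [33.705, 33.733],
        "Ni_ppm": [970.613, 1022.395],
        "Ti_ppm": [1431.363, 1297.351],
    }
)

ppm_cols = [c for c in demo.columns if c.endswith("_ppm")]
demo[ppm_cols] = demo[ppm_cols] * 1e-4  # ppm -> wt%
# Relabel so that the suffix matches the new units.
demo = demo.rename(columns={c: c[: -len("_ppm")] + "_wt%" for c in ppm_cols})
print(demo.round(3))
```

Matching on `endswith("_ppm")` rather than `"_ppm" in c` is a small design choice that avoids accidentally catching a column whose name merely contains that substring.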

However, to access the full native capability of pyrolite, including methods like convert_chemistry(), we’d need to rename these columns so that they contain only the component names:

units = {  # keep a copy of the units, we can use these to map back later
    c: c[c.find("_") + 1 :] if "_" in c else None for c in df.columns
}
df = df.rename(
    columns={c: c.replace("_wt%", "").replace("_ppm", "") for c in df.columns}
)
df.head(2)
     CaO    MgO    SiO2    FeO     Ni     Ti     La     Lu
0  4.887  5.527  33.705  3.236  0.097  0.143  0.245  0.041
1  5.323  6.056  33.733  2.760  0.102  0.130  0.254  0.035

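The `units` dictionary above is keyed by the original, suffixed column names. If a lookup by bare component name is more convenient when mapping units back, one option is to split each name once and keep both parts. This is a minimal sketch in plain Python: `split_unit` is a hypothetical helper, not part of pyrolite, it assumes a single underscore separates the name from the unit, and the "Total" column is an invented example of a name with no unit suffix.

```python
def split_unit(column):
    """Split 'SiO2_wt%' into ('SiO2', 'wt%'); names without a suffix get None."""
    name, sep, unit = column.partition("_")
    return name, (unit if sep else None)


columns = ["CaO_wt%", "SiO2_wt%", "Ni_ppm", "La_ppm", "Total"]
units = dict(split_unit(c) for c in columns)  # bare name -> unit (or None)

# Rebuild the labelled names from the bare names and their recorded units.
relabelled = [n if u is None else f"{n}_{u}" for n, u in units.items()]
print(units)
print(relabelled == columns)
```

Keeping the mapping keyed by bare name means it can still be applied after the columns have been stripped of their suffixes.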

We could then perform our chemistry conversion, rename our columns to include units, and optionally export to e.g. CSV:

converted_wt_pct = df.pyrochem.convert_chemistry(
    to=["CaO", "MgO", "SiO2", "FeO", "NiO", "TiO2"]
)
converted_wt_pct.head(2)
     CaO    MgO    SiO2    FeO    NiO   TiO2
0  4.887  5.527  33.705  3.236  0.124  0.239
1  5.323  6.056  33.733  2.760  0.130  0.216


Here we rename the columns before we export them, just so we know explicitly what the units are:

converted_wt_pct = converted_wt_pct.rename(
    columns={c: c + "_wt%" for c in converted_wt_pct.columns}
)
converted_wt_pct.head(2)
   CaO_wt%  MgO_wt%  SiO2_wt%  FeO_wt%  NiO_wt%  TiO2_wt%
0    4.887    5.527    33.705    3.236    0.124     0.239
1    5.323    6.056    33.733    2.760    0.130     0.216


converted_wt_pct.to_csv("converted_wt_pct.csv")
