pyrolite.plot.density

Kernel density estimation plots for geochemical data.

pyrolite.plot.density.density(arr, ax=None, logx=False, logy=False, bins=25, mode='density', extent=None, contours=[], percentiles=True, relim=True, cmap=<matplotlib.colors.ListedColormap object>, shading='auto', vmin=0.0, colorbar=False, **kwargs)

Creates a diagrammatic representation of data density and/or frequency for either binary diagrams (X-Y) or ternary plots. Additional arguments are typically forwarded to the respective matplotlib functions pcolormesh(), hist2d(), hexbin(), contour(), and contourf() (see Other Parameters, below).

Parameters
  • arr (numpy.ndarray) – Array from which to draw data.

  • ax (matplotlib.axes.Axes, None) – The subplot to draw on.

  • logx (bool, False) – Whether to use a logspaced grid on the x axis. Values strictly >0 required.

  • logy (bool, False) – Whether to use a logspaced grid on the y axis. Values strictly >0 required.

  • bins (int, 25) – Number of bins used in the gridded functions (histograms, KDE evaluation grid).

  • mode (str, ‘density’) – Plotting mode to use, one of [‘density’, ‘hexbin’, ‘hist2d’].

  • extent (list) – Predetermined extent of the grid over which to form the histogram/KDE, in the general form (xmin, xmax, ymin, ymax).

  • contours (list) – Contours to add to the plot, where mode='density' is used.

  • percentiles (bool, True) – Whether contours specified are to be converted to percentiles.

  • relim (bool, True) – Whether to relimit the plot based on xmin, xmax values.

  • cmap (matplotlib.colors.Colormap) – Colormap for mapping surfaces.

  • vmin (float, 0.) – Minimum value for colormap.

  • shading (str, ‘auto’) – Shading to apply to pcolormesh.

  • colorbar (bool, False) – Whether to append a linked colorbar to the generated mappable image.
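
To illustrate what a log-spaced grid means for logx and logy, here is a minimal sketch in plain NumPy. It is not pyrolite's internal code: the helper name grid_edges is hypothetical, and spanning the grid from the data minimum to the data maximum is an assumption made for the example.

```python
import numpy as np


def grid_edges(values, bins=25, log=False):
    """Return bins + 1 edges spanning the data, linearly or log-spaced.

    Hypothetical helper for illustration only.
    """
    values = np.asarray(values, dtype=float)
    lo, hi = values.min(), values.max()
    if log:
        if lo <= 0:
            raise ValueError("log-spaced grids need strictly positive values")
        return np.logspace(np.log10(lo), np.log10(hi), bins + 1)
    return np.linspace(lo, hi, bins + 1)


rng = np.random.default_rng(0)
x = rng.lognormal(mean=0.0, sigma=1.0, size=500)  # strictly positive sample

edges = grid_edges(x, bins=25, log=True)
ratios = edges[1:] / edges[:-1]  # constant ratio between neighbouring edges
```

Successive edges differ by a constant ratio rather than a constant step, which is why zero or negative values cannot be placed on such a grid.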

Note

The following additional parameters are from matplotlib.pyplot.pcolormesh().

Other Parameters
  • C (array-like) – The mesh data. Supported array shapes are:

    • (M, N) or M*N: a mesh with scalar data. The values are mapped to colors using normalization and a colormap. See parameters norm, cmap, vmin, vmax.

    • (M, N, 3): an image with RGB values (0-1 float or 0-255 int).

    • (M, N, 4): an image with RGBA values (0-1 float or 0-255 int), i.e. including transparency.

    The first two dimensions (M, N) define the rows and columns of the mesh data.

  • X, Y (array-like, optional) –

    The coordinates of the corners of quadrilaterals of a pcolormesh:

    (X[i+1, j], Y[i+1, j])       (X[i+1, j+1], Y[i+1, j+1])
                          ●╶───╴●
                          │     │
                          ●╶───╴●
        (X[i, j], Y[i, j])       (X[i, j+1], Y[i, j+1])
    

    Note that the column index corresponds to the x-coordinate, and the row index corresponds to y. For details, see the Notes section below.

    If shading='flat' the dimensions of X and Y should be one greater than those of C, and the quadrilateral is colored due to the value at C[i, j]. If X, Y and C have equal dimensions, a warning will be raised and the last row and column of C will be ignored.

    If shading='nearest' or 'gouraud', the dimensions of X and Y should be the same as those of C (if not, a ValueError will be raised). For 'nearest' the color C[i, j] is centered on (X[i, j], Y[i, j]). For 'gouraud', a smooth interpolation is carried out between the quadrilateral corners.

    If X and/or Y are 1-D arrays or column vectors they will be expanded as needed into the appropriate 2D arrays, making a rectangular grid.

  • norm (str or matplotlib.colors.Normalize, optional) – The normalization method used to scale scalar data to the [0, 1] range before mapping to colors using cmap. By default, a linear scaling is used, mapping the lowest value to 0 and the highest to 1.

    If given, this can be one of the following:

    • An instance of Normalize or one of its subclasses (see Colormap normalization).

    • A scale name, i.e. one of “linear”, “log”, “symlog”, “logit”, etc. For a list of available scales, call matplotlib.scale.get_scale_names(). In that case, a suitable Normalize subclass is dynamically generated and instantiated.

  • vmin, vmax (float, optional) – When using scalar data and no explicit norm, vmin and vmax define the data range that the colormap covers. By default, the colormap covers the complete value range of the supplied data. It is an error to use vmin/vmax when a norm instance is given (but using a str norm name together with vmin/vmax is acceptable).

  • edgecolors ({‘none’, None, ‘face’, color, color sequence}, optional) – The color of the edges. Defaults to ‘none’. The singular form edgecolor works as an alias.

  • alpha (float, default: None) – The alpha blending value, between 0 (transparent) and 1 (opaque).

  • snap (bool, default: False) – Whether to snap the mesh to pixel boundaries.

  • rasterized (bool, optional) – Rasterize the pcolormesh when drawing vector graphics. This can speed up rendering and produce smaller files for large data sets. See also the rasterization demo in the matplotlib gallery (/gallery/misc/rasterization_demo).
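
The shape rules for X, Y and C under the different shading options can be checked with a small matplotlib-only sketch. The array sizes and the vmin value are arbitrary choices for illustration, and the calls go directly to matplotlib rather than through pyrolite.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Scalar mesh data with M = 3 rows and N = 4 columns.
C = np.arange(12, dtype=float).reshape(3, 4)

# shading='flat' expects cell corners: N + 1 values in x, M + 1 in y.
x_edges = np.linspace(0.0, 4.0, 5)
y_edges = np.linspace(0.0, 3.0, 4)

# shading='nearest' expects cell centres: same lengths as the axes of C.
x_centres = 0.5 * (x_edges[:-1] + x_edges[1:])
y_centres = 0.5 * (y_edges[:-1] + y_edges[1:])

fig, (ax_flat, ax_nearest) = plt.subplots(1, 2, figsize=(8, 3))
flat = ax_flat.pcolormesh(x_edges, y_edges, C, shading="flat")
# vmin pins the lower end of the colormap; vmax is taken from the data.
nearest = ax_nearest.pcolormesh(
    x_centres, y_centres, C, shading="nearest", vmin=2.0
)
fig.colorbar(nearest, ax=ax_nearest)
```

The 1-D coordinate arrays are expanded into a rectangular grid by matplotlib, so no explicit meshgrid call is needed here.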

Note

The following additional parameters are from matplotlib.pyplot.hist2d().

Other Parameters
  • x, y (array-like, shape (n, )) – Input values

  • range (array-like shape(2, 2), optional) – The leftmost and rightmost edges of the bins along each dimension (if not specified explicitly in the bins parameters): [[xmin, xmax], [ymin, ymax]]. All values outside of this range will be considered outliers and not tallied in the histogram.

  • density (bool, default: False) – Normalize histogram. See the documentation for the density parameter of Axes.hist for more details.

  • weights (array-like, shape (n, ), optional) – An array of values w_i weighing each sample (x_i, y_i).

  • cmin, cmax (float, default: None) – All bins that have a count less than cmin or more than cmax will not be displayed (set to NaN before passing to Axes.pcolormesh), and these count values in the returned count histogram will also be set to NaN.
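
The effect of range and cmin can be sketched with numpy.histogram2d, which matplotlib's hist2d uses for the binning step. The masking line below only emulates the cmin behaviour described above; the sample data, bin count and threshold are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=1000)
y = rng.normal(size=1000)

# Only samples inside [-2, 2] x [-2, 2] are tallied; the rest are ignored.
counts, xedges, yedges = np.histogram2d(
    x, y, bins=25, range=[[-2, 2], [-2, 2]]
)
n_inside = np.count_nonzero((np.abs(x) <= 2) & (np.abs(y) <= 2))

# Emulate cmin: bins holding fewer than cmin samples become NaN (not drawn).
cmin = 2
shown = np.where(counts < cmin, np.nan, counts)
```

Here counts.sum() equals n_inside, confirming that samples outside the range are dropped rather than clipped into the edge bins.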

Note

The following additional parameters are from matplotlib.pyplot.hexbin().

Other Parameters
  • x, y (array-like) – The data positions. x and y must be of the same length.

  • C (array-like, optional) – If given, these values are accumulated in the bins. Otherwise, every point has a value of 1. Must be of the same length as x and y.

  • gridsize (int or (int, int), default: 100) – If a single int, the number of hexagons in the x-direction. The number of hexagons in the y-direction is chosen such that the hexagons are approximately regular.

    Alternatively, if a tuple (nx, ny), the number of hexagons in the x-direction and the y-direction. In the y-direction, counting is done along vertically aligned hexagons, not along the zig-zag chains of hexagons.

    To get approximately regular hexagons, choose \(n_x = \sqrt{3}\,n_y\).

  • xscale ({‘linear’, ‘log’}, default: ‘linear’) – Use a linear or log10 scale on the horizontal axis.

  • yscale ({‘linear’, ‘log’}, default: ‘linear’) – Use a linear or log10 scale on the vertical axis.

  • mincnt (int >= 0, default: None) – If not None, only display cells with at least mincnt number of points in the cell.

  • marginals (bool, default: False) – If marginals is True, plot the marginal density as colormapped rectangles along the bottom of the x-axis and left of the y-axis.
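
As a rough illustration of the gridsize tuple form and mincnt, the sketch below calls matplotlib's hexbin directly. The sample data and the choice of ny are assumptions made for the example; nx follows the n_x = sqrt(3) n_y rule quoted above, rounded to an integer.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=2000)
y = rng.normal(size=2000)

ny = 15
nx = int(round(np.sqrt(3) * ny))  # n_x = sqrt(3) * n_y, rounded

fig, ax = plt.subplots()
# mincnt=1 hides empty hexagons, so only occupied cells are drawn.
hb = ax.hexbin(x, y, gridsize=(nx, ny), mincnt=1)
counts = hb.get_array()  # one value per displayed hexagon
```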

Note

The following additional parameters are from matplotlib.pyplot.contour().

Other Parameters
  • X, Y (array-like, optional) – The coordinates of the values in Z.

    X and Y must both be 2D with the same shape as Z (e.g. created via numpy.meshgrid), or they must both be 1-D such that len(X) == N is the number of columns in Z and len(Y) == M is the number of rows in Z.

    X and Y must both be ordered monotonically.

    If not given, they are assumed to be integer indices, i.e. X = range(N), Y = range(M).

  • Z ((M, N) array-like) – The height values over which the contour is drawn. Color-mapping is controlled by cmap, norm, vmin, and vmax.

  • levels (int or array-like, optional) – Determines the number and positions of the contour lines / regions.

    If an int n, use matplotlib.ticker.MaxNLocator, which tries to automatically choose no more than n+1 “nice” contour levels between minimum and maximum numeric values of Z.

    If array-like, draw contour lines at the specified levels. The values must be in increasing order.

Note

The following additional parameters are from matplotlib.pyplot.contourf().

Other Parameters
  • X, Y (array-like, optional) – The coordinates of the values in Z.

    X and Y must both be 2D with the same shape as Z (e.g. created via numpy.meshgrid), or they must both be 1-D such that len(X) == N is the number of columns in Z and len(Y) == M is the number of rows in Z.

    X and Y must both be ordered monotonically.

    If not given, they are assumed to be integer indices, i.e. X = range(N), Y = range(M).

  • Z ((M, N) array-like) – The height values over which the contour is drawn. Color-mapping is controlled by cmap, norm, vmin, and vmax.

  • levels (int or array-like, optional) – Determines the number and positions of the contour lines / regions.

    If an int n, use matplotlib.ticker.MaxNLocator, which tries to automatically choose no more than n+1 “nice” contour levels between minimum and maximum numeric values of Z.

    If array-like, draw contour lines at the specified levels. The values must be in increasing order.
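
The X, Y, Z shape conventions shared by contour() and contourf() can be seen in a short matplotlib-only sketch. The surface and the level values are arbitrary illustrative choices, not anything pyrolite computes.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(-3.0, 3.0, 40)  # len(x) == N, the number of columns in Z
y = np.linspace(-2.0, 2.0, 30)  # len(y) == M, the number of rows in Z
X, Y = np.meshgrid(x, y)
Z = np.exp(-0.5 * (X**2 + Y**2))  # shape (M, N) == (30, 40)

levels = [0.1, 0.4, 0.7]  # explicit levels must be in increasing order

fig, ax = plt.subplots()
filled = ax.contourf(x, y, Z, levels=levels)  # filled bands between levels
lines = ax.contour(x, y, Z, levels=levels, colors="k")  # lines at each level
```

Passing the 1-D x and y is equivalent to passing the 2-D X and Y from meshgrid; both satisfy the shape requirements listed above.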

Returns
  • matplotlib.axes.Axes – Axes on which the density plot is drawn.

Notes

The default density estimates and derived contours are generated based on kernel density estimates. Assumptions around e.g. 95% of points lying within a 95% contour won’t necessarily be valid for non-normally distributed data (instead, this represents the approximate 95th percentile on the kernel density estimate). Note that contours are currently only generated for mode="density"; future updates may allow the use of a histogram basis, which would give results closer to 95% data percentiles.
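
The distinction between a percentile of the kernel density estimate and a percentile of the data can be made concrete with a NumPy-only sketch. This is not pyrolite's implementation: the fixed-bandwidth Gaussian kernel, the hand-picked bandwidth h, the grid extent, and the helper names kde_at and level_enclosing are all assumptions made for illustration.

```python
import numpy as np


def kde_at(points, data, h):
    """Fixed-bandwidth isotropic Gaussian KDE of `data`, evaluated at `points`."""
    d2 = (points[:, None, 0] - data[None, :, 0]) ** 2
    d2 = d2 + (points[:, None, 1] - data[None, :, 1]) ** 2
    kernel = np.exp(-0.5 * d2 / h**2)
    return kernel.sum(axis=1) / (2.0 * np.pi * h**2 * len(data))


def level_enclosing(Z, fraction):
    """Density level whose super-level set holds `fraction` of the gridded mass."""
    z = np.sort(Z.ravel())[::-1]  # highest densities first
    cumulative = np.cumsum(z) / z.sum()  # mass enclosed as the level drops
    return z[np.searchsorted(cumulative, fraction)]


rng = np.random.default_rng(42)
data = rng.lognormal(mean=0.0, sigma=0.5, size=(400, 2))  # skewed, non-normal

# Regular evaluation grid covering the bulk of the data.
gx = np.linspace(0.0, 6.0, 60)
gy = np.linspace(0.0, 6.0, 60)
GX, GY = np.meshgrid(gx, gy)
grid = np.column_stack([GX.ravel(), GY.ravel()])

h = 0.25  # bandwidth picked by hand for this sketch
Z = kde_at(grid, data, h).reshape(GX.shape)

# Level enclosing 95% of the KDE mass, then the share of samples above it.
level95 = level_enclosing(Z, 0.95)
inside = float(np.mean(kde_at(data, data, h) >= level95))
print(f"95% KDE level: {level95:.4f}; data fraction inside: {inside:.3f}")
```

The printed data fraction will generally be close to, but not exactly, 0.95, which is the caveat the note above describes. A level found this way could be drawn with contour() by passing levels=[level95].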

Todo

  • Allow generation of contours from histogram data, rather than just the kernel density estimate.

  • Implement an option and filter to ‘scatter’ points below the minimum threshold or maximum percentile contours.