pyrolite.util.missing

pyrolite.util.missing.md_pattern(Y)

Get the missing data patterns from an array.

Parameters

Y (numpy.ndarray | pandas.DataFrame) – Input dataset.

Returns

  • pattern_ids (numpy.ndarray) – Pattern ID array.

  • pattern_dict (dict) – Dictionary of patterns indexed by pattern IDs. Contains a pattern and count for each pattern ID.
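As an illustration of the idea, the sketch below groups rows by which columns hold non-finite values, using plain NumPy. It is not pyrolite's implementation: the helper name `missing_patterns`, the dictionary keys `"pattern"` and `"count"`, and the ordering of the pattern IDs are all assumptions made for this example.

```python
import numpy as np


def missing_patterns(Y):
    """Hypothetical helper: label each row with the ID of its missing-data pattern."""
    Y = np.asarray(Y, dtype=float)
    missing = ~np.isfinite(Y)  # True where a value is missing (NaN or inf)
    # Each unique row of the boolean mask is one pattern; `inverse` maps
    # every input row to the index of its pattern.
    patterns, inverse, counts = np.unique(
        missing, axis=0, return_inverse=True, return_counts=True
    )
    pattern_ids = np.asarray(inverse).reshape(-1)
    pattern_dict = {
        i: {"pattern": patterns[i], "count": int(counts[i])}
        for i in range(len(patterns))
    }
    return pattern_ids, pattern_dict


# Small example: rows 0 and 2 are both missing only the last column.
Y = np.array(
    [
        [1.0, 2.0, np.nan],
        [3.0, np.nan, np.nan],
        [4.0, 5.0, np.nan],
        [6.0, 7.0, 8.0],
    ]
)
pattern_ids, pattern_dict = missing_patterns(Y)
```

Rows that share a pattern receive the same ID, so `pattern_ids` can be used to split a dataset into blocks that can each be handled without further missing-value checks.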

pyrolite.util.missing.cooccurence_pattern(Y, normalize=False, log=False)

Get the co-occurrence patterns from an array.

Parameters
  • Y (numpy.ndarray | pandas.DataFrame) – Input dataset.

  • normalize (bool) – Whether to normalize the co-occurrence so that disparate variables can be compared.

  • log (bool) – Whether to take the logarithm of the co-occurrence.

Returns

co_occur – Co-occurrence frequency array.

Return type

numpy.ndarray
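A co-occurrence frequency array of this kind can be sketched in plain NumPy as follows. This is an illustration under stated assumptions, not pyrolite's code: the helper name `cooccurrence_counts` is hypothetical, a value is treated as present when it is finite, and the normalization shown (dividing each entry by the larger of the two variables' own counts) is only one plausible choice.

```python
import numpy as np


def cooccurrence_counts(Y, normalize=False, log=False):
    """Hypothetical helper: count how often pairs of columns are both present."""
    Y = np.asarray(Y, dtype=float)
    present = np.isfinite(Y).astype(float)  # 1.0 where a value is present
    # Entry (i, j) is the number of rows where columns i and j are both present.
    co_occur = present.T @ present
    if normalize:
        # Assumed normalization: divide by the larger of the two diagonal counts.
        diag = np.diagonal(co_occur)
        denom = np.maximum.outer(diag, diag)
        co_occur = np.divide(
            co_occur, denom, out=np.zeros_like(co_occur), where=denom > 0
        )
    if log:
        # Zero counts become -inf; suppress the divide-by-zero warning.
        with np.errstate(divide="ignore"):
            co_occur = np.log(co_occur)
    return co_occur


Y = np.array(
    [
        [1.0, 2.0, np.nan],
        [3.0, np.nan, np.nan],
        [4.0, 5.0, np.nan],
        [6.0, 7.0, 8.0],
    ]
)
counts = cooccurrence_counts(Y)
normalized = cooccurrence_counts(Y, normalize=True)
```

The diagonal of `counts` holds the number of present values per column, and the off-diagonal entries show how often each pair of variables is reported together.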