pept.utilities.find_minpoints

pept.utilities.find_minpoints(const double[:, :] sample_lines, Py_ssize_t num_lines, double max_distance, const double[:] cutoffs, bool append_indices=0)

Compute the minimum distance points (MDPs) from all combinations of num_lines lines given in an array of lines sample_lines.

Function signature:
    find_minpoints(
        double[:, :] sample_lines,  # LoRs in sample
        Py_ssize_t num_lines,       # Number of LoRs in combinations
        double max_distance,        # Max distance from MDP to LoRs
        double[:] cutoffs,          # Spatial cutoff for minpoints
        bool append_indices = 0     # Append LoR indices used
    )

Given a sample of lines, this function computes the minimum distance points (MDPs) for every possible combination of num_lines lines. The returned numpy array contains all MDPs that satisfy both of the following:

  1. Are within the cutoffs.

  2. Are closer to all the constituent LoRs than max_distance.
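The two acceptance conditions above can be illustrated with a short NumPy sketch. This is not the Cython implementation; `point_line_distance` and `accept_mdp` are hypothetical helper names, and treating the cutoff and distance bounds as inclusive is an assumption.

```python
import numpy as np


def point_line_distance(point, p1, p2):
    """Distance from `point` to the infinite line through `p1` and `p2`."""
    direction = p2 - p1
    return np.linalg.norm(np.cross(point - p1, direction)) / np.linalg.norm(direction)


def accept_mdp(mdp, lines, max_distance, cutoffs):
    """Hypothetical check of the two conditions for one candidate MDP.

    `lines` rows are [t, x1, y1, z1, x2, y2, z2]; `cutoffs` is
    [x_min, x_max, y_min, y_max, z_min, z_max]. Inclusive bounds are assumed.
    """
    mdp = np.asarray(mdp, dtype=float)
    lines = np.asarray(lines, dtype=float)

    # Condition 1: the MDP lies within the spatial cutoffs.
    inside = all(
        cutoffs[2 * i] <= mdp[i] <= cutoffs[2 * i + 1] for i in range(3)
    )

    # Condition 2: the MDP is within max_distance of every constituent line.
    close = all(
        point_line_distance(mdp, line[1:4], line[4:7]) <= max_distance
        for line in lines
    )
    return bool(inside and close)


# Two lines that cross at (1, 1, 1): one along x, one along y.
crossing_lines = np.array([
    [0.0, 0.0, 1.0, 1.0, 2.0, 1.0, 1.0],
    [1.0, 1.0, 0.0, 1.0, 1.0, 2.0, 1.0],
])
box = np.array([0.0, 2.0, 0.0, 2.0, 0.0, 2.0])

kept = accept_mdp([1.0, 1.0, 1.0], crossing_lines, 0.1, box)
too_far = accept_mdp([1.0, 1.0, 1.5], crossing_lines, 0.1, box)
```

Here the first candidate sits on both lines and is kept, while the second is 0.5 away from each line and is discarded.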

Parameters
sample_lines : (M, N) numpy.ndarray

A 2D array of lines, where each line is defined by two points such that every row is formatted as [t, x1, y1, z1, x2, y2, z2, etc.]. It must contain at least 2 lines, and the combination size num_lines must be smaller than or equal to the number of lines. Put differently: 2 <= num_lines <= len(sample_lines).

num_lines : int

The number of lines in each combination of LoRs used to compute the MDP. This function considers every combination of num_lines lines from the input sample_lines. It must be smaller than or equal to the number of input lines in sample_lines.
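Because every combination is considered, the amount of work grows quickly with the sample size. The sketch below uses only the Python standard library to count and enumerate combinations; the sample sizes are illustrative, and the order in which the Cython function visits combinations is not stated in this documentation, so the enumeration order shown is an assumption.

```python
import itertools
import math

# Number of MDP candidates for a sample of 500 lines taken 3 at a time.
n_combinations = math.comb(500, 3)
print(n_combinations)  # 20708500

# For a tiny sample of 4 lines taken 2 at a time, list the index tuples.
index_combos = list(itertools.combinations(range(4), 2))
print(index_combos)
# [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
```

Keeping num_lines small (or splitting the data into smaller samples) keeps the number of candidates manageable.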

max_distance : float

The maximum allowed distance between an MDP and its constituent lines. If any distance from the MDP to one of its lines is larger than max_distance, the MDP is thrown away.

cutoffs : (6,) numpy.ndarray

An array of spatial cutoff coordinates with exactly 6 elements as [x_min, x_max, y_min, y_max, z_min, z_max]. If any MDP lies outside this region, it is thrown away.
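One plausible way to build such an array is to take the bounding box of the line endpoints. This is only a sketch: `cutoffs_from_lines` is a hypothetical helper, and this documentation does not say how the higher-level APIs derive their cutoffs.

```python
import numpy as np


def cutoffs_from_lines(lines):
    """Hypothetical helper: bounding box of all line endpoints.

    `lines` rows are [t, x1, y1, z1, x2, y2, z2]. Returns
    [x_min, x_max, y_min, y_max, z_min, z_max] as float64.
    """
    lines = np.asarray(lines, dtype=float)
    # Stack both endpoints of every line into one (2 * M, 3) array of points.
    points = np.vstack((lines[:, 1:4], lines[:, 4:7]))
    mins = points.min(axis=0)
    maxs = points.max(axis=0)
    # Interleave as min, max per axis.
    return np.column_stack((mins, maxs)).ravel()


example_lines = np.array([
    [0.0, 0.0, 1.0, 2.0, 3.0, 4.0, 5.0],
    [1.0, -1.0, 0.0, 9.0, 2.0, 2.0, 2.0],
])
cutoffs = cutoffs_from_lines(example_lines)
```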

append_indices : bool

A boolean specifying whether to include the indices of the lines used to compute each MDP. If False, the output array will only contain the [time, x, y, z] of the MDPs. If True, the output array will have extra columns [time, x, y, z, line_idx(1), …, line_idx(n)] where n = num_lines.

Returns
minpoints : (M, N) numpy.ndarray

A 2D array of floats containing the time and coordinates of the MDPs [time, x, y, z]. The time is computed as the average of the times of the constituent lines. If append_indices is True, then the num_lines indices of the constituent lines are appended as extra columns: [time, x, y, z, line_idx1, line_idx2, ..].
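As a sketch of how the returned columns might be unpacked when append_indices is True, the snippet below splits a hand-written array with that layout. `split_minpoints` is a hypothetical helper and the numbers are made up for illustration; they are not output from find_minpoints.

```python
import numpy as np


def split_minpoints(minpoints, num_lines):
    """Hypothetical helper: split [time, x, y, z, idx1, ..., idxn] columns."""
    minpoints = np.asarray(minpoints, dtype=float)
    times = minpoints[:, 0]
    positions = minpoints[:, 1:4]
    # The indices are stored as floats in the array; cast them back to int.
    line_indices = minpoints[:, 4:4 + num_lines].astype(int)
    return times, positions, line_indices


# Made-up array with the documented layout for num_lines = 2.
example = np.array([
    [1.5, 10.0, 20.0, 30.0, 0.0, 2.0],
    [2.5, 11.0, 21.0, 31.0, 1.0, 3.0],
])
times, positions, line_indices = split_minpoints(example, num_lines=2)
```

The integer indices can then be used to look up the constituent rows in sample_lines, for example `sample_lines[line_indices[0]]`.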

Notes

There must be at least two lines in sample_lines, and num_lines must be smaller than or equal to the number of lines (i.e. len(sample_lines)). Put another way: 2 <= num_lines <= len(sample_lines).

This is a low-level Cython function that does not check its input data; it is meant to be called from other modules and libraries. For a normal user, the pept.tracking.peptml function find_minpoints and class Minpoints are recommended as higher-level APIs. They do check the input data and are easier to use (for example, they automatically compute the cutoffs).
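When calling the low-level function directly, the constraints listed on this page can be checked beforehand. The following is a minimal sketch of such a guard, not part of pept: `check_minpoints_inputs` is a hypothetical helper, and requiring at least 7 columns is inferred from the row format [t, x1, y1, z1, x2, y2, z2].

```python
import numpy as np


def check_minpoints_inputs(sample_lines, num_lines, max_distance, cutoffs):
    """Hypothetical guard: validate and convert inputs before the Cython call."""
    # The signature uses double memoryviews, so convert to float64.
    sample_lines = np.asarray(sample_lines, dtype=np.float64)
    cutoffs = np.asarray(cutoffs, dtype=np.float64)

    if sample_lines.ndim != 2 or sample_lines.shape[1] < 7:
        raise ValueError("sample_lines must have shape (M, N) with N >= 7")
    if len(sample_lines) < 2:
        raise ValueError("sample_lines must contain at least 2 lines")
    if not 2 <= num_lines <= len(sample_lines):
        raise ValueError("need 2 <= num_lines <= len(sample_lines)")
    if cutoffs.shape != (6,):
        raise ValueError("cutoffs must have exactly 6 elements")

    return sample_lines, int(num_lines), float(max_distance), cutoffs


# The converted values can then be passed on to find_minpoints.
args = check_minpoints_inputs(
    np.zeros((5, 7)), 3, 0.1, [0, 500, 0, 500, 0, 500]
)
```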

Examples

>>> import numpy as np
>>> from pept.utilities import find_minpoints
>>>
>>> lines = np.random.random((500, 7)) * 500
>>> num_lines = 3
>>> max_distance = 0.1
>>> cutoffs = np.array([0, 500, 0, 500, 0, 500], dtype = float)
>>>
>>> minpoints = find_minpoints(lines, num_lines, max_distance, cutoffs)