pept.tracking.HDBSCAN

class pept.tracking.HDBSCAN(true_fraction, max_tracers=1)

Bases: PointDataFilter

Use HDBSCAN to cluster some pept.PointData and append a cluster label to each point.

Filter signature:

PointData -> HDBSCAN.fit_sample -> PointData

The only free parameter to select is true_fraction, a relative measure of the ratio of inliers to outliers. A noisy sample (e.g. the first pass of clustering cutpoints) may need a value of 0.15. A cleaned-up dataset (e.g. a second pass of clustering) can work with 0.6.

You can also set max_tracers (default 1), the maximum number of tracers visible in the system at any one time. It acts simply as an inverse scaling factor; true_fraction itself is quite robust to varying numbers of tracers.
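To make the interplay of the two parameters concrete, here is a minimal sketch of one plausible reading of "inverse scaling factor". It assumes that true_fraction is converted into a minimum cluster size for each sample and that max_tracers divides it. The helper name min_cluster_size and the floor of 2 points are hypothetical and are not taken from the pept source.

```python
def min_cluster_size(num_points, true_fraction, max_tracers=1):
    """Hypothetical helper (not part of pept): smallest group of points
    that would be accepted as a cluster in a sample of num_points."""
    if not 0 < true_fraction <= 1:
        raise ValueError("true_fraction must be in (0, 1]")
    if max_tracers < 1:
        raise ValueError("max_tracers must be at least 1")

    # Assumption: max_tracers acts as an inverse scaling factor, so the
    # expected inliers are shared between the tracers.
    size = round(true_fraction * num_points / max_tracers)

    # Assumption: a cluster needs at least two points.
    return max(size, 2)


noisy = min_cluster_size(1000, 0.15)                     # first, noisy pass
clean = min_cluster_size(1000, 0.6)                      # second, cleaner pass
two_tracers = min_cluster_size(1000, 0.6, max_tracers=2)
print(noisy, clean, two_tracers)
```

Under these assumptions, doubling max_tracers halves the number of points a cluster must contain, while true_fraction keeps its meaning as the share of points expected to be inliers.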

__init__(true_fraction, max_tracers=1)

Methods

__init__(true_fraction[, max_tracers])

copy([deep]): Create a deep copy of an instance of this class, including all inner attributes.

fit(point_data[, executor, max_workers, verbose]): Apply self.fit_sample (implemented by subclasses) according to the execution policy.

fit_sample(sample_points)

load(filepath): Load a saved / pickled PEPTObject object from filepath.

save(filepath): Save a PEPTObject instance as a binary pickle object.

copy(deep=True)

Create a deep copy of an instance of this class, including all inner attributes.

fit(point_data, executor='joblib', max_workers=None, verbose=True)

Apply self.fit_sample (implemented by subclasses) according to the execution policy. Simply return a list of processed samples. If you need a reduction step (e.g. stack all processed samples), apply it in the subclass.
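The pattern described here, mapping fit_sample over every sample and returning the results as a list, can be sketched as follows. This is an illustration only: SketchFilter is a hypothetical stand-in class, and a standard-library ThreadPoolExecutor is used in place of the default 'joblib' executor so that the example has no third-party dependencies.

```python
from concurrent.futures import ThreadPoolExecutor


class SketchFilter:
    """Hypothetical stand-in for a filter; not part of pept."""

    def fit_sample(self, sample):
        # Placeholder per-sample processing: keep non-negative values only.
        return [x for x in sample if x >= 0]

    def fit(self, samples, max_workers=None):
        # Apply fit_sample to every sample in parallel and return the
        # processed samples as a list, in the same order as the input.
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            return list(pool.map(self.fit_sample, samples))


samples = [[1, -2, 3], [-4, 5], []]
processed = SketchFilter().fit(samples)
print(processed)
```

Any reduction step, such as stacking the processed samples into a single object, would be applied to the returned list afterwards.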

static load(filepath)

Load a saved / pickled PEPTObject object from filepath.

Most often the full object state was saved using the .save method.

Parameters
filepath : filename or file handle

If filepath is a path (rather than a file handle), it is relative to where Python is called.

Returns
pept.PEPTObject subclass instance

The loaded object.

Examples

Save a LineData instance, then load it back:

>>> lines = pept.LineData([[1, 2, 3, 4, 5, 6, 7]])
>>> lines.save("lines.pickle")
>>> lines_reloaded = pept.LineData.load("lines.pickle")

save(filepath)

Save a PEPTObject instance as a binary pickle object.

Saves the full object state, including inner attributes, in a portable binary format. Load back the object using the load method.

Parameters
filepath : filename or file handle

If filepath is a path (rather than a file handle), it is relative to where Python is called.

Examples

Save a LineData instance, then load it back:

>>> lines = pept.LineData([[1, 2, 3, 4, 5, 6, 7]])
>>> lines.save("lines.pickle")
>>> lines_reloaded = pept.LineData.load("lines.pickle")
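
For readers unfamiliar with pickling, the save / load round trip can be sketched with the standard library alone. This assumes that "binary pickle object" refers to Python's pickle module. The Record class below is a hypothetical stand-in, not a pept class, and it pickles only its attribute dictionary to keep the example self-contained.

```python
import os
import pickle
import tempfile


class Record:
    """Hypothetical stand-in for a PEPTObject; not part of pept."""

    def __init__(self, values):
        self.values = values

    def save(self, filepath):
        # Write the object's state to disk in binary pickle format.
        with open(filepath, "wb") as f:
            pickle.dump(self.__dict__, f)

    @staticmethod
    def load(filepath):
        # Rebuild an instance from the previously saved state.
        with open(filepath, "rb") as f:
            state = pickle.load(f)
        obj = Record.__new__(Record)
        obj.__dict__.update(state)
        return obj


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "record.pickle")
    Record([1, 2, 3]).save(path)
    reloaded = Record.load(path)

print(reloaded.values)
```

As with any pickle file, only load files from sources you trust, since unpickling can execute arbitrary code.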
fit_sample(sample_points)