PEPT-ML#

PEPT using Machine Learning (PEPT-ML) is a modern clustering-based tracking method developed specifically for noisy, fast applications.

If you are using PEPT-ML in your research, you are kindly asked to cite the following paper:

Nicuşan AL, Windows-Yule CR. Positron emission particle tracking using machine learning. Review of Scientific Instruments. 2020 Jan 1;91(1):013329.

PEPT-ML one pass of clustering recipe#

The LoRs are first converted into cutpoints using Cutpoints; HDBSCAN then assigns each cutpoint a cluster label, SplitLabels groups the cutpoints by label, and the clusters’ Centroids are taken as the particle locations. Finally, Stack combines all centroids into a single PointData.

import pept
from pept.tracking import *

max_tracers = 1

pipeline = pept.Pipeline([
    Cutpoints(max_distance = 0.5),                              # LoRs -> cutpoints
    HDBSCAN(true_fraction = 0.15, max_tracers = max_tracers),   # label cutpoints
    SplitLabels() + Centroids(error = True),                    # clusters -> centroids
    Stack(),                                                    # single PointData
])

locations = pipeline.fit(lors)
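
The recipe assumes `lors` is a `pept.LineData` whose samples the pipeline iterates over. As a minimal sketch, it could be built from a raw NumPy array of LoRs; the synthetic lines and the sample_size / overlap values below are illustrative assumptions, not part of the recipe:

import numpy as np
import pept

# Synthetic LoRs: each row is [t, x1, y1, z1, x2, y2, z2]
lines = np.random.uniform(0, 500, (1000, 7))
lines[:, 0] = np.arange(1000)    # monotonically increasing timestamps

# Wrap into a LineData; real data would come from a scanner converter
# such as `pept.scanners.adac_forte` (see the complete recipe below)
lors = pept.LineData(lines, sample_size = 200, overlap = 150)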

PEPT-ML second pass of clustering recipe#

The particle locations will always have a bit of scatter to them; we can tighten those points into accurate, dense trajectories using a second pass of clustering.

Set a very small sample size and the maximum possible overlap (one less than the sample size) to minimise temporal smoothing effects, then recluster the tracer locations, split them according to cluster label, compute the centroids, and stack everything into a final PointData.

import pept
from pept.tracking import *

max_tracers = 1

pipeline = pept.Pipeline([
    # Re-sample the first-pass locations: tiny samples, maximum overlap
    Stack(sample_size = 30 * max_tracers, overlap = 30 * max_tracers - 1),
    HDBSCAN(true_fraction = 0.6, max_tracers = max_tracers),
    SplitLabels() + Centroids(error = True),
    Stack(),
])

# Recluster the locations produced by the first pass of clustering
locations2 = pipeline.fit(locations)
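
To see how much the second pass tightens the first-pass locations, the two outputs can be overlaid on the same interactive plot; a short sketch reusing the variable names from the two recipes above:

from pept.plots import PlotlyGrapher

# Overlay first-pass (scattered) and second-pass (tightened) locations
grapher = PlotlyGrapher()
grapher.add_points(locations)     # first pass of clustering
grapher.add_points(locations2)    # second pass of clustering
grapher.show()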

PEPT-ML complete recipe#

A complete example including ADAC Forte data initialisation, two passes of clustering, trajectory separation, plotting, and saving the trajectories as CSV.

# Import what we need from the `pept` library
import pept
from pept.tracking import *
from pept.plots import PlotlyGrapher, PlotlyGrapher2D


# Open interactive plots in the web browser
import plotly
plotly.io.renderers.default = "browser"


# Initialise data from file and set sample size and overlap
filepath = "DS1.da01"
max_tracers = 1

lors = pept.scanners.adac_forte(
    filepath,
    sample_size = 200 * max_tracers,
    overlap = 150 * max_tracers,
)


# Select only the first 1000 samples of LoRs for testing; comment out for all
lors = lors[:1000]


# Create PEPT-ML processing pipeline
pipeline = pept.Pipeline([

    # First pass of clustering
    Cutpoints(max_distance = 0.2),
    HDBSCAN(true_fraction = 0.15, max_tracers = max_tracers),
    SplitLabels() + Centroids(error = True),

    # Second pass of clustering
    Stack(sample_size = 30 * max_tracers, overlap = 30 * max_tracers - 1),
    HDBSCAN(true_fraction = 0.6, max_tracers = max_tracers),
    SplitLabels() + Centroids(),

    # Trajectory separation
    Segregate(window = 20 * max_tracers, cut_distance = 10),
    Stack(),
])


# Process all samples in `lors` in parallel; by default `fit` uses all
# available threads (tunable via its `max_workers` argument)
trajectories = pipeline.fit(lors)


# Save trajectories as CSV
trajectories.to_csv(filepath + ".csv")

# Save as a fast binary; you can load them back with `pept.load("path")`
trajectories.save(filepath + ".pickle")


# Plot trajectories - first a 2D timeseries, then all 3D positions
PlotlyGrapher2D().add_timeseries(trajectories).show()
PlotlyGrapher().add_points(trajectories).show()
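
The saved binary can be loaded back and inspected directly; a small sketch, assuming the file written above, using the PointData `points` array and `columns` attributes:

# Load the trajectories saved above
trajectories = pept.load(filepath + ".pickle")

# Inspect the underlying NumPy array and its column names
print(trajectories.columns)
print(trajectories.points[:5])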

Example of a Complex Processing Pipeline#

This is an example of “production code” used for tracking tracers in pipe flow imaging, where particles enter and leave the field of view regularly. This pipeline automatically:

  • Sets an optimum adaptive time window.

  • Runs a first pass of clustering, keeping track of the number of LoRs around the tracers (cluster_size) and the relative location error (error).

  • Removes locations with too few LoRs or large errors.

  • Sets a new optimum adaptive time window for a second pass of clustering.

  • Removes spurious points while the tracer is out of the field of view.

  • Separates the different tracer trajectories, removes those with too few points, and groups the remaining locations by trajectory.

  • Computes the tracer velocity at each location on each trajectory.

  • Removes locations at the edges of the detectors.

Each individual step could be an entire program on its own; with the PEPT Pipeline architecture, they can be chained in 17 lines of Python code, automatically using all available processors on the parallelisable sections.

# Create PEPT-ML processing pipeline; assumes the same imports as in the
# complete recipe above
import pept
from pept.tracking import *

pipeline = pept.Pipeline([
    # Set an optimum adaptive time window; Debug prints intermediate data
    OptimizeWindow(200, overlap = 0.5) + Debug(1),

    # First pass of clustering
    Cutpoints(max_distance = 0.2),
    HDBSCAN(true_fraction = 0.15),
    SplitLabels() + Centroids(cluster_size = True, error = True),

    # Remove erroneous points; comma-separated conditions must all hold
    Condition("cluster_size > 30, error < 20"),

    # Second pass of clustering
    OptimizeWindow(30, overlap = 0.95) + Debug(1),
    HDBSCAN(true_fraction = 0.6),
    SplitLabels() + Centroids(),

    # Remove sparse points in time
    OutOfViewFilter(200.),

    # Trajectory separation
    Segregate(window = 20, cut_distance = 20, min_trajectory_size = 20),
    Condition("label >= 0"),
    GroupBy("label"),

    # Velocity computation over an 11-point window: vx, vy, vz columns,
    # then the absolute velocity magnitude
    Velocity(11),
    Velocity(11, absolute = True),

    # Cutoff points outside this region
    Condition("y > 100, y < 500"),

    Stack(),
])
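
As in the previous recipes, the pipeline is run with a single `fit` call; a brief usage sketch, assuming `lors` was initialised as in the complete recipe above (the exact appended column names depend on the pept version):

# Run the whole pipeline in parallel over the LoR samples
trajectories = pipeline.fit(lors)

# Inspect the appended columns (cluster sizes, errors, labels, velocities)
print(trajectories.columns)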