PEPT-ML#

PEPT using Machine Learning (PEPT-ML) is a modern clustering-based tracking method developed specifically for noisy, fast applications.

If you are using PEPT-ML in your research, you are kindly asked to cite the following paper:

Nicuşan AL, Windows-Yule CR. Positron emission particle tracking using machine learning. Review of Scientific Instruments. 2020 Jan 1;91(1):013329.

PEPT-ML one pass of clustering recipe#

The LoRs are first converted into cutpoints using Cutpoints; HDBSCAN then assigns each cutpoint a cluster label, SplitLabels groups the cutpoints by label, and the clusters’ Centroids are taken as the particle locations. Finally, Stack combines all centroids into a single PointData.

import pept
from pept.tracking import *

max_tracers = 1

pipeline = pept.Pipeline([
    Cutpoints(max_distance = 0.5),                              # LoRs -> cutpoints
    HDBSCAN(true_fraction = 0.15, max_tracers = max_tracers),   # label cutpoints
    SplitLabels() + Centroids(error = True),                    # clusters -> centroids
    Stack(),                                                    # single PointData
])

locations = pipeline.fit(lors)
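
The recipe assumes `lors` is a `pept.LineData` whose samples the pipeline iterates over. As a minimal sketch, it could be built from a raw NumPy array of LoRs; the synthetic lines and the sample_size / overlap values below are illustrative assumptions, not part of the recipe:

import numpy as np
import pept

# Synthetic LoRs: each row is [t, x1, y1, z1, x2, y2, z2]
lines = np.random.uniform(0, 500, (1000, 7))
lines[:, 0] = np.arange(1000)    # monotonically increasing timestamps

# Wrap into a LineData; real data would come from a scanner converter
# such as `pept.scanners.adac_forte` (see the complete recipe below)
lors = pept.LineData(lines, sample_size = 200, overlap = 150)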

PEPT-ML second pass of clustering recipe#

The particle locations will always have a bit of scatter to them; we can tighten those points into accurate, dense trajectories using a second pass of clustering.

Set a very small sample size and the maximum possible overlap (one less than the sample size) to minimise temporal smoothing effects, then recluster the tracer locations, split them according to cluster label, compute the centroids, and stack everything into a final PointData.

import pept
from pept.tracking import *

max_tracers = 1

pipeline = pept.Pipeline([
    # Re-sample the first-pass locations: tiny samples, maximum overlap
    Stack(sample_size = 30 * max_tracers, overlap = 30 * max_tracers - 1),
    HDBSCAN(true_fraction = 0.6, max_tracers = max_tracers),
    SplitLabels() + Centroids(error = True),
    Stack(),
])

# Recluster the locations produced by the first pass of clustering
locations2 = pipeline.fit(locations)
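
To see how much the second pass tightens the first-pass locations, the two outputs can be overlaid on the same interactive plot; a short sketch reusing the variable names from the two recipes above:

from pept.plots import PlotlyGrapher

# Overlay first-pass (scattered) and second-pass (tightened) locations
grapher = PlotlyGrapher()
grapher.add_points(locations)     # first pass of clustering
grapher.add_points(locations2)    # second pass of clustering
grapher.show()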

PEPT-ML complete recipe#

A complete example including ADAC Forte data initialisation, two passes of clustering, trajectory separation, plotting, and saving the trajectories as CSV.

# Import what we need from the `pept` library
import pept
from pept.tracking import *
from pept.plots import PlotlyGrapher, PlotlyGrapher2D


# Open interactive plots in the web browser
import plotly
plotly.io.renderers.default = "browser"


# Initialise data from file and set sample size and overlap
filepath = "DS1.da01"
max_tracers = 1

lors = pept.scanners.adac_forte(
    filepath,
    sample_size = 200 * max_tracers,
    overlap = 150 * max_tracers,
)


# Select only the first 1000 samples of LoRs for testing; comment out for all
lors = lors[:1000]


# Create PEPT-ML processing pipeline
pipeline = pept.Pipeline([

    # First pass of clustering
    Cutpoints(max_distance = 0.2),
    HDBSCAN(true_fraction = 0.15, max_tracers = max_tracers),
    SplitLabels() + Centroids(error = True),

    # Second pass of clustering
    Stack(sample_size = 30 * max_tracers, overlap = 30 * max_tracers - 1),
    HDBSCAN(true_fraction = 0.6, max_tracers = max_tracers),
    SplitLabels() + Centroids(),

    # Trajectory separation
    Segregate(window = 20 * max_tracers, cut_distance = 10),
    Stack(),
])


# Process all samples in `lors` in parallel; by default `fit` uses all
# available threads (tunable via its `max_workers` argument)
trajectories = pipeline.fit(lors)


# Save trajectories as CSV
trajectories.to_csv(filepath + ".csv")

# Save as a fast binary; you can load them back with `pept.load("path")`
trajectories.save(filepath + ".pickle")


# Plot trajectories - first a 2D timeseries, then all 3D positions
PlotlyGrapher2D().add_timeseries(trajectories).show()
PlotlyGrapher().add_points(trajectories).show()
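
The saved binary can be loaded back and inspected directly; a small sketch, assuming the file written above, using the PointData `points` array and `columns` attributes:

# Load the trajectories saved above
trajectories = pept.load(filepath + ".pickle")

# Inspect the underlying NumPy array and its column names
print(trajectories.columns)
print(trajectories.points[:5])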

Example of a Complex Processing Pipeline#

This is an example of “production code” used for tracking tracers in pipe flow imaging, where particles enter and leave the field of view regularly. This pipeline automatically:

  • Sets an optimum adaptive time window.

  • Runs a first pass of clustering, keeping track of the number of LoRs around the tracers (cluster_size) and the relative location error (error).

  • Removes locations with too few LoRs or large errors.

  • Sets a new optimum adaptive time window for a second pass of clustering.

  • Removes spurious points while the tracer is out of the field of view.

  • Separates the different tracer trajectories, removes those with too few points, and groups the remaining locations by trajectory.

  • Computes the tracer velocity at each location on each trajectory.

  • Removes locations at the edges of the detectors.

Each individual step could be an entire program on its own; with the PEPT Pipeline architecture, they can be chained in 17 lines of Python code, automatically using all available processors on the parallelisable sections.

# Create PEPT-ML processing pipeline; assumes the same imports as in the
# complete recipe above
import pept
from pept.tracking import *

pipeline = pept.Pipeline([
    # Set an optimum adaptive time window; Debug prints intermediate data
    OptimizeWindow(200, overlap = 0.5) + Debug(1),

    # First pass of clustering
    Cutpoints(max_distance = 0.2),
    HDBSCAN(true_fraction = 0.15),
    SplitLabels() + Centroids(cluster_size = True, error = True),

    # Remove erroneous points; comma-separated conditions must all hold
    Condition("cluster_size > 30, error < 20"),

    # Second pass of clustering
    OptimizeWindow(30, overlap = 0.95) + Debug(1),
    HDBSCAN(true_fraction = 0.6),
    SplitLabels() + Centroids(),

    # Remove sparse points in time
    OutOfViewFilter(200.),

    # Trajectory separation
    Segregate(window = 20, cut_distance = 20, min_trajectory_size = 20),
    Condition("label >= 0"),
    GroupBy("label"),

    # Velocity computation over an 11-point window: vx, vy, vz columns,
    # then the absolute velocity magnitude
    Velocity(11),
    Velocity(11, absolute = True),

    # Cutoff points outside this region
    Condition("y > 100, y < 500"),

    Stack(),
])
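
As in the previous recipes, the pipeline is run with a single `fit` call; a brief usage sketch, assuming `lors` was initialised as in the complete recipe above (the exact appended column names depend on the pept version):

# Run the whole pipeline in parallel over the LoR samples
trajectories = pipeline.fit(lors)

# Inspect the appended columns (cluster sizes, errors, labels, velocities)
print(trajectories.columns)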