# PEPT-ML

PEPT using Machine Learning (PEPT-ML) is a modern clustering-based tracking method that was developed specifically for noisy, fast applications.

If you are using PEPT-ML in your research, you are kindly asked to cite the following paper:

Nicuşan AL, Windows-Yule CR. Positron emission particle tracking using machine learning. Review of Scientific Instruments. 2020 Jan 1;91(1):013329.

## PEPT-ML one pass of clustering recipe

The LoRs are first converted into `Cutpoints`, which are then assigned cluster labels using `HDBSCAN`; the cutpoints are then grouped into clusters using `SplitLabels` and the clusters’ `Centroids` are taken as the particle locations. Finally, stack all centroids into a single `PointData`.

```
import pept
from pept.tracking import *

max_tracers = 1

pipeline = pept.Pipeline([
    Cutpoints(max_distance = 0.5),
    HDBSCAN(true_fraction = 0.15, max_tracers = max_tracers),
    SplitLabels() + Centroids(error = True),
    Stack(),
])

# `lors` holds the lines of response, e.g. loaded with
# `pept.scanners.adac_forte` (see the complete recipe)
locations = pipeline.fit(lors)
```
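To illustrate what the `Cutpoints` step works with, here is a minimal pure-Python sketch of a single cutpoint, assuming it is the midpoint of the shortest segment joining two LoRs and that `max_distance` rejects pairs of lines that pass too far from each other. The function name `cutpoint` and the example coordinates are hypothetical; this is not the library's implementation, which processes whole samples of LoRs at once.

```python
import math


def cutpoint(p1, p2, q1, q2, max_distance):
    """Midpoint of the shortest segment joining two 3D lines.

    Each line is defined by two points: (p1, p2) and (q1, q2). Returns None
    if the lines are (nearly) parallel or further apart than `max_distance`.
    """
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    u = [b - a for a, b in zip(p1, p2)]   # direction of line 1
    v = [b - a for a, b in zip(q1, q2)]   # direction of line 2
    w = [a - b for a, b in zip(p1, q1)]   # vector from q1 to p1

    a, b, c = dot(u, u), dot(u, v), dot(v, v)
    d, e = dot(u, w), dot(v, w)

    denom = a * c - b * b
    if denom < 1e-12 * a * c:             # parallel lines: no unique cutpoint
        return None

    s = (b * e - c * d) / denom           # position of closest point on line 1
    t = (a * e - b * d) / denom           # position of closest point on line 2

    on1 = [p + s * x for p, x in zip(p1, u)]
    on2 = [q + t * x for q, x in zip(q1, v)]

    if math.dist(on1, on2) > max_distance:
        return None
    return [(x + y) / 2 for x, y in zip(on1, on2)]


# Two perpendicular lines passing 0.2 apart (hypothetical coordinates)
line1 = ((0, 0, 0), (1, 0, 0))
line2 = ((0.5, -1, 0.2), (0.5, 1, 0.2))
print(cutpoint(*line1, *line2, max_distance=0.5))
```

Lowering `max_distance` below 0.2 in this example makes the function return `None`, which is the sense in which a smaller `max_distance` keeps only cutpoints from lines that nearly intersect.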

## PEPT-ML second pass of clustering recipe

The particle locations will always have a bit of *scatter* to them; we can *tighten* those points into accurate, dense trajectories using a *second pass of clustering*.

Set a very small sample size and maximum overlap to minimise temporal smoothing effects, then recluster the tracer locations, split according to cluster label, compute centroids, and stack into a final `PointData`.

```
import pept
from pept.tracking import *

max_tracers = 1

pipeline = pept.Pipeline([
    Stack(sample_size = 30 * max_tracers, overlap = 30 * max_tracers - 1),
    HDBSCAN(true_fraction = 0.6, max_tracers = max_tracers),
    SplitLabels() + Centroids(error = True),
    Stack(),
])

# `locations` is the output of the first pass of clustering
locations2 = pipeline.fit(locations)
```
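The effect of `sample_size = 30` with `overlap = 29` can be pictured as a sliding window that advances by one point at a time. The sketch below is a plain-Python illustration under that assumption; `sliding_windows` is a hypothetical helper, not part of `pept`, and it simply drops any trailing incomplete window, which may differ from how the library treats the last sample.

```python
def sliding_windows(points, sample_size, overlap):
    """Yield consecutive samples of `sample_size` items sharing `overlap` items."""
    if not 0 <= overlap < sample_size:
        raise ValueError("overlap must satisfy 0 <= overlap < sample_size")

    step = sample_size - overlap          # how far each new window advances
    for start in range(0, len(points) - sample_size + 1, step):
        yield points[start:start + sample_size]


max_tracers = 1
sample_size = 30 * max_tracers
overlap = 30 * max_tracers - 1            # maximum overlap: windows shift by 1

points = list(range(100))                 # stand-in for 100 tracer locations
windows = list(sliding_windows(points, sample_size, overlap))

print(len(windows))                       # number of overlapping samples
print(windows[0][0], windows[1][0])       # first point of the first two windows
```

Because each window differs from the previous one by a single point, the second pass yields nearly one refined location per input location, which is why the small sample size limits temporal smoothing.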

## PEPT-ML complete recipe

This recipe includes an example ADAC Forte data initialisation, two passes of clustering, trajectory separation, plotting, and saving trajectories as CSV.

```
# Import what we need from the `pept` library
import pept
from pept.tracking import *
from pept.plots import PlotlyGrapher, PlotlyGrapher2D

# Open interactive plots in the web browser
import plotly
plotly.io.renderers.default = "browser"

# Initialise data from file and set sample size and overlap
filepath = "DS1.da01"
max_tracers = 1

lors = pept.scanners.adac_forte(
    filepath,
    sample_size = 200 * max_tracers,
    overlap = 150 * max_tracers,
)

# Select only the first 1000 samples of LoRs for testing; comment out for all
lors = lors[:1000]

# Create PEPT-ML processing pipeline
pipeline = pept.Pipeline([

    # First pass of clustering
    Cutpoints(max_distance = 0.2),
    HDBSCAN(true_fraction = 0.15, max_tracers = max_tracers),
    SplitLabels() + Centroids(error = True),

    # Second pass of clustering
    Stack(sample_size = 30 * max_tracers, overlap = 30 * max_tracers - 1),
    HDBSCAN(true_fraction = 0.6, max_tracers = max_tracers),
    SplitLabels() + Centroids(),

    # Trajectory separation
    Segregate(window = 20 * max_tracers, cut_distance = 10),
    Stack(),
])

# Process all samples in `lors` in parallel, using `max_workers` threads
trajectories = pipeline.fit(lors)

# Save trajectories as CSV
trajectories.to_csv(filepath + ".csv")

# Save as a fast binary; you can load them back with `pept.load("path")`
trajectories.save(filepath + ".pickle")

# Plot trajectories - first a 2D timeseries, then all 3D positions
PlotlyGrapher2D().add_timeseries(trajectories).show()
PlotlyGrapher().add_points(trajectories).show()
```
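As a rough picture of what the `Segregate(window, cut_distance)` step in the recipe above achieves, the sketch below links each time-ordered point to the next `window` points lying within `cut_distance` and labels the resulting connected groups as trajectories. It is a simplified stand-in written with a union-find structure, not the library's algorithm; the function name `segregate`, the `-1` label for discarded points, and the sample coordinates are assumptions made for illustration.

```python
import math


def segregate(points, window, cut_distance, min_trajectory_size=0):
    """Label time-ordered 3D points by trajectory; discarded points get -1."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the group representative (with path halving)
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Link each point to the next `window` points within `cut_distance`
    for i in range(len(points)):
        for j in range(i + 1, min(i + 1 + window, len(points))):
            if math.dist(points[i], points[j]) <= cut_distance:
                parent[find(i)] = find(j)

    # Count how many points ended up in each group
    roots = [find(i) for i in range(len(points))]
    sizes = {}
    for r in roots:
        sizes[r] = sizes.get(r, 0) + 1

    # Relabel groups 0, 1, 2, ... in order of first appearance
    labels, result = {}, []
    for r in roots:
        if sizes[r] < min_trajectory_size:
            result.append(-1)
        else:
            result.append(labels.setdefault(r, len(labels)))
    return result


# Two interleaved tracers plus one isolated spurious point (hypothetical data)
points = [
    (0, 0, 0), (100, 0, 0), (1, 0, 0), (101, 0, 0),
    (2, 0, 0), (102, 0, 0), (500, 500, 500),
]
print(segregate(points, window=3, cut_distance=10, min_trajectory_size=2))
```

A larger `window` lets a trajectory survive longer gaps between consecutive detections of the same tracer, while a smaller `cut_distance` makes it harder for two nearby tracers to be merged into one trajectory.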

## Example of a Complex Processing Pipeline

This is an example of “production code” used for tracking tracers in pipe flow imaging, where particles enter and leave the field of view regularly. This pipeline automatically:

1. Sets an optimum adaptive time window.
2. Runs a first pass of clustering, keeping track of the number of LoRs around the tracers (`cluster_size`) and relative location error (`error`).
3. Removes locations with too few LoRs or large errors.
4. Sets a new optimum adaptive time window for a second pass of clustering.
5. Removes spurious points while the tracer is out of the field of view.
6. Separates out different tracer trajectories, removes the ones with too few points and groups them by trajectory.
7. Computes the tracer velocity at each location on each trajectory.
8. Removes locations at the edges of the detectors.

Each individual step could be an entire program on its own; with the PEPT `Pipeline` architecture, they can be chained in 17 lines of Python code, automatically using all processors available on parallelisable sections.

```
import pept
from pept.tracking import *

# Create PEPT-ML processing pipeline
pipeline = pept.Pipeline([
    OptimizeWindow(200, overlap = 0.5) + Debug(1),

    # First pass of clustering
    Cutpoints(max_distance = 0.2),
    HDBSCAN(true_fraction = 0.15),
    SplitLabels() + Centroids(cluster_size = True, error = True),

    # Remove erroneous points
    Condition("cluster_size > 30, error < 20"),

    # Second pass of clustering
    OptimizeWindow(30, overlap = 0.95) + Debug(1),
    HDBSCAN(true_fraction = 0.6),
    SplitLabels() + Centroids(),

    # Remove sparse points in time
    OutOfViewFilter(200.),

    # Trajectory separation
    Segregate(window = 20, cut_distance = 20, min_trajectory_size = 20),
    Condition("label >= 0"),
    GroupBy("label"),

    # Velocity computation
    Velocity(11),
    Velocity(11, absolute = True),

    # Cutoff points outside this region
    Condition("y > 100, y < 500"),

    Stack(),
])
```
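The `Condition("cluster_size > 30, error < 20")` step can be read as a row filter over the first-pass locations. The sketch below mimics that reading in plain Python, assuming the comma-separated terms are combined with a logical AND; the `locations` records and their values are made up for illustration, and the real filter operates on `pept` data structures rather than on dictionaries.

```python
# Hypothetical first-pass locations; the field names mirror the columns
# requested with Centroids(cluster_size = True, error = True)
locations = [
    {"t": 0.0, "x": 251.3, "cluster_size": 48, "error": 4.1},
    {"t": 1.0, "x": 252.0, "cluster_size": 12, "error": 3.9},   # too few LoRs
    {"t": 2.0, "x": 252.8, "cluster_size": 51, "error": 35.0},  # error too large
    {"t": 3.0, "x": 253.5, "cluster_size": 44, "error": 6.7},
]

# One predicate per comma-separated term in "cluster_size > 30, error < 20"
conditions = [
    lambda row: row["cluster_size"] > 30,
    lambda row: row["error"] < 20,
]

# Keep a location only if every condition holds (assumed AND semantics)
filtered = [row for row in locations if all(c(row) for c in conditions)]

print([row["t"] for row in filtered])  # prints [0.0, 3.0]
```

The same pattern applies to the later `Condition("label >= 0")` and `Condition("y > 100, y < 500")` steps, which discard points by trajectory label and by position respectively.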