The PEPT Library’s Documentation#

A Python library that unifies Positron Emission Particle Tracking (PEPT) research, including tracking, simulation, data analysis and visualisation tools.

Positron Emission Particle Tracking#

PEPT is a technique developed at the University of Birmingham which allows the non-invasive, three-dimensional tracking of one or more ‘tracer’ particles through particulate, fluid or multiphase systems. The technique enables particle or fluid motion to be tracked with sub-millimetre accuracy and sub-millisecond temporal resolution and, due to its use of highly-penetrating 511 keV gamma rays, can be used to probe the internal dynamics of even large, dense, optically opaque systems - making it ideal for industrial as well as scientific applications.

PEPT is performed by radioactively labelling a particle with a positron-emitting radioisotope such as fluorine-18 (18F) or gallium-68 (68Ga), and using the back-to-back gamma rays produced by electron-positron annihilation events in and around the tracer to triangulate its spatial position. Each pair of back-to-back gamma rays detected in coincidence defines a line of response (LoR).

Transforming LoRs into trajectories using `pept`

Transforming gamma rays, or lines of response (left) into individual tracer trajectories (right) using the pept library. Depicted is experimental data of two tracers rotating at 42 RPM, imaged using the University of Birmingham Positron Imaging Centre’s parallel screens PEPT camera.

Tutorials and Documentation#

A very fast-paced introduction to Python is available here (Google Colab tutorial link); it is aimed at engineers whose background might be a few lines written in MATLAB, as well as moderate C/C++ programmers.

A beginner-friendly tutorial for using the pept package is available here (Google Colab link).

The links above point to Google Colaboratory, a Jupyter notebook-hosting website that lets you combine text with Python code, executing it on Google servers. Pretty neat, isn’t it?

Performance#

Significant effort has been put into making the algorithms in this package as fast as possible. Most computationally intensive code has been implemented in Cython, C or C++ and allows policy-based parallel execution, either on shared-memory machines using joblib / ThreadPoolExecutor, or on distributed computing clusters using mpi4py.futures.MPIPoolExecutor.

Getting Started#

These instructions will help you get started with PEPT data analysis.

Prerequisites#

This package supports Python 3.6 and above - it is built and tested for Python 3.6, 3.7 and 3.8 on Windows, Linux and macOS (thanks to conda-forge, which is awesome!).

You can install it using the batteries-included Anaconda distribution or the bare-bones Python interpreter. You can also check out our Python and pept tutorials.

Installation#

The easiest and quickest installation, if you are using Anaconda:

conda install -c conda-forge pept

You can also install the latest release version of pept from PyPI:

pip install --upgrade pept

Or you can install the development version from the GitHub repository:

pip install -U git+https://github.com/uob-positron-imaging-centre/pept

Tutorials#

The main purpose of the PEPT library is to provide a common, consistent foundation for PEPT-related algorithms, including tracer tracking, visualisation and post-processing tools - such that they can be used interchangeably, mixed and matched for any PEPT camera and system. Virtually all PEPT processing routines follow these steps:

  1. Convert raw gamma camera / scanner data into 3D lines (i.e. the captured gamma rays, or lines of response - LoRs).

  2. Take a sample of lines, locate tracer locations, then repeat for the next samples.

  3. Separate out individual tracer trajectories.

  4. Visualise and post-process trajectories.

For these algorithm-agnostic steps, PEPT provides four base data structures upon which the rest of the library is built:

  1. pept.LineData: general 3D line samples, formatted as [time, x1, y1, z1, x2, y2, z2, extra…].

  2. pept.PointData: general 3D point samples, formatted as [time, x, y, z, extra…].

  3. pept.Pixels: single 2D pixellised space with physical dimensions, including fast line traversal.

  4. pept.Voxels: single 3D voxellised space with physical dimensions, including fast line traversal.

For example, once you convert your PEPT data - from any scanner - into pept.LineData, all the algorithms in this library can be used.

All the data structures above are built on top of NumPy and integrate natively with the rest of the Python / SciPy ecosystem. The rest of the PEPT library is organised into submodules:

  1. pept.scanners: converters between native scanner data and the base data structures.

  2. pept.tracking: radioactive tracer tracking algorithms, e.g. the Birmingham method, PEPT-ML, FPI.

  3. pept.plots: PEPT data visualisation subroutines.

  4. pept.utilities: general-purpose helpers, e.g. read_csv, traverse3d.

  5. pept.processing: PEPT-oriented post-processing algorithms, e.g. VectorField3D.


If you are new to the PEPT library, we recommend going through this interactive online notebook, which introduces all the fundamental concepts of the library.

Once you get the idea of LineData samples, Pipeline and PlotlyGrapher, you can use these copy-pastable tutorials to build PEPT data analysis pipelines tailored to your specific systems.

Absolute Basics#

The main purpose of the pept library is to provide a common, consistent foundation for PEPT-related algorithms, including tracer tracking, visualisation and post-processing tools - such that they can be used interchangeably, mixed and matched for different systems. Virtually any PEPT processing routine follows these steps:

  1. Convert raw gamma camera / scanner data into 3D lines (i.e. the captured gamma rays, or lines of response - LoRs).

  2. Take a sample of lines, locate tracer locations, then repeat for the next samples.

  3. Separate out individual tracer trajectories.

  4. Visualise and post-process trajectories.

For these algorithm-agnostic steps, pept provides four base data structures upon which the rest of the library is built:

  1. pept.LineData: general 3D line samples, formatted as [time, x1, y1, z1, x2, y2, z2, extra…].

  2. pept.PointData: general 3D point samples, formatted as [time, x, y, z, extra…].

  3. pept.Pixels: single 2D pixellised space with physical dimensions, including fast line traversal.

  4. pept.Voxels: single 3D voxellised space with physical dimensions, including fast line traversal.

All the data structures above are built on top of NumPy and integrate natively with the rest of the Python / SciPy ecosystem. The rest of the pept library is organised into submodules:

  • pept.scanners: converters between native scanner data and the base classes.

  • pept.tracking: radioactive tracer tracking algorithms, e.g. the Birmingham method, PEPT-ML, FPI.

  • pept.plots: PEPT data visualisation subroutines.

  • pept.utilities: general-purpose helpers, e.g. read_csv, traverse3d.

  • pept.processing: PEPT-oriented post-processing algorithms, e.g. occupancy2d.
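
To make the four steps above concrete, here is a minimal end-to-end sketch (assuming an ADAC Forte binary file named "data.da01"; the individual filters are tuned in the recipes further down this documentation):

import pept
from pept.tracking import *
from pept.plots import PlotlyGrapher

# 1. Convert raw scanner data into 3D lines of response
lors = pept.scanners.adac_forte("data.da01", sample_size = 200)

# 2. Locate tracer positions sample by sample (here PEPT-ML), then
# 3. Separate out individual trajectories
pipeline = pept.Pipeline([
    Cutpoints(max_distance = 0.5),
    HDBSCAN(true_fraction = 0.15),
    SplitLabels() + Centroids(),
    Segregate(window = 20, cut_distance = 10),
    Stack(),
])

trajectories = pipeline.fit(lors)

# 4. Visualise and post-process the trajectories
PlotlyGrapher().add_points(trajectories).show()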

pept.LineData#

Generally, PEPT Lines of Response (LoRs) are lines in 3D space, each defined by two points, regardless of the geometry of the scanner used. This class is used to wrap LoRs (or any lines!), efficiently yielding samples of lines of an adaptive sample_size and overlap.

It is an abstraction over PET / PEPT scanner geometries and data formats, as once the raw LoRs (be they stored as binary, ASCII, etc.) are transformed into the common LineData format, any tracking, analysis or visualisation algorithm in the pept package can be used interchangeably. Moreover, it provides a stable, user-friendly interface for iterating over LoRs in samples - this is useful for tracking algorithms, as they generally take a few LoRs (a sample), produce a tracer position, then move to the next sample of LoRs, repeating the procedure. Using overlapping samples is also useful for improving the tracking rate of the algorithms.

Here are some basic examples of creating and using LineData samples - you’re very much invited to copy and run them!

Initialise a LineData instance containing 10 lines with a sample_size of 3.

>>> import pept
>>> import numpy as np
>>> lines_raw = np.arange(70).reshape(10, 7)
>>> print(lines_raw)
[[ 0  1  2  3  4  5  6]
 [ 7  8  9 10 11 12 13]
 [14 15 16 17 18 19 20]
 [21 22 23 24 25 26 27]
 [28 29 30 31 32 33 34]
 [35 36 37 38 39 40 41]
 [42 43 44 45 46 47 48]
 [49 50 51 52 53 54 55]
 [56 57 58 59 60 61 62]
 [63 64 65 66 67 68 69]]

>>> line_data = pept.LineData(lines_raw, sample_size = 3)
>>> line_data
pept.LineData (samples: 3)
--------------------------
sample_size = 3
overlap = 0
lines =
  (rows: 10, columns: 7)
  [[ 0.  1. ...  5.  6.]
   [ 7.  8. ... 12. 13.]
   ...
   [56. 57. ... 61. 62.]
   [63. 64. ... 68. 69.]]
columns = ['t', 'x1', 'y1', 'z1', 'x2', 'y2', 'z2']
attrs = {}

Access samples using subscript notation. Notice how the samples are consecutive, as overlap is 0 by default.

>>> line_data[0]
pept.LineData (samples: 1)
--------------------------
sample_size = 3
overlap = 0
lines =
  (rows: 3, columns: 7)
  [[ 0.  1. ...  5.  6.]
   [ 7.  8. ... 12. 13.]
   [14. 15. ... 19. 20.]]
columns = ['t', 'x1', 'y1', 'z1', 'x2', 'y2', 'z2']
attrs = {}

>>> line_data[1]
pept.LineData (samples: 1)
--------------------------
sample_size = 3
overlap = 0
lines =
  (rows: 3, columns: 7)
  [[21. 22. ... 26. 27.]
   [28. 29. ... 33. 34.]
   [35. 36. ... 40. 41.]]
columns = ['t', 'x1', 'y1', 'z1', 'x2', 'y2', 'z2']
attrs = {}

Now set an overlap of 2; notice how the number of samples changes:

>>> len(line_data)     # Number of samples
3

>>> line_data.overlap = 2
>>> len(line_data)
8
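
Samples can also be iterated over directly - this is how tracking algorithms typically consume LineData. Each sample is itself a pept.LineData, whose underlying NumPy array is accessible as .lines (a minimal sketch):

# Iterate over the 8 overlapping samples defined above
for sample in line_data:
    print(sample.lines.shape)       # Each sample contains 3 lines: (3, 7)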

Saving / Loading Data#

All PEPT objects can be saved in an efficient binary format using pept.save and pept.load:

import pept
import numpy as np

# Create some dummy data
lines_raw = np.arange(70).reshape((10, 7))
lines = pept.LineData(lines_raw)

# Save data
pept.save("data.pickle", lines)

# Load data
lines_loaded = pept.load("data.pickle")

The binary approach has the advantage of preserving all your metadata saved in the object instances - e.g. columns, sample_size - allowing the full state to be reloaded.

Matrix-like data like pept.LineData and pept.PointData can also be saved in a slower, but human-readable CSV format using their .to_csv method; such tabular data can then be reinitialised using pept.read_csv:

# Save data in CSV format
lines.to_csv("data.csv")

# Load data back - *this will be a simple NumPy array!*
lines_raw = pept.read_csv("data.csv")

# Need to put the array back into a `pept.LineData`
lines = pept.LineData(lines_raw)

Plotting#

Interactive 3D Plots#

The easiest method of plotting 3D PEPT-like data is using the pept.plots.PlotlyGrapher interactive grapher:

# Plotting some example 3D lines
import pept
from pept.plots import PlotlyGrapher
import numpy as np

lines_raw = np.arange(70).reshape((10, 7))
lines = pept.LineData(lines_raw)

PlotlyGrapher().add_lines(lines).show()

# Plotting some example 3D points
import pept
from pept.plots import PlotlyGrapher
import numpy as np

points_raw = np.arange(40).reshape((10, 4))
points = pept.PointData(points_raw)

PlotlyGrapher().add_points(points).show()

The PlotlyGrapher object allows straightforward subplots creation:

# Plot the example 3D lines and points on separate subplots
grapher = PlotlyGrapher(cols = 2)

grapher.add_lines(lines)                        # col = 1 by default
grapher.add_points(points, col = 2)

grapher.show()

# Plot the example 3D lines and points on separate subplots
grapher = PlotlyGrapher(rows = 2, cols = 2)

grapher.add_lines(lines, col = 2)               # row = 1 by default
grapher.add_points(points, row = 2, col = 2)

grapher.show()

Adding Colourbars#

By default, the last column of a dataset is used to colour-code the resulting points:

from pept.plots import PlotlyGrapher
PlotlyGrapher().add_points(point_data).show()   # Colour-codes by the last column

You can change the column used to colour-code points using a numeric index (e.g. first column colorbar_col = 0, second to last column colorbar_col = -2) or named column (e.g. colorbar_col = "error"):

PlotlyGrapher().add_points(point_data, colorbar_col = -2).show()
PlotlyGrapher().add_points(point_data, colorbar_col = "label").show()   # Coloured by trajectory
PlotlyGrapher().add_points(point_data, colorbar_col = "v").show()       # Coloured by velocity

As a PlotlyGrapher will often manage multiple subplots, one shouldn’t include explicit colourbars on the sides for each dataset plotted. Therefore, colourbars are hidden by default; add a colourbar by setting its title:

PlotlyGrapher().add_points(points, colorbar_title = "Velocity").show()

Histogram of Tracking Errors#

The Centroids(error = True) filter appends a column “error” representing the relative error in the tracked position. You can select a named column via indexing, e.g. trajectories["error"]; you can then plot a histogram of the relative errors with:

import plotly.express as px
px.histogram(trajectories["error"]).show()          # Large values are noise
px.histogram(trajectories["cluster_size"]).show()   # Small values are noise

It is often useful to remove points with an error higher than a certain value, e.g. 20 mm:

trajectories = Condition("error < 20").fit(trajectories)

# Or simply append the `Condition` to the `pept.Pipeline`
pipeline = pept.Pipeline([
    ...
    Condition("cluster_size > 30, error < 20"),
    ...
])

Exporting Plotly Graphs as Images#

The standard output of the Plotly grapher is an interactive HTML webpage; however, this can lead to large file sizes or memory overflows. Plotly allows for graphs to be exported as images to alleviate some of these issues.

Ensure you have imported:

import plotly.express as px
import kaleido
import plotly.io as pio

There are two main ways of exporting as images:

# Save the inner plotly.Figure attribute of a `grapher`
# Format can be changed to other image formats
# Width and height can be adjusted to give the desired image size
grapher.fig.write_image("figure.png", width=2560, height=1440)
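
Another common route, which should be equivalent, goes through plotly.io directly (a small sketch; kaleido must be installed for static image export):

# Save the same figure via `plotly.io`; `scale` multiplies the resolution
pio.write_image(grapher.fig, "figure.png", width=2560, height=1440, scale=1)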

Modifying the Underlying Figure#

You can access the Plotly figure wrapped and managed by a PlotlyGrapher using the .fig attribute:

grapher.fig.update_layout(xaxis_title = "Pipe Length (mm)")

Initialising PEPT Scanner Data#

The pept.scanners submodule contains converters between scanner specific data formats (e.g. parallel screens / ASCII, modular camera / binary) and the pept base classes, allowing simple initialisation of pept.LineData from different sources.

ADAC Forte#

The parallel screens detector used at Birmingham can output binary list-mode data, which can be converted using pept.scanners.adac_forte(binary_file):

import pept

lines = pept.scanners.adac_forte("binary_file.da01")

If you have multiple files from the same experiment, e.g. “data.da01”, “data.da02”, etc., you can stitch them all together using a glob, “data.da*”:

import pept

# Multiple files starting with `binary_file.da`
lines = pept.scanners.adac_forte("binary_file.da*")

Parallel Screens#

If you have your data as a CSV containing 5 columns [t, x1, y1, x2, y2] representing the coordinates of the two points defining an LoR on two parallel screens, you can use pept.scanners.parallel_screens to insert the missing coordinates and get the LoRs into the general LineData format [t, x1, y1, z1, x2, y2, z2]:

import pept

screen_separation = 500
lines = pept.scanners.parallel_screens(csv_or_array, screen_separation)

Modular Camera#

Your modular camera data can be initialised using pept.scanners.modular_camera:

import pept

lines = pept.scanners.modular_camera(filepath)

Adaptive Sampling#

Perhaps the most important decision a PEPT user must make is how the LoRs are divided into samples. The two most common approaches are:

Fixed sample size: a constant number of elements per sample, with potential overlap between samples.

  • Advantages: effectively adapts spatio-temporal resolution, with higher accuracy in more active PEPT scanner regions.

  • Disadvantages: when a tracer exits the field of view, the last LoRs before it leaves are joined, in the same sample, with the first LoRs after it re-enters the scanner.

Fixed time window: a constant time interval in which LoRs are aggregated, with potential overlap.

  • Advantages: robust to tracers moving out of the field of view.

  • Disadvantages: non-adaptive temporal resolution.
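
Both classic approaches are configured directly on the data objects; here is a minimal sketch (pept.TimeWindow is the fixed time window counterpart of a constant integer sample_size):

import pept

# Fixed sample size: 200 LoRs per sample, overlapping by 100 LoRs
lors = pept.LineData(..., sample_size = 200, overlap = 100)

# Fixed time window: aggregate LoRs into 50 ms slices
lors = pept.LineData(..., sample_size = pept.TimeWindow(50.))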

The two approaches can be combined into a single pept.AdaptiveWindow, which works as a fixed time window, except when more LoRs are encountered than a given limit, in which case the time window is shrunk - hence adapting the time window depending on how many LoRs are intercepted in a given window.

import pept

# A time window of 5 ms shrinking when encountering more than 200 LoRs
lors = pept.LineData(..., sample_size = pept.AdaptiveWindow(5.0, 200))

# A time window of 12 ms with the number of LoRs capped at 200 and an overlap of 6 ms
lors = pept.scanners.adac_forte(
    ...,
    sample_size = pept.AdaptiveWindow(12., 200),
    overlap = pept.AdaptiveWindow(6.),
)

Moreover, if an ideal number of LoRs per sample is chosen, there exists an optimum time window for which most samples will contain roughly that number of LoRs, except when the tracer is out of the field of view or static. This window can be found automatically using pept.tracking.OptimizeWindow:

import pept
import pept.tracking as pt

# Find an adaptive time window that is ideal for about 200 LoRs per sample
lors = pept.LineData(...)
lors = pt.OptimizeWindow(ideal_elems = 200).fit(lors)

OptimizeWindow can be used at the start of a pipeline; an optional overlap parameter can be used to define the overlap as a ratio of the ideal time window found. For example, if the ideal time window found is 100 ms, an overlap of 0.5 will result in an overlapping time interval of 50 ms:

import pept
from pept.tracking import *

pipeline = pept.Pipeline([
    OptimizeWindow(200),
    BirminghamMethod(fopt = 0.5),
    Stack(),
])

locations = pipeline.fit(lors)

The Birmingham Method#

The Birmingham Method is an efficient, analytical technique for tracking tracers using the LoRs from PEPT data.

If you are using it in your research, you are kindly asked to cite the following paper:

Parker DJ, Broadbent CJ, Fowles P, Hawkesworth MR, McNeil P. Positron emission particle tracking-a technique for studying flow within engineering equipment. Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment. 1993 Mar 10;326(3):592-607.

Birmingham Method recipe#

import pept
from pept.tracking import *

pipeline = pept.Pipeline([
    BirminghamMethod(fopt = 0.5),
    Stack(),
])

locations = pipeline.fit(lors)

Recipe with Trajectory Separation#

import pept
from pept.tracking import *

pipeline = pept.Pipeline([
    BirminghamMethod(fopt = 0.5),
    Segregate(window = 20, cut_distance = 10),
    Stack(),
])

locations = pipeline.fit(lors)

PEPT-ML#

PEPT using Machine Learning is a modern clustering-based tracking method that was developed specifically for noisy, fast applications.

If you are using PEPT-ML in your research, you are kindly asked to cite the following paper:

Nicuşan AL, Windows-Yule CR. Positron emission particle tracking using machine learning. Review of Scientific Instruments. 2020 Jan 1;91(1):013329.

PEPT-ML one pass of clustering recipe#

The LoRs are first converted into Cutpoints, which are then assigned cluster labels using HDBSCAN; the cutpoints are then grouped into clusters using SplitLabels and the clusters’ Centroids are taken as the particle locations. Finally, all centroids are stacked into a single PointData.

import pept
from pept.tracking import *

max_tracers = 1

pipeline = pept.Pipeline([
    Cutpoints(max_distance = 0.5),
    HDBSCAN(true_fraction = 0.15, max_tracers = max_tracers),
    SplitLabels() + Centroids(error = True),
    Stack(),
])

locations = pipeline.fit(lors)

PEPT-ML second pass of clustering recipe#

The particle locations will always have a bit of scatter to them; we can tighten those points into accurate, dense trajectories using a second pass of clustering.

Set a very small sample size and maximum overlap to minimise temporal smoothing effects, then recluster the tracer locations, split according to cluster label, compute centroids, and stack into a final PointData.

import pept
from pept.tracking import *

max_tracers = 1

pipeline = pept.Pipeline([
    Stack(sample_size = 30 * max_tracers, overlap = 30 * max_tracers - 1),
    HDBSCAN(true_fraction = 0.6, max_tracers = max_tracers),
    SplitLabels() + Centroids(error = True),
    Stack(),
])

locations2 = pipeline.fit(lors)

PEPT-ML complete recipe#

A complete example, including ADAC Forte data initialisation, two passes of clustering, trajectory separation, plotting, and saving the trajectories as CSV.

# Import what we need from the `pept` library
import pept
from pept.tracking import *
from pept.plots import PlotlyGrapher, PlotlyGrapher2D


# Open interactive plots in the web browser
import plotly
plotly.io.renderers.default = "browser"


# Initialise data from file and set sample size and overlap
filepath = "DS1.da01"
max_tracers = 1

lors = pept.scanners.adac_forte(
    filepath,
    sample_size = 200 * max_tracers,
    overlap = 150 * max_tracers,
)


# Select only the first 1000 samples of LoRs for testing; comment out for all
lors = lors[:1000]


# Create PEPT-ML processing pipeline
pipeline = pept.Pipeline([

    # First pass of clustering
    Cutpoints(max_distance = 0.2),
    HDBSCAN(true_fraction = 0.15, max_tracers = max_tracers),
    SplitLabels() + Centroids(error = True),

    # Second pass of clustering
    Stack(sample_size = 30 * max_tracers, overlap = 30 * max_tracers - 1),
    HDBSCAN(true_fraction = 0.6, max_tracers = max_tracers),
    SplitLabels() + Centroids(),

    # Trajectory separation
    Segregate(window = 20 * max_tracers, cut_distance = 10),
    Stack(),
])


# Process all samples in `lors` in parallel, using `max_workers` threads
trajectories = pipeline.fit(lors)


# Save trajectories as CSV
trajectories.to_csv(filepath + ".csv")

# Save as a fast binary; you can load them back with `pept.load("path")`
trajectories.save(filepath + ".pickle")


# Plot trajectories - first a 2D timeseries, then all 3D positions
PlotlyGrapher2D().add_timeseries(trajectories).show()
PlotlyGrapher().add_points(trajectories).show()

Example of a Complex Processing Pipeline#

This is an example of “production code” used for tracking tracers in pipe flow imaging, where particles enter and leave the field of view regularly. This pipeline automatically:

  • Sets an optimum adaptive time window.

  • Runs a first pass of clustering, keeping track of the number of LoRs around the tracers (cluster_size) and relative location error (error).

  • Removes locations with too few LoRs or large errors.

  • Sets a new optimum adaptive time window for a second pass of clustering.

  • Removes spurious points while the tracer is out of the field of view.

  • Separates out different tracer trajectories, removes the ones with too few points and groups them by trajectory.

  • Computes the tracer velocity at each location on each trajectory.

  • Removes locations at the edges of the detectors.

Each individual step could be an entire program on its own; with the PEPT Pipeline architecture, they can be chained in 17 lines of Python code, automatically using all processors available on parallelisable sections.

# Create PEPT-ML processing pipeline
pipeline = pept.Pipeline([
    OptimizeWindow(200, overlap = 0.5) + Debug(1),

    # First pass of clustering
    Cutpoints(max_distance = 0.2),
    HDBSCAN(true_fraction = 0.15),
    SplitLabels() + Centroids(cluster_size = True, error = True),

    # Remove erroneous points
    Condition("cluster_size > 30, error < 20"),

    # Second pass of clustering
    OptimizeWindow(30, overlap = 0.95) + Debug(1),
    HDBSCAN(true_fraction = 0.6),
    SplitLabels() + Centroids(),

    # Remove sparse points in time
    OutOfViewFilter(200.),

    # Trajectory separation
    Segregate(window = 20, cut_distance = 20, min_trajectory_size = 20),
    Condition("label >= 0"),
    GroupBy("label"),

    # Velocity computation
    Velocity(11),
    Velocity(11, absolute = True),

    # Cutoff points outside this region
    Condition("y > 100, y < 500"),

    Stack(),
])

Feature Point Identification#

FPI is a modern voxel-based tracer-location algorithm that can reliably work with unknown numbers of tracers in fast and noisy environments.

It was successfully used to track fast-moving radioactive tracers in pipe flows at Virginia Commonwealth University. If you use this algorithm in your work, please cite the following paper:

Wiggins C, Santos R, Ruggles A. A feature point identification method for positron emission particle tracking with multiple tracers. Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment. 2017 Jan 21; 843:22-8.

FPI Recipe#

As FPI works on voxelized representations of the LoRs, the Voxelize filter is first used before FPI itself:

import pept
from pept.tracking import *

resolution = (100, 100, 100)

pipeline = pept.Pipeline([
    Voxelize(resolution),
    FPI(w = 3, r = 0.4),
    Stack(),
])

locations = pipeline.fit(lors)

Tracking Errors#

When processing more difficult datasets - scattering environments, low tracer activities, etc. - it is often useful to use some tracer statistics to remove erroneous locations.

Most PEPT algorithms will include some measure of the tracer location errors, for example:

  • The Centroids(error = True) filter appends a column “error” representing the standard deviation of the distances from the computed centroid to the constituent points. For a 500 mm scanner, a spread in a tracer location of 100 mm is clearly an erroneous point.

  • The Centroids(cluster_size = True) filter appends a column “cluster_size” representing the number of points used to compute the centroid. If a sample of 200 LoRs yields a tracer location computed from 5 points, it is clearly noise.

  • The BirminghamMethod filter includes a column “error” representing the standard deviation of the distances from the tracer position to the constituent LoRs.
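
For example, both point statistics can be requested at once from the Centroids filter inside a PEPT-ML pipeline (a short sketch mirroring the recipes above):

import pept
from pept.tracking import *

pipeline = pept.Pipeline([
    Cutpoints(max_distance = 0.5),
    HDBSCAN(true_fraction = 0.15),
    # Append both "error" and "cluster_size" columns to each centroid
    SplitLabels() + Centroids(error = True, cluster_size = True),
    Stack(),
])

locations = pipeline.fit(lors)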

Histogram of Tracking Errors#

You can select a named column via string indexing, e.g. trajectories["error"]; you can then plot a histogram of the relative errors with:

import plotly.express as px
px.histogram(trajectories["error"]).show()          # Large values are noise
px.histogram(trajectories["cluster_size"]).show()   # Small values are noise

It is often useful to remove points with an error higher than a certain value, e.g. 20 mm:

trajectories = Condition("error < 20").fit(trajectories)

# Or simply append the `Condition` to the `pept.Pipeline`
pipeline = pept.Pipeline([
    ...
    Condition("cluster_size > 30, error < 20"),
    ...
])

Trajectory Separation#

Segregate Points#

We can separate out trajectory segments / points that are spatio-temporally far apart in order to:

  1. Remove spurious, noisy points.

  2. Separate out continuous trajectory segments.

The spatio-temporal metric differentiates between points that may occupy the same location at different times; this is achieved by only allowing points within a sliding window to be connected.

The pept.tracking.Segregate algorithm works by creating a Minimum Spanning Tree (MST, or minimum distance path) connecting all points in a dataset, then cutting all paths longer than a cut_distance. All distinct segments are assigned a trajectory 'label' (integer starting from 0); trajectories with fewer than min_trajectory_size points are considered noise (label -1).

from pept.tracking import *

trajectories = Segregate(window = 20, cut_distance = 10.).fit(trajectories)

Consider all trajectories with fewer than 50 points to be noise:

segr = Segregate(
    window = 20,
    cut_distance = 10.,
    min_trajectory_size = 50,
)

trajectories = segr.fit(trajectories)

This step adds a new column “label”. We can group each individual trajectory into a list with GroupBy:

traj_list = GroupBy("label").fit(trajectories)
traj_list[0]    # First trajectory

[New in pept-0.5.2] Only connect points within a given time interval; in other words, points whose timestamps are further apart than max_time_interval are split into different trajectories:

segr = Segregate(
    window = 20,
    cut_distance = 10.,
    min_trajectory_size = 50,
    max_time_interval = 2000,       # Disconnect tracer with >2s gap
)

trajectories = segr.fit(trajectories)

Filtering Data#

There are many filters in pept.tracking; check out the Manual at the top of the page for a complete list. Here are examples with the most important ones.

Remove#

Simply remove a column:

from pept.tracking import *

trajectories = Remove("label").fit(trajectories)

Or multiple columns:

trajectories = Remove("label", "error").fit(trajectories)

Condition#

One of the most important filters, selecting only data that satisfies a condition:

from pept.tracking import *

trajectories = Condition("error < 15").fit(trajectories)

Or multiple ones:

trajectories = Condition("error < 15, label >= 0").fit(trajectories)

In the simplest case, you just use the column name as the first argument followed by a comparison. If the column name is not the first argument, you must use single quotes:

trajectories = Condition("0 <= 'label'").fit(trajectories)

You can also use filtering functions from NumPy in the condition string (i.e. anything returning a boolean mask):

# Remove all NaNs and Infs from the 'x' column
trajectories = Condition("np.isfinite('x')")

Finally, you can supply your own function receiving a NumPy array of the data and returning a boolean mask:

def last_column_filter(data):
    return data[:, -1] > 10

trajectories = Condition(last_column_filter).fit(trajectories)

Or using inline functions (i.e. lambda):

# Select points within a vertical cylinder with radius 10
trajectories = Condition(lambda x: x[:, 1]**2 + x[:, 3]**2 < 10**2).fit(trajectories)

SamplesCondition#

While Condition is applied on individual points, we can also filter entire samples - for example, select only trajectories with more than 30 points:

import pept
import pept.tracking as pt

long_trajectories_filter = pept.Pipeline([
    # Segregate points - appends "label" column
    pt.Segregate(window = 20, cut_distance = 10),

    # Group points into samples; e.g. sample 1 contains all points with label 1
    pt.GroupBy("label"),

    # Now each sample is an entire trajectory which we can filter
    pt.SamplesCondition("sample_size > 30"),

    # And stack all remaining samples back into a single PointData
    pt.Stack(),
])

long_trajectories = long_trajectories_filter.fit(trajectories)

The condition can be based on the sample itself, e.g. keep only samples that lie completely beyond x=0:

# Keep only samples for which all points' X coordinates are bigger than 0
Condition("np.all(sample['x'] > 0)")

GroupBy#

Stack all samples (i.e. LineData or PointData) and split them into a list according to a named / numeric column index:

from pept.tracking import *

group_list = GroupBy("label").fit(trajectories)

RemoveStatic#

Remove tracer locations where the tracer spends more than time_window without moving more than max_distance:

from pept.tracking import *

# Remove positions that spent more than 2 seconds without moving more than 20 mm
nonstatic = RemoveStatic(time_window = 2000, max_distance = 20).fit(trajectories)

Extracting Velocities#

When extracting post-processed data from tracer trajectories for e.g. probability distributions, it is often important to sample data at fixed timesteps. As PEPT is natively a Lagrangian technique where tracers can be tracked more often in more sensitive areas of the gamma scanners, we have to convert those “randomly-sampled” positions into regular timesteps using Interpolate.

First, Segregate points into individual, continuous trajectory segments, GroupBy according to each trajectory’s label, then Interpolate into regular timesteps, then compute each point’s Velocity (dimension-wise or absolute) and finally Stack them back into a PointData:

import pept
from pept.tracking import *

pipe_vel = pept.Pipeline([
    Segregate(window = 20, cut_distance = 10.),
    GroupBy("label"),
    Interpolate(timestep = 5.),
    Velocity(window = 7),
    Stack(),
])

trajectories = pipe_vel.fit(trajectories)

The Velocity step appends columns ["vx", "vy", "vz"] (default) or ["v"] (if absolute = True). You can add both if you wish:

import pept
from pept.tracking import *

pept.Pipeline([
    Segregate(window = 20, cut_distance = 10.),
    GroupBy("label"),
    Interpolate(timestep = 5.),
    Velocity(window = 7),                       # Appends vx, vy, vz
    Velocity(window = 7, absolute = True),      # Appends v
    Stack(),
])

Interpolating Timesteps#

When extracting post-processed data from tracer trajectories for e.g. probability distributions, it is often important to sample data at fixed timesteps. As PEPT is natively a Lagrangian technique where tracers can be tracked more often in more sensitive areas of the gamma scanners, we have to convert those “randomly-sampled” positions into regular timesteps using Interpolate.

First, Segregate points into individual, continuous trajectory segments, GroupBy according to each trajectory’s label, then Interpolate into regular timesteps and finally Stack them back into a PointData:

import pept
from pept.tracking import *

pipe = pept.Pipeline([
    Segregate(window = 20, cut_distance = 10.),
    GroupBy("label"),
    Interpolate(timestep = 5.),
    Stack(),
])

trajectories = pipe.fit(trajectories)

Manual#

All public pept subroutines are fully documented here, along with copy-pastable examples. The base functionality is summarised below; the rest of the library is organised into submodules, which you can access on the left. You can also use the Search bar in the top left to go directly to what you need.

We really appreciate all help with writing useful documentation; if you feel something can be improved, or would like to share some example code, by all means get in contact with us - or be a superhero and click Edit this page on the right and submit your changes to the GitHub repository directly!

Base Functions#

pept.read_csv(filepath_or_buffer[, ...])

Read a given number of lines from a file and return a numpy array of the values.

pept.load(filepath)

Load a binary saved / pickled object from filepath.

pept.save(filepath, obj)

Save an object obj instance as a binary file at filepath.

Base Classes#

pept.LineData(lines[, sample_size, overlap, ...])

A class for PEPT LoR data iteration, manipulation and visualisation.

pept.PointData(points[, sample_size, ...])

A class for general PEPT point-like data iteration, manipulation and visualisation.

pept.Pixels(pixels_array, xlim, ylim, **kwargs)

A class managing a 2D pixel space with physical dimensions, including tools for pixel manipulation and visualisation.

pept.Voxels(voxels_array, xlim, ylim, zlim, ...)

A class managing a 3D voxel space with physical dimensions, including tools for voxel manipulation and visualisation.

pept.Pipeline(transformers)

A PEPT processing pipeline, chaining multiple Filter and Reducer for efficient, parallel execution.

Auxiliaries#

pept.TimeWindow(window)

Define a sample_size as a fixed time window / slice.

pept.AdaptiveWindow(window[, max_elems])

Define a sample_size as a time window with a maximum limit of elements.

Base / Abstract Classes (pept.base)#

pept.base.PEPTObject()

Base class for all PEPT-oriented objects.

pept.base.IterableSamples(data[, ...])

A class for iterating through an array (or array-like) in samples with potential overlap.

pept.base.Transformer()

Base class for PEPT filters (transforming a sample into another) and reducers (transforming a list of samples).

pept.base.Filter()

Abstract class from which PEPT filters inherit.

pept.base.Reducer()

Abstract class from which PEPT reducers inherit.

pept.base.PointDataFilter()

An abstract class that defines a filter for samples of pept.PointData.

pept.base.LineDataFilter()

An abstract class that defines a filter for samples of pept.LineData.

pept.base.VoxelsFilter()

An abstract class that defines a filter for samples of pept.Voxels.

Initialising Scanner Data (pept.scanners)#

Convert data from different PET / PEPT scanner geometries and data formats into the common base classes.

The PEPT base classes PointData, LineData, and Voxels are abstractions over the type of data that may be encountered in the context of PEPT (e.g. LoRs are LineData, trajectory points are PointData). Once the raw data is transformed into the common formats, any tracking, analysis or visualisation algorithm in the pept package can be used interchangeably.

The pept.scanners subpackage provides modules for transforming the raw data from different PET / PEPT scanner geometries (parallel screens, modular cameras, etc.) and data formats (binary, ASCII, etc.) into the common base classes.

If you’d like to integrate another scanner geometry or raw data format into this package, you can check out the pept.scanners.parallel_screens function as an example. This usually only involves writing a single function by hand; then all functionality from LineData will be available to your new data format, for free.
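
As a hypothetical illustration (the function and file format below are made up), such a converter only needs to arrange the raw data into [t, x1, y1, z1, x2, y2, z2] columns and wrap it into pept.LineData:

import pept

def my_scanner(filepath, sample_size = 200, overlap = 0):
    # Read the raw data; assume a CSV whose columns are already
    # [time, x1, y1, z1, x2, y2, z2] - adapt this part to your format
    raw = pept.read_csv(filepath)

    # Wrapping the array into LineData gives samples, plotting and
    # pipeline support for free
    return pept.LineData(raw, sample_size = sample_size, overlap = overlap)

lors = my_scanner("experiment.csv", sample_size = 200)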

pept.scanners.adac_forte(filepath[, ...])

Initialise PEPT lines of response (LoRs) from a binary list-mode file output by the ADAC Forte parallel screens detector (common file extension ".da01").

pept.scanners.parallel_screens(...[, ...])

Initialise PEPT LoRs for parallel screens PET/PEPT detectors from an input CSV file or array.

pept.scanners.ADACGeometricEfficiency(separation)

Compute the geometric efficiency of a parallel screens PEPT detector at different 3D coordinates using Antonio Guida's formula.

pept.scanners.modular_camera(data_file[, ...])

Initialise PEPT LoRs from the modular camera DAQ.

Tracking Algorithms (pept.tracking)#

Tracer location, identification and tracking algorithms.

The pept.tracking subpackage hosts different tracking algorithms, working with both the base classes, as well as with generic NumPy arrays.

All algorithms here are either pept.base.Filter or pept.base.Reducer subclasses, implementing the .fit and .fit_sample methods; here is an example using PEPT-ML:

>>> from pept.tracking import *
>>>
>>> cutpoints = Cutpoints(0.5).fit(lines)
>>> clustered = HDBSCAN(0.15).fit(cutpoints)
>>> centres = (SplitLabels() + Centroids() + Stack()).fit(clustered)

Once the processing steps have been tuned (see the Tutorials), you can chain all filters into a pept.Pipeline for efficient, parallel execution:

>>> pipeline = (
>>>     Cutpoints(0.5) +
>>>     HDBSCAN(0.15) +
>>>     SplitLabels() + Centroids() + Stack()
>>> )
>>> centres = pipeline.fit(lines)

If you would like to implement a PEPT algorithm, all you need to do is to subclass a pept.base.Filter and define the method .fit_sample(sample) - and you get parallel execution and pipeline chaining for free!

>>> import pept
>>>
>>> class NewAlgorithm(pept.base.LineDataFilter):
>>>     def __init__(self, setting1, setting2 = None):
>>>         self.setting1 = setting1
>>>         self.setting2 = setting2
>>>
>>>     def fit_sample(self, sample: pept.LineData):
>>>         processed_points = ...
>>>         return pept.PointData(processed_points)

Tracking Optimisation#

pept.tracking.Debug([verbose, max_samples])

Print types and statistics about the objects being processed in a pept.Pipeline.

pept.tracking.OptimizeWindow(ideal_elems[, ...])

Automatically determine optimum adaptive time window to have an ideal number of elements per sample.

General-Purpose Transformers#

pept.tracking.Stack([sample_size, overlap])

Stack iterables - for example a list[pept.LineData] into a single pept.LineData, a list[list] into a flattened list.

pept.tracking.SplitLabels([remove_labels, ...])

Split a sample of data into unique label values, optionally removing noise and extracting _lines attributes.

pept.tracking.SplitAll

alias of GroupBy

pept.tracking.GroupBy(column)

Stack all samples and split them into a list according to a named / numeric column index.

pept.tracking.Centroids([error, ...])

Compute the geometric centroids of a list of samples of points.

pept.tracking.LinesCentroids([remove, ...])

Compute the minimum distance point of some pept.LineData while iteratively removing a fraction of the furthest lines.

pept.tracking.Condition(*conditions)

Select only data satisfying multiple conditions, given as a string, a function or a list thereof.

pept.tracking.SamplesCondition(*conditions)

Select only samples satisfying multiple conditions, given as a string, a function or a list thereof.

pept.tracking.Remove(*columns)

Remove columns (either column names or indices) from pept.LineData or pept.PointData.

pept.tracking.Swap(*swaps[, inplace])

Swap two columns in a LineData or PointData.

Space Transformers#

pept.tracking.Voxelize(number_of_voxels[, ...])

Asynchronously voxelize samples of lines from a pept.LineData.

pept.tracking.Interpolate(timestep[, ...])

Interpolate between data points at a fixed sampling rate; useful for Eulerian fields computation.

pept.tracking.Reorient([dimensions, basis, ...])

Rotate a dataset such that it is oriented according to its principal axes.

pept.tracking.OutOfViewFilter([max_time, k])

Remove tracer locations that are sparse in time - i.e. the k-th nearest detection is later than max_time.

pept.tracking.RemoveStatic(time_window, ...)

Remove parts of a PointData where the tracer remains static.

Tracer Locating Algorithms#

pept.tracking.BirminghamMethod([fopt, get_used])

The Birmingham Method is an efficient, analytical technique for tracking tracers using the LoRs from PEPT data.

pept.tracking.Cutpoints(max_distance[, ...])

Transform LoRs (a pept.LineData instance) into cutpoints (a pept.PointData instance) for clustering, in parallel.

pept.tracking.Minpoints(num_lines, max_distance)

Transform LoRs (a pept.LineData instance) into minpoints (a pept.PointData instance) for clustering, in parallel.

pept.tracking.HDBSCAN(true_fraction[, ...])

Use HDBSCAN to cluster some pept.PointData and append a cluster label to each point.

pept.tracking.FPI([w, r, lld_counts, verbose])

FPI is a modern voxel-based tracer-location algorithm that can reliably work with unknown numbers of tracers in fast and noisy environments.

Trajectory Separation Algorithms#

pept.tracking.Segregate(window, cut_distance)

Segregate the intertwined points from multiple trajectories into individual paths.

pept.tracking.Reconnect(tmax, dmax[, ...])

Best-fit trajectory segment reconstruction based on time, distance and arbitrary tracer signatures.

Time Of Flight Algorithms#

pept.tracking.TimeOfFlight([...])

Compute the positron annihilation locations of each LoR as given by the Time Of Flight (ToF) data of the two LoR timestamps.

pept.tracking.CutpointsToF([max_distance, ...])

Compute cutpoints from all pairs of lines whose Time Of Flight-predicted locations are closer than max_distance.

pept.tracking.GaussianDensity([sigma])

Append weights according to the Gaussian distribution that best fits the samples of points.

Post Processing Algorithms#

pept.tracking.Velocity(window[, degree, ...])

Append the dimension-wise or absolute velocity to samples of points using a 2D fitted polynomial in a rolling window mode.

Post Processing (pept.processing)#

The PEPT-oriented post-processing suite, including occupancy grid, vector velocity fields, etc.

This module contains fast, robust functions that operate on PEPT-like data and integrate with the pept library’s base classes.

Probability / Residence Distributions#

pept.processing.DynamicProbability2D(...[, ...])

Compute the 2D probability distribution of some tracer quantity (e.g. velocity in each cell).

pept.processing.DynamicProbability3D(...[, ...])

Compute the 3D probability distribution of some tracer quantity (e.g. velocity in each cell).

pept.processing.ResidenceDistribution2D(diameter)

Compute the 2D residence distribution of some tracer quantity (e.g. time spent in each cell).

pept.processing.ResidenceDistribution3D(diameter)

Compute the 3D residence distribution of some tracer quantity (e.g. time spent in each cell).

Vector Grids#

pept.processing.VectorField2D(diameter[, ...])

Compute a 2D vector field - effectively two 2D grids computed from two columns, for example X and Y velocities.

pept.processing.VectorGrid2D(xpixels, ypixels)

Object produced by VectorField2D storing 2 grids of pixels, xpixels and ypixels, for example velocity vector fields / quiver plots.

pept.processing.VectorField3D(diameter[, ...])

Compute a 3D vector field - effectively three 3D grids computed from three columns, for example X, Y and Z velocities.

pept.processing.VectorGrid3D(xvoxels, ...)

Object produced by VectorField3D storing 3 grids of voxels xvoxels, yvoxels, zvoxels, for example velocity vector fields / quiver plots.

Mixing Quantification#

pept.processing.LaceyColors(p1, p2[, ax1, ...])

Compute Lacey-like mixing image, with streamlines passing through plane 1 being split into Red and Blue tracers, then evaluated into RGB pixels at a later point in plane 2.

pept.processing.LaceyColorsLinear(directory, ...)

Apply the LaceyColors mixing algorithm at num_divisions equidistant points between p1 and p2, saving images at each step in directory.

pept.processing.RelativeDeviations(p1, p2[, ...])

Compute a Lagrangian mixing measure - the changes in tracer distances to a point P1 as they pass through an "inlet" plane and another point P2 when reaching an "outlet" plane.

pept.processing.RelativeDeviationsLinear(...)

Apply the RelativeDeviations mixing algorithm at num_divisions equidistant points between p1 and p2, saving histogram images at each step in directory.

pept.processing.AutoCorrelation([lag, ...])

Compute autocorrelation of multiple measures (e.g. YZ velocities) as a function of a lagging variable (e.g. time).

pept.processing.SpatialProjections(...[, ...])

Project multiple tracer passes onto a moving 2D plane along a given direction between start and end coordinates, saving each frame in directory.

Visualisation (pept.plots)#

PEPT-oriented visualisation tools.

Visualisation functions and classes for PEPT data, transparently working with both pept base classes and raw NumPy arrays (e.g. PlotlyGrapher.add_lines handles both pept.LineData and (N, 7) NumPy arrays).

The PlotlyGrapher class creates interactive, publication-ready 3D figures with optional subplots which can also be exported to portable HTML files. The PlotlyGrapher2D class is its two-dimensional counterpart, handling e.g. pept.Pixels.

pept.plots.format_fig(fig[, size, font, ...])

Format a Plotly figure to a consistent theme for the Nature Computational Science journal.

pept.plots.histogram(data[, nbins, ...])

Create histogram of data with PEPT-relevant defaults for plotly.express.histogram.

pept.plots.make_video(frames[, output, fps, ...])

Stitch multiple images from frames into a video saved to output.

pept.plots.PlotlyGrapher([rows, cols, xlim, ...])

A class for PEPT data visualisation using Plotly-based 3D graphs.

pept.plots.PlotlyGrapher2D([rows, cols, ...])

A class for PEPT data visualisation using Plotly-based 2D graphs.

pept.utilities#

PEPT-oriented utility functions.

The utility functions include low-level optimised Cython functions (e.g. find_cutpoints) that are of common interest across the pept package, as well as I/O functions, parallel maps and pixel/voxel traversal algorithms.

Even though the functions are grouped in directories (subpackages) and files (modules), unlike the rest of the package, they are all imported into the pept.utilities root, so that their import paths are not too long.

pept.utilities.find_cutpoints(sample_lines, max_distance[, ...])

Compute the cutpoints from a given array of lines.

pept.utilities.find_minpoints(sample_lines, num_lines, max_distance[, ...])

Compute the minimum distance points (MDPs) from all combinations of num_lines lines given in an array of lines sample_lines.

pept.utilities.group_by_column(data_array, ...)

Group the rows in a 2D data_array based on the unique values in a given column_to_separate, returning the groups as a list of numpy arrays.

pept.utilities.number_of_lines(...)

Return the number of lines (or rows) in a file.

pept.utilities.read_csv(filepath_or_buffer)

Read a given number of lines from a file and return a numpy array of the values.

pept.utilities.read_csv_chunks(...[, ...])

Read chunks of data from a file lazily, returning numpy arrays of the values.

pept.utilities.parallel_map_file(func, ...)

Utility for parallelising (read CSV chunk -> process chunk) workflows.

pept.utilities.traverse2d(...)

Fast pixel traversal for 2D lines (or LoRs).

pept.utilities.traverse3d(...)

Fast voxel traversal for 3D lines (or LoRs).

pept.utilities.ChunkReader(...[, skiprows, ...])

Class for fast, on-demand reading / parsing and iteration over chunks of data from CSV files.

pept.simulation#

pept.simulation.Simulator(trajectory, ...[, ...])

Simulate PEPT data.

Contributing#

The pept library is not a one-man project; it is being built, improved and extended continuously (directly or indirectly) by an international team of researchers of diverse backgrounds - including programmers, mathematicians and chemical / mechanical / nuclear engineers. Want to contribute and become a PEPTspert yourself? Great, join the team!

There are multiple ways to help:

  • Open an issue mentioning any improvement you think pept could benefit from.

  • Write a tutorial or share scripts you’ve developed that we can add to the pept documentation to help other people in the future.

  • Share your PEPT-related algorithms - tracking, post-processing, visualisation, anything really! - so everybody can benefit from them.

Want to be a superhero and contribute code directly to the library itself? Grand - fork the project, add your code and submit a pull request (if that sounds like gibberish but you’re an eager programmer, check this article). We are more than happy to work with you on integrating your code into the library and, if helpful, we can schedule a screen-to-screen meeting for a more in-depth discussion about the pept package architecture.

Naturally, anything you contribute to the library will respect your authorship - protected by the strong GPL v3.0 open-source license (see the “Licensing” section below). If you include published work, please add a pointer to your publication in the code documentation.

Licensing#

The pept package is GPL v3.0 licensed. In non-lawyer terms, the key points of this license are:

  • You can view, use, copy and modify this code _freely_.

  • Your modifications must _also_ be licensed with GPL v3.0 or later.

  • If you share your modifications with someone, you have to include the source code as well.

Essentially do whatever you want with the code, but don’t try selling it saying it’s yours :). This is a community-driven project building upon many other wonderful open-source projects (NumPy, Plotly, even Python itself!) without which pept simply would not have been possible. GPL v3.0 is indeed a very strong copyleft license; it was deliberately chosen to maintain the openness and transparency of great software and progress, and respect the researchers pushing PEPT forward. Frankly, open collaboration is way more efficient than closed, for-profit competition.

Citing#

If you used this codebase or any software making use of it in a scientific publication, we ask you to cite the following paper:

Nicuşan AL, Windows-Yule CR. Positron emission particle tracking using machine learning. Review of Scientific Instruments. 2020 Jan 1;91(1):013329. https://doi.org/10.1063/1.5129251

Because pept is a project bringing together the expertise of many people, it hosts multiple algorithms that were developed and published in other papers. Please check the documentation of the pept algorithms you are using in your research and cite the original papers mentioned accordingly.

References#

Papers presenting PEPT algorithms included in this library: [1], [2], [3].

[1] Parker DJ, Broadbent CJ, Fowles P, Hawkesworth MR, McNeil P. Positron emission particle tracking - a technique for studying flow within engineering equipment. Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment. 1993 Mar 10;326(3):592-607.

[2] Nicuşan AL, Windows-Yule CR. Positron emission particle tracking using machine learning. Review of Scientific Instruments. 2020 Jan 1;91(1):013329.

[3] Wiggins C, Santos R, Ruggles A. A feature point identification method for positron emission particle tracking with multiple tracers. Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment. 2017 Jan 21;843:22-8.
