pept.PointData

class pept.PointData(points, sample_size=None, overlap=None, columns=['t', 'x', 'y', 'z'], **kwargs)

Bases: pept.base.iterable_samples.IterableSamples

A class for general PEPT point-like data iteration, manipulation and visualisation.

In the context of positron-based particle tracking, points are defined by a timestamp, 3D coordinates and any other extra information (such as a trajectory label or some tracer signature). This class encapsulates 3D points (tracer locations, cutpoints, etc.) and efficiently yields samples of points with an adaptive sample_size and overlap.

PointData complements LineData: it is an abstraction over point-like data that may be encountered in the context of PEPT (e.g. pre-tracked tracer locations). Once the raw points are transformed into the common PointData format, any tracking, analysis or visualisation algorithm in the pept package can be used interchangeably. It also provides a stable, user-friendly interface for iterating over points in samples. This is useful for tracking algorithms that take a few points (a sample), produce an accurate tracer location, then move on to the next sample and repeat the procedure. Using overlapping samples can also improve the time resolution of such algorithms.

This is the base class for point-like data; subroutines that accept and/or return PointData instances (or subclasses thereof) can be found throughout the pept package. If you’d like to create new algorithms based on them, you can check out the pept.tracking.peptml.cutpoints module as an example; the Cutpoints class receives a LineData instance, transforms the samples of LoRs into cutpoints, then initialises itself as a PointData subclass - thereby inheriting all its methods and attributes.

Raises

ValueError: If overlap >= sample_size. Overlap is required to be smaller than sample_size, unless sample_size is 0. Note that it can also be negative.

See also

pept.LineData: Encapsulate LoRs for ease of iteration and plotting.
pept.read_csv: Fast CSV file reading into numpy arrays.
pept.plots.PlotlyGrapher: Easy, publication-ready plotting of PEPT-oriented data.
pept.tracking.Cutpoints: Compute cutpoints from pept.LineData.

Notes

This class saves points as a C-contiguous numpy array for efficient access in C / Cython functions. The inner data can be mutated, but do not change the number of rows or columns after instantiating the class.

Examples

Initialise a PointData instance containing 10 points with a sample_size of 3.

>>> import numpy as np
>>> import pept
>>> points_raw = np.arange(40).reshape(10, 4)
>>> print(points_raw)
[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]
 [12 13 14 15]
 [16 17 18 19]
 [20 21 22 23]
 [24 25 26 27]
 [28 29 30 31]
 [32 33 34 35]
 [36 37 38 39]]

>>> point_data = pept.PointData(points_raw, sample_size = 3)
>>> point_data
pept.PointData (samples: 3)
---------------------------
sample_size = 3
overlap = 0
points =
  (rows: 10, columns: 4)
  [[ 0.  1.  2.  3.]
   [ 4.  5.  6.  7.]
   ...
   [32. 33. 34. 35.]
   [36. 37. 38. 39.]]
columns = ['t', 'x', 'y', 'z']
attrs = {}

Access samples using subscript notation. Notice how the samples are consecutive, as overlap is 0 by default.

>>> point_data[0]
pept.PointData (samples: 1)
---------------------------
sample_size = 3
overlap = 0
points =
  (rows: 3, columns: 4)
  [[ 0.  1.  2.  3.]
   [ 4.  5.  6.  7.]
   [ 8.  9. 10. 11.]]
columns = ['t', 'x', 'y', 'z']
attrs = {}

>>> point_data[1]
pept.PointData (samples: 1)
---------------------------
sample_size = 3
overlap = 0
points =
  (rows: 3, columns: 4)
  [[12. 13. 14. 15.]
   [16. 17. 18. 19.]
   [20. 21. 22. 23.]]
columns = ['t', 'x', 'y', 'z']
attrs = {}

Now set an overlap of 2; notice how the number of samples changes:

>>> len(point_data)         # Number of samples
3

>>> point_data.overlap = 2
>>> len(point_data)
8

Notice how rows are repeated from one sample to the next when accessing them, because overlap is now 2:

>>> point_data[0]
array([[ 0.,  1.,  2.,  3.],
       [ 4.,  5.,  6.,  7.],
       [ 8.,  9., 10., 11.]])

>>> point_data[1]
array([[ 4.,  5.,  6.,  7.],
       [ 8.,  9., 10., 11.],
       [12., 13., 14., 15.]])

Now change sample_size to 5 and notice again how the number of samples changes:

>>> len(point_data)
8

>>> point_data.sample_size = 5
>>> len(point_data)
2

>>> point_data[0]
array([[ 0.,  1.,  2.,  3.],
       [ 4.,  5.,  6.,  7.],
       [ 8.,  9., 10., 11.],
       [12., 13., 14., 15.],
       [16., 17., 18., 19.]])

>>> point_data[1]
array([[12., 13., 14., 15.],
       [16., 17., 18., 19.],
       [20., 21., 22., 23.],
       [24., 25., 26., 27.],
       [28., 29., 30., 31.]])

Notice how the samples do not cover the whole input points_raw array, as the last lines are omitted - think of the sample_size and overlap. They are still inside the inner points attribute of point_data though:

>>> point_data.points
array([[ 0.,  1.,  2.,  3.],
       [ 4.,  5.,  6.,  7.],
       [ 8.,  9., 10., 11.],
       [12., 13., 14., 15.],
       [16., 17., 18., 19.],
       [20., 21., 22., 23.],
       [24., 25., 26., 27.],
       [28., 29., 30., 31.],
       [32., 33., 34., 35.],
       [36., 37., 38., 39.]])
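The sample counts in the examples above can be reproduced with a few lines of plain Python. The following is a minimal sketch of the windowing arithmetic only, assuming that each sample starts `sample_size - overlap` rows after the previous one and that incomplete trailing samples are dropped; `sample_bounds` is a hypothetical helper written for illustration, not a function of the pept package.

```python
def sample_bounds(n_rows, sample_size, overlap=0):
    """Return the (start, end) row indices of each fixed-size sample.

    Hypothetical helper illustrating the sample_size / overlap arithmetic;
    it is not part of pept.
    """
    if sample_size == 0:
        # A sample_size of 0 yields all the data as one single sample
        return [(0, n_rows)]
    if overlap >= sample_size:
        raise ValueError("overlap must be smaller than sample_size")

    step = sample_size - overlap  # rows advanced between consecutive samples
    bounds = []
    start = 0
    while start + sample_size <= n_rows:  # incomplete trailing samples are dropped
        bounds.append((start, start + sample_size))
        start += step
    return bounds


print(sample_bounds(10, 3))                  # consecutive samples
print(len(sample_bounds(10, 3, overlap=2)))  # samples shifted by one row
print(sample_bounds(10, 5, overlap=2))       # rows 8 and 9 are not covered
```

Under these assumptions a negative overlap makes `step` larger than `sample_size`, so rows are skipped between samples.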

Attributes

points ((N, M) numpy.ndarray): An (N, M >= 4) numpy array that stores the points as time, followed by cartesian (3D) coordinates of the point, followed by any extra information. The data columns are then [time, x, y, z, etc].
sample_size (int, list[int], pept.TimeWindow or None): Defines the number of points in a sample. If it is an integer, a constant number of points is returned per sample. If it is a list of integers, sample i will have length sample_size[i]. If it is a pept.TimeWindow instance, each sample will span a fixed time window. If None, custom sample sizes are returned as per the samples_indices attribute.
overlap (int, pept.TimeWindow or None): Defines the overlapping points between consecutive samples. If int, a constant number of points is used. If pept.TimeWindow, the overlap will be a constant time window across the data timestamps (first column). If None, custom sample sizes are defined as per the samples_indices attribute.
samples_indices ((S, 2) numpy.ndarray): A 2D NumPy array of integers, where row i defines the i-th sample’s start and end row indices, i.e. sample[i] == data[samples_indices[i, 0]:samples_indices[i, 1]]. The sample_size and overlap are simply friendly interfaces to setting the samples_indices.
columns ((M,) list[str]): A list of M strings, one per column of points, containing each column’s name.
attrs (dict[str, Any]): A dictionary of other attributes saved on this class. Attribute names starting with an underscore are considered “hidden”.
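The relation between a list-valued sample_size and samples_indices can be illustrated as follows. This is a sketch under the assumption that samples defined by a list of sizes are laid out back to back without overlap; `indices_from_sizes` is a hypothetical helper, and plain nested lists stand in for the NumPy arrays used by the class.

```python
from itertools import accumulate


def indices_from_sizes(sizes):
    """Turn per-sample sizes into [start, end] row index pairs.

    Hypothetical helper; assumes samples are consecutive and non-overlapping.
    """
    ends = list(accumulate(sizes))   # running total gives each sample's end row
    starts = [0] + ends[:-1]         # each sample starts where the previous ended
    return [[start, end] for start, end in zip(starts, ends)]


# Ten rows of [t, x, y, z], mirroring the points_raw array from the examples
data = [[float(4 * i + j) for j in range(4)] for i in range(10)]

samples_indices = indices_from_sizes([2, 5, 3])
print(samples_indices)

# sample[i] == data[samples_indices[i][0]:samples_indices[i][1]]
samples = [data[start:end] for start, end in samples_indices]
print([len(sample) for sample in samples])
```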

__init__(points, sample_size=None, overlap=None, columns=['t', 'x', 'y', 'z'], **kwargs)

PointData class constructor.

Parameters

points ((N, M) numpy.ndarray): An (N, M >= 4) numpy array that stores points (or any generic 2D set of data). It expects that the first column is time, followed by cartesian (3D) coordinates of points, followed by any extra information the user needs. The data columns are then [time, x, y, z, etc].
sample_size (int, default 0): An int that defines the number of points that should be returned per sample when iterating over points. A sample_size of 0 yields all the data as one single sample.
overlap (int, default 0): An int that defines the overlap between two consecutive samples that are returned when iterating over points. An overlap of 0 means consecutive samples, while an overlap of (sample_size - 1) implies incrementing the samples by one. A negative overlap means skipping values between samples. An error is raised if overlap is larger than or equal to sample_size.
columns (List[str], default ['t', 'x', 'y', 'z']): A list of strings corresponding to the column labels in points.
**kwargs (extra keyword arguments): Any extra attributes to set on the class instance.

Raises

ValueError: If points does not have (N, M) shape, where M >= 4.
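The shape requirement can be pictured with a small check like the one below. It is only a sketch of the rule stated above (a 2D table with at least four columns), written for nested Python lists; `check_points_shape` is a hypothetical name and the error messages are illustrative, not those raised by pept.

```python
def check_points_shape(points):
    """Return (N, M) for a 2D table of points, or raise ValueError.

    Hypothetical helper: requires equal-length rows and M >= 4 columns
    ([time, x, y, z] plus any extra information).
    """
    column_counts = {len(row) for row in points}
    if len(column_counts) != 1:
        raise ValueError("points must be a non-empty 2D table with equal-length rows")

    (n_columns,) = column_counts
    if n_columns < 4:
        raise ValueError(f"points must have at least 4 columns, got {n_columns}")
    return len(points), n_columns


print(check_points_shape([[0.0, 1.0, 2.0, 3.0], [4.0, 5.0, 6.0, 7.0]]))

try:
    check_points_shape([[0.0, 1.0, 2.0]])  # only three columns
except ValueError as error:
    print("rejected:", error)
```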

Methods

`__init__`(points[, sample_size, overlap, columns])	PointData class constructor.
`copy`([deep, data, extra, hidden])	Construct a similar object, optionally with different data.
`extra_attrs`()
`hidden_attrs`()
`load`(filepath)	Load a saved / pickled PEPTObject object from filepath.
`plot`([sample_indices, ax, alt_axes, …])	Plot points from selected samples using matplotlib.
`save`(filepath)	Save a PEPTObject instance as a binary pickle object.
`to_csv`(filepath[, delimiter])	Write the inner points to a CSV file.

Attributes

`attrs`
`columns`
`data`
`overlap`
`points`
`sample_size`
`samples_indices`

property points

to_csv(filepath, delimiter=' ')

Write the inner points to a CSV file.

Write all points stored in the class to a CSV file.

Parameters

filepath (filename or file handle): If filepath is a path (rather than file handle), it is relative to where python is called.
delimiter (str, default " "): The delimiter used to separate the values in the CSV file.
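To show what a delimiter-separated file of points looks like, here is a minimal sketch using Python's standard csv module and an in-memory file handle. It is not how to_csv is implemented; `write_points` is a hypothetical stand-in, and whether the real method also writes a header line is not covered here.

```python
import csv
import io


def write_points(points, file, delimiter=" "):
    """Write one row of points per line to an open text file handle.

    Hypothetical stand-in illustrating the output layout only.
    """
    writer = csv.writer(file, delimiter=delimiter, lineterminator="\n")
    writer.writerows(points)


# An in-memory handle; a handle from open("points.csv", "w", newline="") works the same way
buffer = io.StringIO()
write_points([[0.0, 1.0, 2.0, 3.0], [4.0, 5.0, 6.0, 7.0]], buffer)
print(buffer.getvalue())
```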

plot(sample_indices=Ellipsis, ax=None, alt_axes=False, colorbar_col=-1)

Plot points from selected samples using matplotlib.

Returns matplotlib figure and axes objects containing all points included in the samples selected by sample_indices. sample_indices may be a single sample index (e.g. 0), an iterable of indices (e.g. [1, 5, 6]), or an Ellipsis (...) for all samples.

Parameters

sample_indices (int or iterable or Ellipsis, default Ellipsis): The index or indices of the samples of points. An int signifies the sample index, an iterable (list-like) signifies multiple sample indices, while an Ellipsis (...) signifies all samples. The default is ... (all samples).
ax (mpl_toolkits.mplot3d.Axes3D object, optional): The 3D matplotlib-based axis for plotting. If undefined, new Matplotlib figure and axis objects are created.
alt_axes (bool, default False): If True, plot using the alternative PEPT-style axes convention: z is horizontal, y points upwards. Because Matplotlib cannot swap axes, this is achieved by swapping the parameters in the plotting call (i.e. plt.plot(x, y, z) -> plt.plot(z, x, y)).
colorbar_col (int, default -1): The column in the data samples that will be used to color the points. The default is -1 (the last column).
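The alt_axes argument reordering described above can be sketched without Matplotlib. The helper below only shows which columns would be passed as the first, second and third plotting arguments, assuming the [t, x, y, z] column layout; `plot_coordinates` is a hypothetical name, not part of pept.

```python
def plot_coordinates(points, alt_axes=False):
    """Return the three coordinate sequences in plotting-call order.

    Hypothetical helper: rows are [t, x, y, z, ...]; with alt_axes the
    order (x, y, z) becomes (z, x, y), as in plt.plot(z, x, y).
    """
    xs = [row[1] for row in points]
    ys = [row[2] for row in points]
    zs = [row[3] for row in points]

    if alt_axes:
        return zs, xs, ys
    return xs, ys, zs


points = [[0.0, 1.0, 2.0, 3.0], [4.0, 5.0, 6.0, 7.0]]
print(plot_coordinates(points))
print(plot_coordinates(points, alt_axes=True))
```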

Returns

fig, ax: matplotlib figure and axes objects

Notes

Plotting all points is computationally expensive for matplotlib. It is recommended to plot only a couple of samples at a time, or to use the faster pept.plots.PlotlyGrapher.

Examples

Plot the points from sample 1 in a PointData instance:

>>> point_data = pept.PointData(...)
>>> fig, ax = point_data.plot(1)
>>> fig.show()

Plot the points from samples 0, 1 and 2:

>>> fig, ax = point_data.plot([0, 1, 2])
>>> fig.show()

property attrs

property columns

copy(deep=True, data=None, extra=True, hidden=True, **attrs)

Construct a similar object, optionally with different data. If extra, extra attributes are propagated; same for hidden.

property data

extra_attrs()

hidden_attrs()

static load(filepath)

Load a saved / pickled PEPTObject object from filepath.

Most often the full object state was saved using the .save method.

Parameters

filepath (filename or file handle): If filepath is a path (rather than file handle), it is relative to where python is called.

Returns

pept.PEPTObject subclass instance: The loaded object.

Examples

Save a LineData instance, then load it back:

>>> lines = pept.LineData([[1, 2, 3, 4, 5, 6, 7]])
>>> lines.save("lines.pickle")

>>> lines_reloaded = pept.LineData.load("lines.pickle")

property overlap

property sample_size

property samples_indices

save(filepath)

Save a PEPTObject instance as a binary pickle object.

Saves the full object state, including inner attributes, in a portable binary format. Load back the object using the load method.

Parameters

filepath (filename or file handle): If filepath is a path (rather than file handle), it is relative to where python is called.

Examples

Save a LineData instance, then load it back:

>>> lines = pept.LineData([[1, 2, 3, 4, 5, 6, 7]])
>>> lines.save("lines.pickle")

>>> lines_reloaded = pept.LineData.load("lines.pickle")
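For readers unfamiliar with pickling, the save / load round trip can be pictured with Python's standard pickle module. This is a generic sketch, not the code behind save and load: the `state` dictionary is a made-up stand-in for an object's attributes, and an in-memory buffer replaces the file on disk.

```python
import io
import pickle

# Made-up stand-in for an object's state; real instances hold NumPy arrays
state = {
    "points": [[0.0, 1.0, 2.0, 3.0], [4.0, 5.0, 6.0, 7.0]],
    "sample_size": 2,
    "overlap": 0,
    "columns": ["t", "x", "y", "z"],
}

buffer = io.BytesIO()       # stands in for open("points.pickle", "wb")
pickle.dump(state, buffer)  # serialise the full state to binary

buffer.seek(0)              # rewind before reading back
restored = pickle.load(buffer)

print(restored == state)    # same contents
print(restored is state)    # but a new, independent object
```

As with any pickle file, only load data from sources you trust, since unpickling can execute arbitrary code.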
