waltlabtools package

Submodules

waltlabtools.cal_curve module

class waltlabtools.cal_curve.CalCurve(model: Model | str = '4PL', agg_reps: str | Callable | None = 'median', coef_init: ndarray | None = None, warm_start: bool = False, solver: str = 'trf', lod_sds: float = 3, max_iter: int | None = None, ensure_2d: bool = False, sample_weight: str | None = '1/y', **kwargs)[source]

Bases: BaseEstimator, RegressorMixin, TransformerMixin

Calibration Curve transformer and regressor.

Parameters:

model (Model or str, default="4PL") – The model to use for the calibration curve. Can be an instance of a Model or a string representing the model name. Current available options are:
- “linear” : Linear function.
- “power” : Power function.
- “Hill” : Hill function.
- “logistic” : Logistic function.
- “3PL” : Three-parameter logistic (3PL) function.
- “4PL” : Four-parameter logistic (4PL) function.
- “5PL” : Five-parameter logistic (5PL) function.
agg_reps (str, Callable, or None, default="median") – Aggregation method for replicates. Can be a string representing an aggregation strategy or a callable function. Current available options are:
- “median”
- “mean”
- “average”
- “geomean”, “gmean”
- “min”
- “max”
- “geothmetic_meandian”, “gmnd”
- “first”
- “last”
If None, then no aggregation is performed.
coef_init (array-like, optional) – Initial coefficients for the model.
warm_start (bool, default=False) – Whether to reuse the solution of the previous call to fit as initialization.
solver (str, default="trf") – Solver to use for optimization. Options are “trf”, “dogbox”, or “lm”.
lod_sds (float, default=3) – Number of standard deviations for limit of detection calculation.
max_iter (int, optional) – Maximum number of iterations for the solver.
ensure_2d (bool, default=False) – Whether to ensure the input is 2-dimensional.

coef_

Coefficients of the fitted model.

Type:: ndarray

n_iter_

Number of iterations run by the solver.

Type:: int

lod_

Calculated limit of detection.

Type:: float

bound_lod(x_flat)[source]

conc(y)[source]

Estimate the concentration for given signal values.

Parameters:: y (array-like) – Signal values.
Returns:: conc – Estimated concentration values.
Return type:: array-like

fit(X, y, sample_weight=None)[source]

Fit the model to data.

Parameters:

X (array-like of shape (n_samples,) or (n_samples, n_features)) – Training data features, e.g., concentrations.
y (array-like of shape (n_samples,)) – Target signal values, e.g., AEB or fluorescence intensity.
sample_weight (str or array-like, optional) – Sample weights.

Returns:

Fitted estimator.

Return type:

CalCurve

from_data(*, x, y, model, lod_sds=3, corr='c4', force_lod: bool = False, weights='1/y^2', **kwargs)[source]

from_function(fun, inverse, lod=-inf, lod_sds=3, force_lod: bool = False, xscale='linear', yscale='linear')[source]

fun(x)[source]

inverse(y)[source]

inverse_transform(X)[source]

Estimate the concentration for given signal values.

Parameters:: X (array-like of shape (n_samples, n_features) or (n_samples,)) – Signal values.
Returns:: conc – Back-calculated concentration values.
Return type:: array-like

plot(fig: Figure | SubFigure | None = None, ax: Axes | None = None, **kwargs)[source]

Plot the calibration curve and calibrator points.

Parameters:

fig (matplotlib.figure.Figure, optional) – The figure object. If None, the current figure will be used. If there is no current figure, a new figure will be created.
ax (matplotlib.axes.Axes, optional) – The axes object. If None, the current axes will be used. If there is no current axes, a new axes will be created.
**kwargs – Additional keyword arguments for customizing the plot, such as xlabel, ylabel, and title.

predict(X)[source]

Predict signal using the calibration curve model.

Parameters:: X (array-like of shape (n_samples, n_features) or (n_samples,)) – Input data of concentrations.
Returns:: y – Predicted signal values.
Return type:: ndarray

set_fit_request(*, sample_weight: bool | None | str = '$UNCHANGED$') → CalCurve

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:: sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.
Returns:: self – The updated object.
Return type:: object

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') → CalCurve

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:: sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
Returns:: self – The updated object.
Return type:: object

signal(X)[source]

Predict the signal (e.g., AEB) for given concentrations.

Parameters:: X (array-like) – Input data of concentrations.
Returns:: signal – Predicted signal values.
Return type:: array-like

transform(X)[source]

Transform concentrations into signal.

Parameters:: X (array-like of shape (n_samples, n_features) or (n_samples,)) – Input data of concentrations.
Returns:: y – Transformed signal values.
Return type:: ndarray

waltlabtools.cal_curve.limit_of_detection(blank: Any, inverse: CalCurve | Model | Callable | str, lod_sds: float = 3, corr: str = 'c4', coef=None, nan_policy='omit', **kwargs)[source]

Computes the limit of detection (LOD).

Parameters:

blank (array-like) – Signal (e.g., average number of enzymes per bead, AEB) of the zero calibrator. Must have at least two elements.
inverse (function or CalCurve) – The functional form used for the calibration curve. If a function, it should accept the measurement reading (y, e.g., fluorescence) as its only argument and return the value (x, e.g., concentration). If inverse is a CalCurve object, the LOD will be calculated from its inverse method.
lod_sds (numeric, default 3) – The number of standard deviations above the mean of the background at which the limit of detection is calculated. Common values include 2.5 (Quanterix), 3 (Walt Lab), and 10 (lower limit of quantification, LLOQ).
corr ({"n", "n-1", "n-1.5", "c4"} or numeric, default "c4") –
The sample standard deviation under-estimates the population standard deviation for a normally distributed variable. Specifies how this should be addressed. Options:
- ”n” : Divide by the number of samples to yield the uncorrected sample standard deviation.
- ”n-1” : Divide by the number of samples minus one to yield the square root of the unbiased sample variance.
- ”n-1.5” : Divide by the number of samples minus 1.5 to yield the approximate unbiased sample standard deviation.
- ”c4” : Divide by the correction factor to yield the exact unbiased sample standard deviation.
- If numeric, gives the delta degrees of freedom.

Returns:

The limit of detection, in units of x (e.g., concentration).

Return type:

numeric

See also

c4: factor c4 for unbiased estimation of the standard deviation
std: unbiased estimate of the population standard deviation
numpy.std: standard deviation
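
The calculation described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the waltlabtools implementation: lod_sketch and the linear inverse function are hypothetical names, and the standard deviation uses the n-1 denominator of statistics.stdev rather than the "c4" correction.

```python
import statistics


def lod_sketch(blank, inverse, lod_sds=3.0):
    """Illustrative LOD: back-calculate the signal lod_sds SDs above the blank mean."""
    if len(blank) < 2:
        raise ValueError("blank must have at least two elements")
    # Signal threshold: mean of the blank plus lod_sds standard deviations.
    threshold = statistics.fmean(blank) + lod_sds * statistics.stdev(blank)
    # Convert the threshold from signal units (y) to concentration units (x).
    return inverse(threshold)


# Hypothetical linear calibration y = 2 * x + 0.01, so x = (y - 0.01) / 2.
def inverse(y):
    return (y - 0.01) / 2


blank_aeb = [0.010, 0.012, 0.008, 0.011]
lod = lod_sketch(blank_aeb, inverse)        # 3 SDs above background
lloq = lod_sketch(blank_aeb, inverse, 10)   # 10 SDs, as for an LLOQ
```

Because the threshold grows with lod_sds, the 10-SD value is always at least as large as the 3-SD value for an increasing calibration curve.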

waltlabtools.cal_curve.regress(*, x, y, model, weights='1/y^2', **kwargs)[source]

waltlabtools.core module

Core functionality for the waltlabtools module.

Everything in waltlabtools.core is automatically imported with waltlabtools, so it can be accessed via, e.g.,

import waltlabtools as wlt  # waltlabtools main functionality

my_data = wlt.flatten([[[[[0], 1], 2], 3], 4])  # flatten a list

waltlabtools.core.aeb(fon_)[source]

Compute the average number of enzymes per bead.

Converts the fraction of on-beads (fon) to the average number of enzymes per bead (AEB) using Poisson statistics. The formula used is aeb_ = -log(1 - fon_).

Parameters:: fon_ (numeric or array-like) – A scalar or array of fractions of beads which are “on.”
Returns:: aeb_ – The average number of enzymes per bead.
Return type:: same as input, or array

See also

fon: inverse of aeb
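
The formula above follows from Poisson statistics: the probability that a bead carries no enzyme is exp(-AEB), which equals 1 - fon. A minimal scalar sketch (aeb_sketch is a hypothetical name, and the library function also accepts arrays):

```python
import math


def aeb_sketch(fon_):
    """AEB from the fraction of on-beads, assuming Poisson-distributed enzymes."""
    # P(bead has zero enzymes) = exp(-AEB) = 1 - fon, so AEB = -log(1 - fon).
    # log1p(-fon) computes log(1 - fon) accurately when fon is small.
    return -math.log1p(-fon_)


low = aeb_sketch(0.01)   # at low occupancy, AEB is close to fon
half = aeb_sketch(0.5)   # equals log(2) when half the beads are "on"
```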

waltlabtools.core.c4(n)[source]

Factor c4 for unbiased estimation of normal standard deviation.

For a finite sample, the sample standard deviation tends to underestimate the population standard deviation. See, e.g., https://www.spcpress.com/pdf/DJW353.pdf for details. Dividing the sample standard deviation by the correction factor c4 gives an unbiased estimator of the population standard deviation. This correction factor should be applied on top of Bessel’s correction, so n-1 is used as the degrees of freedom.

Parameters:: n (int or array) – The number of samples.
Returns:: The correction factor, usually written c4 or b(n).
Return type:: numeric or array

See also

std: unbiased standard deviation
numpy.std: standard deviation
lod: limit of detection
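
For reference, the standard closed form of the factor is c4(n) = sqrt(2 / (n - 1)) * Γ(n / 2) / Γ((n - 1) / 2). The sketch below evaluates it for scalar n; c4_sketch is a hypothetical name, and waltlabtools may compute the factor differently (for example, for arrays).

```python
import math


def c4_sketch(n):
    """Correction factor c4(n) for a normal sample of size n (n >= 2)."""
    if n < 2:
        raise ValueError("n must be at least 2")
    # c4(n) = sqrt(2 / (n - 1)) * Gamma(n / 2) / Gamma((n - 1) / 2);
    # lgamma avoids overflow of the gamma function for large n.
    log_ratio = math.lgamma(n / 2) - math.lgamma((n - 1) / 2)
    return math.sqrt(2.0 / (n - 1)) * math.exp(log_ratio)


# The factor is below 1 and approaches 1 as the sample size grows.
factors = {n: c4_sketch(n) for n in (2, 3, 5, 10, 100)}
```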

waltlabtools.core.coerce_array(func: Callable) → Callable[source]

Coerce the argument to an array upon TypeError.

This decorator is intended to wrap functions that are primarily called with a first argument that is compatible with numpy.array. If calling the function results in a TypeError, it tries coercing the first argument to a numpy array and calling the function again. This is particularly useful for functions that might be passed lists or other array-like objects but are designed to work with numpy arrays.

Parameters:: func (callable) – The function to be decorated.
Returns:: The wrapped function which coerces its first argument to a numpy array upon TypeError.
Return type:: callable

Examples

>>> @coerce_array
... def add_one(arr):
...     return arr + 1

>>> add_one([1, 2, 3])
array([2, 3, 4])

waltlabtools.core.deprecate(replace_with: str | None = None) → Callable[source]: Decorator to issue a DeprecationWarning with an optional replacement function.

waltlabtools.core.dropna(*args, drop_inf: bool = False, axis: int | Iterable[int] | None = 0, common_len: int | None = None)[source]

waltlabtools.core.flatten(a, order: Literal['C', 'F', 'A', 'K'] = 'K') → ndarray[source]

Flatten almost anything into a 1-dimensional numpy array.

In simple cases, this function is a wrapper for numpy.ravel. In more complex cases, it recursively flattens nested iterables.

Parameters:

a (any) – The object to be flattened.
order ({'C', 'F', 'A', 'K'}, optional) – The elements of a are read using this index order. ‘C’ means to index the elements in row-major, C-style order, with the last axis index changing fastest, back to the first axis index changing slowest. ‘F’ means to index the elements in column-major, Fortran-style order, with the first index changing fastest, and the last index changing slowest. Note that the ‘C’ and ‘F’ options take no account of the memory layout of the underlying array, and only refer to the order of axis indexing. ‘A’ means to read the elements in Fortran-like index order if a is Fortran contiguous in memory, C-like order otherwise. ‘K’ means to read the elements in the order they occur in memory, except for reversing the data when strides are negative. By default, ‘K’ index order is used.

Returns:

A 1-dimensional numpy array containing the flattened elements.

Return type:

numpy.ndarray

See also

numpy.ravel: Flatten a numpy array.

Notes

When jax has been loaded as a backend, this function will raise an error if a has non-numerical elements.

Examples

>>> a = [[1, 2, 3], [4, 5, 6]]
>>> flatten(a)
array([1, 2, 3, 4, 5, 6])

>>> b = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
>>> flatten(b)
array([1, 2, 3, 4, 5, 6, 7, 8])
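
The recursive case can be pictured with a pure-Python sketch. This is an illustration only: flatten_sketch is a hypothetical name, it returns a list rather than a numpy array, and it ignores the order argument.

```python
from collections.abc import Iterable


def flatten_sketch(obj):
    """Recursively flatten nested iterables into a flat Python list."""
    # Strings and bytes are iterable but are treated here as single elements.
    if isinstance(obj, (str, bytes)) or not isinstance(obj, Iterable):
        return [obj]
    flat = []
    for item in obj:
        flat.extend(flatten_sketch(item))
    return flat


nested = [[[[[0], 1], 2], 3], 4]
flat = flatten_sketch(nested)
```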

waltlabtools.core.fon(aeb_)[source]

Compute the fraction of beads which are on.

Converts the average enzymes per bead (AEB) to the fraction of on-beads (fon) using Poisson statistics. The formula used is fon_ = 1 - exp(-aeb_).

Parameters:: aeb_ (numeric or array-like) – A scalar or array of the average number of enzymes per bead.
Returns:: fon_ – The fractions of beads which are “on.”
Return type:: same as input, or array

See also

aeb: inverse of fon
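
Since the two conversions are inverses, applying one after the other should return the starting value. The following scalar sketch checks this round trip; fon_sketch and aeb_sketch are hypothetical stand-ins written from the formulas in this section, not the library functions themselves.

```python
import math


def fon_sketch(aeb_):
    """Fraction of on-beads for a given AEB, assuming Poisson statistics."""
    # 1 - exp(-aeb), written with expm1 for accuracy at small AEB.
    return -math.expm1(-aeb_)


def aeb_sketch(fon_):
    """Inverse of fon_sketch: AEB from the fraction of on-beads."""
    return -math.log1p(-fon_)


aeb_values = [0.001, 0.1, 1.0, 3.0]
on_fractions = [fon_sketch(v) for v in aeb_values]    # each lies between 0 and 1
round_trip = [aeb_sketch(f) for f in on_fractions]    # recovers aeb_values
```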

waltlabtools.core.geothmetic_meandian(a, weights=None, rtol: float = 1e-05, atol: float = 1e-08, nan_policy='omit') → float[source]

Compute the Geothmetic Meandian of the input data, as per XKCD.

For details, see https://xkcd.com/2435/. This function combines the three most common measures of central tendency: the arithmetic mean, the geometric mean, and the median. The geothmetic meandian uses an iterative process that stops when the arithmetic mean, geometric mean, and median converge within (atol + rtol * their magnitudes).

Parameters:

a (array_like) – The input data, which can be any array-like object.
rtol (float, default 1e-05) – The relative tolerance parameter used to determine convergence.
atol (float, default 1e-08) – The absolute tolerance parameter used to determine convergence.

Returns:

The Geothmetic Meandian of the input data.

Return type:

float

Examples

>>> geothmetic_meandian([1, 1, 2, 3, 5])
2.089440951883
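
The iteration can be sketched with the standard library. This is a simplified illustration, assuming strictly positive inputs and no weights or NaN handling; gmdn_sketch and max_iter are hypothetical names, and the exact stopping rule in waltlabtools may differ.

```python
import statistics


def gmdn_sketch(data, rtol=1e-5, atol=1e-8, max_iter=1000):
    """Iterate (mean, geometric mean, median) until the three values agree."""
    values = [float(v) for v in data]
    for _ in range(max_iter):
        # Replace the data by its three measures of central tendency.
        values = [
            statistics.fmean(values),
            statistics.geometric_mean(values),
            statistics.median(values),
        ]
        # Converged once the spread is within atol + rtol * magnitude.
        if max(values) - min(values) <= atol + rtol * abs(max(values)):
            break
    return statistics.fmean(values)


result = gmdn_sketch([1, 1, 2, 3, 5])  # close to the value in the example above
```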

waltlabtools.core.match_kwargs(func: Callable | Iterable[Callable] | str, kwargs: dict[str, Any]) → dict[str, Any][source]

Match keyword arguments to the parameters of a given function.

Given a function and a dictionary of keyword arguments, return a dictionary of only those keyword arguments that match the parameters of the function.

Parameters:

func (callable or iterable of callable) – The function or functions to match keyword arguments against. If multiple functions, keywords will be matched against any of them (i.e., the union of all the functions’ parameter names).
kwargs (dict[str, Any]) – The dictionary of keyword arguments to match.

Returns:

A dictionary of keyword arguments that match the parameters of the function, or the parameters of any of the functions.

Return type:

dict

Notes

The implementation of this function is based on inspect.signature. Some functions accept a generic **kwargs parameter, which is not included in the signature. This function does not handle such cases.

Examples

To call a function func on only the kwargs it accepts:

>>> func(..., **match_kwargs(func, kwargs))
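
The matching step can be sketched with inspect.signature, which the Notes name as the basis of the implementation. The helper filter_kwargs and the example function scale are hypothetical, and this sketch handles a single function only.

```python
import inspect


def filter_kwargs(func, kwargs):
    """Keep only the keyword arguments that appear in func's signature."""
    accepted = inspect.signature(func).parameters
    return {key: value for key, value in kwargs.items() if key in accepted}


def scale(x, factor=1.0, offset=0.0):
    return x * factor + offset


options = {"factor": 2.0, "color": "red"}  # "color" is not a parameter of scale
result = scale(3.0, **filter_kwargs(scale, options))
```

Without the filtering step, passing options directly would raise a TypeError because scale does not accept a color argument.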

waltlabtools.core.std(a, corr: str | int = 'c4', axis: int | tuple[int, ...] | None = None, **kwargs)[source]

Compute (an unbiased estimate of) the standard deviation.

For a finite sample, the sample standard deviation tends to underestimate the population standard deviation. This function divides the sample standard deviation by the correction factor c4 to give an unbiased estimator of the population standard deviation.

Parameters:

a (array-like) – The array of values.
corr ({"n", "n-1", "n-1.5", "c4"} or numeric, default "c4") –
The sample standard deviation under-estimates the population standard deviation for a normally distributed variable. Specifies how this should be addressed. Options:
- ”n” : Divide by the number of samples to yield the uncorrected sample standard deviation.
- ”n-1” : Divide by the number of samples minus one to yield the square root of the unbiased sample variance.
- ”n-1.5” : Divide by the number of samples minus 1.5 to yield the approximate unbiased sample standard deviation.
- ”c4” : Divide by the correction factor to yield the exact unbiased sample standard deviation.
- If numeric, gives the delta degrees of freedom.
axis (None or int or tuple of ints, optional) – Axis or axes along which the standard deviation is computed. The default is to compute the standard deviation of the flattened array. If this is a tuple of ints, a standard deviation is performed over multiple axes, instead of a single axis or all the axes as before.

Returns:

The unbiased standard deviation.

Return type:

numeric

See also

numpy.std: standard deviation
c4: correction factor used when corr=”c4”
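
The named corr options differ only in the denominator used under the square root. The sketch below covers the three options that map to delta degrees of freedom and numeric values; the "c4" option is deliberately omitted, std_sketch is a hypothetical name, and only flat sequences are handled.

```python
import math


def std_sketch(a, corr="n-1"):
    """Standard deviation with a selectable denominator n - ddof."""
    # Map the named options to delta degrees of freedom; numbers pass through.
    ddof = {"n": 0.0, "n-1": 1.0, "n-1.5": 1.5}.get(corr, corr)
    n = len(a)
    mean = sum(a) / n
    sum_sq = sum((v - mean) ** 2 for v in a)
    return math.sqrt(sum_sq / (n - ddof))


data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
uncorrected = std_sketch(data, "n")    # divides by n
bessel = std_sketch(data, "n-1")       # divides by n - 1
approx = std_sketch(data, "n-1.5")     # divides by n - 1.5
```

A smaller denominator gives a larger estimate, so the three results increase in the order shown.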

waltlabtools.model module

waltlabtools.model.MODELS: dict[str, Model] = {'3PL': <waltlabtools.model.Model object>, '4PL': <waltlabtools.model.Model object>, '5PL': <waltlabtools.model.Model object>, 'Hill': <waltlabtools.model.Model object>, 'linear': <waltlabtools.model.Model object>, 'logistic': <waltlabtools.model.Model object>, 'power': <waltlabtools.model.Model object>}

Built-in regression models.

Keys are strings giving model names; values are waltlabtools.Model objects.

Models

“linear” : Linear function.

“power” : Power function.

“Hill” : Hill function.

“logistic” : Logistic function.

“3PL” : Three-parameter logistic (3PL) function.

“4PL” : Four-parameter logistic (4PL) function.

“5PL” : Five-parameter logistic (5PL) function.

class waltlabtools.model.Model(func: Callable, inverse: Callable, coef_init: Any, name: str = '', plaintext_formula: str = '', xscale: str | ScaleBase = 'linear', yscale: str | ScaleBase = 'linear', jac: str | Callable = '2-point')[source]

Bases: object

Mathematical model for calibration curve fitting.

A Model is an object with a function and its inverse, with one or more free parameters that can be fit to calibration curve data.

Parameters:

func (function) – Forward functional form, mapping levels (e.g., concentrations) to signal values (e.g., AEB). Should be a function which takes in X and other parameters and returns y. The first parameter of func should be X, and the remaining parameters should be the coefficients which are fit to the data (typically floats).
inverse (function) – Inverse functional form, mapping signal values (e.g., AEB) to levels (e.g., concentrations). Should be a function which takes in y and other parameters and returns X. The first parameter of inverse should be y, and the remaining parameters should be the same coefficients as in func.
name (str) – The name of the function. For example, “4PL” or “linear”.
params (list-like of str) – The names of the parameters for the function. This should be the same length as the number of arguments which func and inverse take after their inputs X and y, respectively.
xscale, yscale ({"linear", "log", "symlog", "logit"} or matplotlib.ScaleBase, default "linear") – The natural scaling transformations for x and y. For example, “log” means that the data may be distributed log-normally and are best visualized on a log scale.
jac ({'2-point', '3-point', 'cs', callable}, optional) – Method of computing the Jacobian matrix (an m-by-n matrix, where element (i, j) is the partial derivative of f[i] with respect to x[j]) of the loss function. The keywords select a finite difference scheme for numerical estimation. The scheme ‘3-point’ is more accurate, but requires twice as many operations as ‘2-point’ (default). The scheme ‘cs’ uses complex steps, and while potentially the most accurate, it is applicable only when func correctly handles complex inputs and can be analytically continued to the complex plane. If callable, it is used as jac(x, *args, **kwargs) and should return a good approximation (or the exact value) for the Jacobian as an array_like (np.atleast_2d is applied), a sparse matrix (csr_matrix preferred for performance) or a scipy.sparse.linalg.LinearOperator.
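
A func/inverse pair of the kind described above might look like the following four-parameter logistic sketch. The parametrization y = d + (a - d) / (1 + (x / c)^b) is a common convention and is assumed here; it is not confirmed to be the exact form used by waltlabtools, and four_pl and four_pl_inverse are hypothetical names.

```python
def four_pl(x, a=0.0, b=1.0, c=1.0, d=30.0):
    """Forward 4PL: concentration x -> signal y (assumed parametrization)."""
    return d + (a - d) / (1.0 + (x / c) ** b)


def four_pl_inverse(y, a=0.0, b=1.0, c=1.0, d=30.0):
    """Inverse 4PL: signal y -> concentration x, with the same coefficients."""
    return c * ((a - d) / (y - d) - 1.0) ** (1.0 / b)


# Both functions take their input first, followed by the same coefficients.
coefs = dict(a=0.01, b=1.2, c=50.0, d=20.0)
concentrations = [0.1, 1.0, 10.0, 100.0]
signals = [four_pl(x, **coefs) for x in concentrations]
recovered = [four_pl_inverse(y, **coefs) for y in signals]
```

Sharing one coefficient list between the forward and inverse functions is what lets a fitted curve convert in both directions.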

waltlabtools.model.five_param_logistic(X, a: float = 0, b: float = 1, c: float = 1, d: float = 30, g: float = 1)[source]

waltlabtools.model.five_param_logistic_inverse(y, a: float = 0, b: float = 1, c: float = 1, d: float = 30, g: float = 1)[source]

waltlabtools.model.four_param_logistic(X, a: float = 0, b: float = 1, c: float = 1, d: float = 30)[source]

waltlabtools.model.four_param_logistic_inverse(y, a: float = 0, b: float = 1, c: float = 1, d: float = 30)[source]

waltlabtools.model.hill(X, a: float = 1, b: float = 1, c: float = 1)[source]

waltlabtools.model.hill_inverse(y, a: float = 1, b: float = 1, c: float = 1)[source]

waltlabtools.model.jac_five_param_logistic(coef: ndarray, *, X: ndarray, sample_weight: ndarray) → ndarray[source]

waltlabtools.model.jac_four_param_logistic(coef: ndarray, *, X: ndarray, sample_weight: ndarray) → ndarray[source]

waltlabtools.model.jac_hill(coef: ndarray, *, X: ndarray, sample_weight: ndarray) → ndarray[source]

waltlabtools.model.jac_linear(coef: ndarray, *, X: ndarray, sample_weight: ndarray) → ndarray[source]

waltlabtools.model.jac_logistic(coef: ndarray, *, X: ndarray, sample_weight: ndarray) → ndarray[source]

waltlabtools.model.jac_power(coef: ndarray, *, X: ndarray, sample_weight: ndarray) → ndarray[source]

waltlabtools.model.jac_three_param_logistic(coef: ndarray, *, X: ndarray, sample_weight: ndarray) → ndarray[source]

waltlabtools.model.linear(X, a: float = 1, b: float = 0)[source]

waltlabtools.model.linear_inverse(y, a: float = 1, b: float = 0)[source]

waltlabtools.model.logistic(X, a: float = 1, b: float = 1, c: float = 0, d: float = 0)[source]

waltlabtools.model.logistic_inverse(y, a: float = 1, b: float = 1, c: float = 0, d: float = 0)[source]

waltlabtools.model.power(X, a: float = 1, b: float = 1)[source]

waltlabtools.model.power_inverse(y, a: float = 1, b: float = 1)[source]

waltlabtools.model.three_param_logistic(X, a: float = 0, c: float = 1, d: float = 30)[source]

waltlabtools.model.three_param_logistic_inverse(y, a: float = 0, c: float = 1, d: float = 30)[source]

waltlabtools.mosaic module

Functions for analyzing MOSAIC data.

In addition to the dependencies for waltlabtools, waltlabtools.mosaic also requires pandas 0.25 or greater and scikit-learn 0.21 or greater.

The public functions in waltlabtools.mosaic can be accessed via, e.g.,

import waltlabtools as wlt  # waltlabtools main functionality
import waltlabtools.mosaic  # for analyzing MOSAIC assays

subset_data = wlt.mosaic.plate_subsets()  # analyze data from a plate

if also using other functionality from the waltlabtools package, or

from waltlabtools import mosaic  # for analyzing MOSAIC assays

subset_data = mosaic.plate_subsets()  # analyze data from a plate

if using only the waltlabtools.mosaic module.

class waltlabtools.mosaic.PlateFileCollection(dir_path=None)[source]

Bases: object

Collection of RCA product fluorescence intensity files.

A PlateFileCollection object is a container for MOSAIC files. It is used to keep all wells from a given day, calibration curve, or assay together.

Parameters:: dir_path (str, optional) – The directory containing the MOSAIC plate files. If not provided, a file dialog will be opened to select the folder.

name

The name of the folder containing the MOSAIC plate files.

Type:: str

wells

The wells contained in the MOSAIC plate files, e.g., “A1”.

Type:: list

file_map

A dictionary mapping the well position to the file path for the corresponding flow cytometry output file.

Type:: dict

conc_map

A dictionary mapping the well position to the known concentration of the calibrator.

Type:: dict

dir_path: The path to the folder containing the MOSAIC plate files.

waltlabtools.mosaic.extended_coefs(concs, aebs, corr='c4', cal_curve=None) → dict[source]

Calculates the coefficients for a 4PL model.

Parameters:

concs (array-like) – The concentrations for the calibrators.
aebs (array-like) – The AEBs for the calibrators.
corr ({"n", "n-1", "n-1.5", "c4"} or numeric, default "c4") –
The sample standard deviation under-estimates the population standard deviation for a normally distributed variable. Specifies how this should be addressed. Options:
- ”n” : Divide by the number of samples to yield the uncorrected sample standard deviation.
- ”n-1” : Divide by the number of samples minus one to yield the square root of the unbiased sample variance.
- ”n-1.5” : Divide by the number of samples minus 1.5 to yield the approximate unbiased sample standard deviation.
- ”c4” : Divide by the correction factor to yield the exact unbiased sample standard deviation.
- If numeric, gives the delta degrees of freedom.
cal_curve (CalCurve, optional) – A CalCurve object to use. If not provided, a new CalCurve will be calculated based on the concs and aebs provided.

Returns:

A dictionary of coefficients and properties of the calibration curve. Its elements are:

”a”, “b”, “c”, “d” : coefficients of the 4PL fit

”LOD” : limit of detection

”LLOQ” : lower limit of quantification (10 standard deviations above background)

”blank mean” : mean AEB at 0 concentration

”blank std” : standard deviation of AEB at 0 concentration

”blank cv” : coefficient of variation of AEB at 0 concentration

Return type:

dict

waltlabtools.mosaic.log_transform(flat_data: ndarray) → ndarray[source]

Log-transforms the data.

Because a few values may be negative, a constant value is calculated to add to all of the values.

Parameters:: flat_data (1D ndarray) – The data to be transformed.
Returns:: log_data – The log-transformed data.
Return type:: 1D ndarray
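
The idea of adding a constant before taking the logarithm can be sketched as follows. How waltlabtools chooses the constant is not stated here, so the offset below (shift the smallest value to 1 when the data contain non-positive values) is purely a hypothetical choice, as is the name log_transform_sketch.

```python
import math


def log_transform_sketch(flat_data):
    """Shift the data so every value is positive, then take the natural log."""
    lowest = min(flat_data)
    # Hypothetical offset: shift only if needed, so the smallest value becomes 1.
    offset = 1.0 - lowest if lowest <= 0 else 0.0
    return [math.log(v + offset) for v in flat_data]


intensities = [-3.0, 0.0, 12.0, 250.0]
log_data = log_transform_sketch(intensities)
```

Adding the same constant to every value preserves the ordering of the data, so the off and on populations stay separated after the transform.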

waltlabtools.mosaic.mixture_aeb(flat_data: ndarray, means_init=None, flat_len=None, threshold_sds=5, reg_covar=1e-06) → float[source]

Calculates AEB based on a 2-Gaussian mixture model.

Parameters:

flat_data (1D ndarray) – Data used for fitting the mixture model. It is assumed that if the data should be log-transformed, they have already been transformed, e.g., with log_transform.
means_init (array-like of length 2, optional) – The user-provided initial means. If not provided, means are initialized as the maximum and minimum of the data.
flat_len (int, optional) – Length of flat_data. If not provided, it is calculated.
threshold_sds (numeric, default 5) – The number of standard deviations above the mean to use as a threshold for distinguishing on-beads in the case where there are very few on-beads.
reg_covar (numeric, default 1e-06) – Non-negative regularization added to the diagonal of covariance. Ensures that the covariance matrices are all positive.

Returns:

AEB, based on weighted average of two measures of calculating on-fraction.

Return type:

float

waltlabtools.mosaic.mixture_orientation(means_: ndarray, covariances_: ndarray, threshold_sds=5) → tuple[source]

Determines which peak is ‘off.’

Parameters:

means_ (ndarray of length 2) – The two means of the Gaussian mixture model.
covariances_ (ndarray of length 2) – The two covariances of the Gaussian mixture model.
threshold_sds (float, default 5) – The number of standard deviations above the mean to use as a threshold for distinguishing on-beads in the case where there are very few on-beads.

Returns:

Tuple of length 3. Its elements:

off_label (1 or 2) – The label of the Gaussian mixture model with the lower mean.
sds_threshold (float) – The number of standard deviations above the mean to use as a threshold for distinguishing on-beads in the case where there are very few on-beads.
ordered_means (ndarray of length 2) – The two means of the Gaussian mixture model, ordered from lowest to highest (off to on).

Return type:

off_label, sds_threshold, ordered_means

waltlabtools.mosaic.plate_subsets(dir_path=None, save_aebs_to=None, save_coefs_to=None, log: bool = True, model='4PL', lod_sds=3, subsets: int = 10, sizes=(), corr='c4', threshold_sds=5) → DataFrame[source]

Calculates AEBs and coefficients for a plate.

Parameters:

dir_path (str, optional) – The directory containing the MOSAIC plate files. If not provided, a file dialog will be opened to select the folder.
save_aebs_to (str, optional) – The path to save the AEBs to.
save_coefs_to (str, optional) – The path to save the coefficients to.
log (bool, default True) – Should the data be log-transformed before fitting the Gaussian mixture model?
model (str or Model, default "4PL") – The model to fit.
lod_sds (numeric, default 3) – The number of standard deviations above the mean to use as a limit of detection.
subsets (int, default 10) – The number of subsets to create.
sizes (iterable) – The number of beads in each subset. Technically optional, but the default argument is “()” which does not conduct any subsetting.
corr ({"n", "n-1", "n-1.5", "c4"} or numeric, default "c4") –
The sample standard deviation under-estimates the population standard deviation for a normally distributed variable. Specifies how this should be addressed. Options:
- ”n” : Divide by the number of samples to yield the uncorrected sample standard deviation.
- ”n-1” : Divide by the number of samples minus one to yield the square root of the unbiased sample variance.
- ”n-1.5” : Divide by the number of samples minus 1.5 to yield the approximate unbiased sample standard deviation.
- ”c4” : Divide by the correction factor to yield the exact unbiased sample standard deviation.
- If numeric, gives the delta degrees of freedom.

Returns:

A dataframe of coefficients and extended parameters for each subset.

Return type:

pd.DataFrame

See also

extended_coefs: for the coefficients and parameters returned

waltlabtools.mosaic.weighted_aeb(onfrac_gmm, onfrac_sds)[source]

Weighted average of the two measures of fraction on.

Calculates the AEB using a weighted average of the Gaussian mixture model fraction on and the standard deviation-based fraction on.

Parameters:

onfrac_gmm (float) – The on-fraction of the Gaussian mixture model.
onfrac_sds (float) – The on-fraction based on the standard deviation threshold.

Returns:

aeb – The average number of enzymes per bead.

Return type:

float

See also

aeb: calculate the AEB from a single f_on value

waltlabtools.mosaic.well_to_aeb(well_entry=None, log: bool = True, threshold_sds=5)[source]

Calculates AEB of a well based on a 2-Gaussian mixture model.

Parameters:

well_entry (str, iterable, or os.DirEntry, optional) – The well to be analyzed. If not provided, a file dialog will open to select the output file for a well.
log (bool, default True) – Should the data be log-transformed before fitting the Gaussian mixture model?
threshold_sds (numeric, default 5) – The number of standard deviations above the mean to use as a threshold for distinguishing on-beads in the case where there are very few on-beads.

Returns:

AEB, based on weighted average of two measures of calculating on-fraction. If multiple well_entry values are provided, a dictionary is returned with the AEB values for each well.

Return type:

float or dict

waltlabtools.read module

class waltlabtools.read.HDX(io=None, raw: DataFrame | Iterable[DataFrame | Series] | None = None, cal_curves: Series | None = None, **kwargs)[source]

Bases: object

Quanterix HD-X file reader.

Reads in data from one or more Quanterix HD-X run histories (preferred) or sample results reports. The data are combined into a pandas.DataFrame called raw, and the assays are identified. From there, calibration curves can be fit to the data, and the data can be tidied.

Parameters:: io (Any, optional) – Any valid string path is acceptable. The string could be a URL. Valid URL schemes include http, ftp, s3, gs, and file. For file URLs, a host is expected. A local file could be: file://localhost/path/to/table.csv. If you want to pass in a path object, crawl accepts any os.PathLike. By file-like object, we refer to objects with a read() method, such as a file handle (e.g. via builtin open function) or StringIO. If the path of a directory is passed, it will be traversed and a list of its constituent files will be returned. If a list, tuple, set, dictionary, or generator is passed, then each element of the collection will be crawled. If not provided, a dialogue box will open, asking the user to select files.

Examples

To read in and combine Quanterix run histories and sample results reports chosen from a dialogue box, call without any arguments:

>>> q = wlt.HDX()

property cal_curves: Series

calculate_cal_curves(model: str | Model | dict | Series = '4PL', X_name: str = 'Replicate Conc.', y_name: str = 'Replicate AEB', force: bool = False, include_assays: Iterable | None = None, exclude_assays: Iterable | None = None, **kwargs) → Series[source]

calculate_tidy(stat: str | Callable = 'median', colname: str | None = None, use_curves: bool = False, **kwargs)[source]

property tidy: DataFrame

waltlabtools.read.crawl(io) → list[source]

Traverses directories and iterables to assemble a list of files.

If a filepath (e.g., a string or an os.PathLike) or buffer is passed, it will be returned as a list. If a directory path is passed, then a list of all of its constituent files will be returned. Finally, if a collection of filepaths, buffers, or dictionaries is passed, then a list of all of their constituent files will be returned.

Parameters:: io (Any) – Any valid string path is acceptable. The string could be a URL. Valid URL schemes include http, ftp, s3, gs, and file. For file URLs, a host is expected. A local file could be: file://localhost/path/to/table.csv. If you want to pass in a path object, crawl accepts any os.PathLike. By file-like object, we refer to objects with a read() method, such as a file handle (e.g. via builtin open function) or StringIO. If the path of a directory is passed, it will be traversed and a list of its constituent files will be returned. If a folder or collection of files is passed, then each element of the collection will be crawled.
Returns:: A list of files or buffers. If io is a directory, then this list contains all files inside of it and its subfolders. If io is a collection of files, buffers, and directories, then this list contains all files and buffers in the collection, and all files inside of directories in the collection.
Return type:: list
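
The traversal described above can be sketched with the standard library. This is an illustration rather than the waltlabtools implementation: crawl_sketch is a hypothetical name, URLs are simply returned unchanged as strings, and dictionaries are not given any special handling.

```python
import os


def crawl_sketch(io):
    """Collect file paths from a path, a directory, or a nested collection."""
    if isinstance(io, (str, os.PathLike)):
        if os.path.isdir(io):
            # Walk the directory tree and gather every file beneath it.
            return [
                os.path.join(root, name)
                for root, _dirs, names in os.walk(io)
                for name in sorted(names)
            ]
        return [io]  # a single file path is returned as a one-element list
    if hasattr(io, "read"):
        return [io]  # file-like buffer
    # Otherwise treat io as a collection and crawl each element in turn.
    files = []
    for item in io:
        files.extend(crawl_sketch(item))
    return files
```

For example, crawl_sketch(["run1.csv", "some_folder"]) would return "run1.csv" followed by every file found under some_folder, provided that folder exists.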
