Code

Tomographer

class tomography.Tomographer

Main object to work with tomographic reconstruction problems

The analysis steps are:

  1. First load the defaults

    • tg = Tomographer()

    • tg.load_cfg(“config_file.hdf5”) # or set this manually

    • tg.prepare_design()

  2. Then load the data

    • tg.connect_data(“data.hdf5”)

  3. Finally, reconstruct one gene

    • tg.reconstruct(gene=”Lmx1a”)

connect_data(datafile: str) → None

Load hdf5 dataset

load_cfg(cfg: Union[str, tomography.defaults.ReconstructionConfig]) → None

Load defaults from file :param defaults: :type defaults: filename (str) or ReconstructionDefaults (object)

reconstruct(gene: str, alpha_beta: Union[str, Tuple[float, float]] = 'auto', warm_start: bool = False, crossval_kwargs: Dict[str, Any] = {}, nb_r=0.3, kfolds=3) → numpy.ndarray

Perform reconstruction of a gene

Parameters
  • gene (str) – gene name as in datafile

  • alpha_beta (str or tuple) –

    options are:

    ”auto”: it will try to use the alpha and beta parameters present in the data file, if none is provided it will perform crossvalidation “crossvalidation”: it will force crossvalidation (alpha, beta): passing a tuple will perform the reconstruction using this parameters

  • warm_start – whether to use the previous reconstruction result as warm start. NOTE what the previous reconstruction is might change depending the alpha_beta parameter

Returns

reconstructed – the reconstructed signal already reshaped

Return type

np.ndarray (2d)

NOTE you can also access the recosntruction parameters using the attribute .reconstructor

Core

tomography.core.build_Design_Matrix(angles: numpy.ndarray, widths: List[float], mask_g: numpy.ndarray, mask_thrs: float = 0.2, notation_inverse: bool = True, return_projlen: bool = True, return_sparse: bool = False) → numpy.ndarray

Builds the regression design matrix (Projection Matrix).

Parameters
  • angles (np.ndarray) – the angles of the slicing

  • widths (list of float) – (number of pixels) real width of a slice for every cutting angle

  • mask_g (2-D array of floats) – greyscale mask reppresenting the shape of the tissue slice, works good also if entries are only 1s and 0s

  • mask_thrs (float) – value to threshold mask_g

  • notation_inverse (bool, default=True) – the angles are ment from the point of view of the image

  • return_projlen (bool, default=True) – wether to return the information about the number of rows for each angle

  • return_sparse (bool, default=False) – wether to return a scipy.sparse.csr_matrix

Returns

design_matrix

Return type

2-d array

Notes

it returns design matrix for the mask constrained regression problem the correct angles would be the one looking at the image flipped (origin at left-top positive of y is down) but if notation_inverse == True it assumes that angles are respect to a origin at the left-bottom

Assumptions: The image is reliable and the width are reliable.

This has to be adjusted beforehand

tomography.core.calculate_projection(angle: float, q: float, X_coords: numpy.ndarray, Y_coords: numpy.ndarray, slice_width: float) → numpy.ndarray

Returns a flattened array of the pixels contribution to the point projection

Parameters
  • angle (float) – The slope of the line the slide is centered upon

  • q (float) – The q parameter of the line (y = mx +q) the slide is centered upon

  • X_coords (2-D array) – contains x at pixel (y,x)

  • Y_coords (2-D array) – contains y at pixel (y,x)

  • slice_width (float) – the width of a slice

  • sparse (bool, default=False) – wether to return a csr sparse matrix

Returns

design_matrix_row

Return type

1-D array

Notes

What is returned by this function correspond to a row of the Design/Projection matrix. To be more precise this is the raveled form of a matrix that contains: in position (r,c) the contribution of the (r,c) pixel to the point projection.

In practice this is calculated with an heuristic function. contribution around the line distributes as a linear (triangular) kernel the line y = tan(alpha)*x + q If plotted would look like a pixelated line (y = tan(alpha) + q) that skips all the zero point of the image mask

The reason why this is used and prefered over other kernels is that this is robust over summation. This means that summing all the pixel contributions one gets back the original grayscale image mask.

tomography.core.create_connectivity(mask: numpy.ndarray, kind: str = 'queen') → numpy.ndarray

Create a connectivity matrix of the pixels in a image

Parameters
  • mask (np.2darray) – Square image of side N

  • kind (str default 'queen) – The kind of connectivity to apply. Can be: rook, bishop, queen (as in chess)

Returns

connectivity_matrix – A connectivity matrix (N^2, N^2) where N is the side of mask

Return type

np.ndarray

tomography.core.distance_point_line(point: Tuple[float, float], line: Tuple[float, float, float]) → float

Calculate the distance between a point and a line

Parameters
  • point (touple of float) – Coordinates (x, y) of the point

  • line (touple of float) – Parameters (a,b,c) where the line is ax + by + c = 0

Returns

distance – Euclidean distance between the point and the line

Return type

float

tomography.core.find_extremities(image: numpy.ndarray, angle: float) → Tuple[numpy.ndarray, numpy.ndarray]

Returns the first and last point encoutered by slicing the mask image with angle angle

Parameters
  • image (2-D array) – the convex_hull of the mask

  • angle (float) – (in radiants) the angle of the slicing with respect to x (conter clockwise)

Returns

first_point, last_point – first and last point touched rotating counterclockwise (clockwise if looking at the image) direction

Return type

arrays

Notes

Given a slicing angle and a mask it returns the first and last point encoutered by the slicing. We use a rotation trick to do this in the fastest and most robust way

Timing: 2.75 ms

tomography.core.linear_kernel(D: Union[numpy.ndarray, float], w: float) → numpy.ndarray

Returns the linear kernel at a point distant D from the max

Parameters
  • D (ndarray or float) – distance from the center(max) of the kernel

  • w (float) – half width of the wavelet (distance from max to zero) for d > w the kernel evaluate to 0

Returns

K – Array containg at K[i,j] the value of the kernel function at D[i,j]. For definition of kernel it is 0<=K[i,j]<=1

Return type

ndarray or float

tomography.core.place_inside_mask(values: numpy.ndarray, mask_bw: numpy.ndarray) → numpy.ndarray

Place the values at the position that are 1/True in mask followin the C_CONTIGUOUS enumeration order

Parameters
  • values (np.ndarray (1d, float)) – the vaues to fill in mask

  • mask_bw (np.ndarray (2d, binary or boolean)) – the mask to fill values in

Returns

x – 2d array with the values subsittuted in the right place of the mask

Return type

np.ndarray 2d

tomography.core.prepare_design_masked(D: numpy.ndarray, mask_bw: numpy.ndarray) → numpy.ndarray

Design matrix using only the pixels in the mask

Parameters
  • D (np.ndarray) – The design matrix as it comes from build_Design_Matrix

  • mask_bw (np.ndarray (binary or boolean)) – The image used as mask. Entryes should be only 1/0 or True/False

Returns

Return type

New design matrix with appended the simmetric angles. And where only the pixels in the mask are considered

tomography.core.prepare_design_symmetry(D: numpy.ndarray) → numpy.ndarray

Add to the design matrix strip simmetric to the one in the input

Parameters

D (np.ndarray) – The design matrix as it comes from build_Design_Matrix

Returns

Return type

New design matrix with appended the simmetric angles

tomography.core.prepare_design_symmetry_masked(D: numpy.ndarray, mask_bw: numpy.ndarray) → numpy.ndarray

Design matrix using only the pixels in the mask and adding strips simmetric to the one in the input

Parameters
  • D (np.ndarray) – The design matrix as it comes from build_Design_Matrix

  • mask_bw (np.ndarray (binary or boolean)) – The image used as mask. Entryes should be only 1/0 or True/False

Returns

Return type

New design matrix with appended the simmetric angles. And where only the pixels in the mask are considered

tomography.core.prepare_observations(projections: List[numpy.ndarray], xs: List[numpy.ndarray], first_points: List[int], projs_len: List[int], interpolation: str = 'linear', verbose: bool = False) → numpy.ndarray

Prepare the observation vector b

Parameters
  • projections (list of 1-D array) – a list of projection. Projections are first selected so that the first value is the first reliable section and the last the last reliable section

  • xs (list of 1-D array) – Contains arrays indicating wich are indexes of the ‘projections’ input. projections are usually filtered and trimmed, so an ix array is kept to keep track of it. Its values[i] usually gets filtered and some samples are missing. e.g. [ array([20,21,22,24,25,27,30,…]), array([10,11,12,15,16,17,18,…]), array([3,4,7,9,10,11,12,…]) ]

  • first_points (list of int) – for every proj-angle it indicates how shifted it is from the theoretical one

  • projs_len (list of int) – the expected number of slices that should be taken in account starting from list_of_points[i]

  • interpolation (str) – kind interpolation one of “linear”, “cubic”, “mixed”

  • verbose (bool) – prints min(xi), max(xi), max(xi)-min(xi)+1, n_points, len(xi), len(p)

Returns

final_proj – The projections ready to be given as an imput of a regression problem

Return type

1-D array

tomography.core.prepare_observations_symmetry(projections: List[numpy.ndarray], xs: List[numpy.ndarray], first_points: List[int], projs_len: List[int], interpolation: str = 'linear', verbose: bool = False) → numpy.ndarray

Prepare the observation vector b assuming symmetry. It will will copy the observations at one angle so to assume that the projection at the symmetrical is the same

Parameters
  • projections (list of 1-D array) – a list of projection. Projections are first selected so that the first value is the first reliable section and the last the last reliable section

  • xs (list of 1-D array) – Contains arrays indicating wich are indexes of the ‘projections’ input. projections are usually filtered and trimmed, so an ix array is kept to keep track of it. Its values[i] usually gets filtered and some samples are missing. e.g. [ array([20,21,22,24,25,27,30,…]), array([10,11,12,15,16,17,18,…]), array([3,4,7,9,10,11,12,…]) ]

  • first_points (list of int) – for every proj-angle it indicates how shifted it is from the theoretical one

  • projs_len (list of int) – the expected number of slices that should be taken in account starting from list_of_points[i]

  • interpolation (str) – kind interpolation one of “linear”, “cubic”, “mixed”

  • verbose (bool) – prints min(xi), max(xi), max(xi)-min(xi)+1, n_points, len(xi), len(p)

Returns

final_proj – The projections ready to be given as an imput of a regression problem

Return type

1-D array

tomography.core.prepare_regression(projections: List[numpy.ndarray], xs: List[numpy.ndarray], design_matrix: numpy.ndarray, first_points: List[int], projs_len: List[int], verbose: bool = True) → Tuple[numpy.ndarray, numpy.ndarray]

Prepare Design matrix and observation vector

Parameters
  • projections (list of 1-D array) – a list of projection. Projections are first selected so that the first value is the first reliable section and the last the last reliable section

  • xs (list of 1-D array) – Contains arrays indicating wich are indexes of the ‘projections’ input. projections are usually filtered and trimmed, so an ix array is kept to keep track of it. Its values[i] usually gets filtered and some samples are missing. e.g. [ array([20,21,22,24,25,27,30,…]), array([10,11,12,15,16,17,18,…]), array([3,4,7,9,10,11,12,…]) ]

  • design_matrix (2-D array) – as calculated by the fucntion build_Design_Matrix

  • first_points (list of int) – for every proj-angle it indicates how shifted it is from the theoretical one

  • projs_len (list of int) – the expected number of slices that should be taken in account starting from list_of_points[i]

  • verbose (bool) – prints min(xi), max(xi), max(xi)-min(xi)+1, n_points, len(xi), len(p)

Returns

  • D (2-D array) – The design matrix ready to be given as as a input of a regression problem

  • final_proj (1-D array) – The projections ready to be given as an imput of a regression problem

Notes

The function includes a mixed cubic-linear-zero interpolation to fill in the missing values. This is necessary because if one instead omits the equations for the missing projection (as it would be intuitive) the regularized problem will have less constrains and the respective pixel will be set to zero or very low numbers. Input image given to Design Matrix function has to be symmetrical.

tomography.core.prepare_regression_symmetry(projections: List[numpy.ndarray], xs: List[numpy.ndarray], design_matrix: numpy.ndarray, first_points: List[int] = [0, 0, 0], projs_len: List[int] = [100, 100, 100], verbose: bool = True) → Tuple[numpy.ndarray, numpy.ndarray]

Currently the best algorythm for reconstruction explointing the simmetry of the input image.

Parameters
  • projections (list of 1-D array) – a list of projection. Projections are first selected so that the first value is the first reliable section and the last the last reliable section

  • xs (list of 1-D array) – Contains arrays indicating wich are indexes of the ‘projections’ input. projections are usually filtered and trimmed, so an ix array is kept to keep track of it. Its values[i] usually gets filtered and some samples are missing. e.g. [ array([20,21,22,24,25,27,30,…]), array([10,11,12,15,16,17,18,…]), array([3,4,7,9,10,11,12,…]) ]

  • design_matrix (2-D array) – as calculated by the fucntion build_Design_Matrix

  • first_points (list of int) – for every proj-angle it indicates how shifted it is from the theoretical one

  • projs_len (list of int) – the expected number of slices that should be taken in account starting from list_of_points[i]

  • verbose (bool) – prints min(xi), max(xi), max(xi)-min(xi)+1, n_points, len(xi), len(p)

Returns

  • D (2-D array) – The design matrix ready to be given as as a input of a regression problem

  • final_proj (1-D array) – The projections ready to be given as an imput of a regression problem

Notes

The function includes a mixed cubic-linear-zero interpolation to fill in the missing values. This is necessary because if one instead omits the equations for the missing projection (as it would be intuitive) the regularized problem will have less constrained and the respective pixel will be set to zero or very low numbers. Input image given to Design Matrix function has to be symmetrical.

tomography.core.prepare_regression_symmetry_masked(projections: List[numpy.ndarray], xs: List[numpy.ndarray], design_matrix: numpy.ndarray, mask: numpy.ndarray, first_points: List[int], projs_len: List[int], verbose: bool = True) → Tuple[numpy.ndarray, numpy.ndarray]

Currently the best and faster algorythm for reconstruction explointing the simmetry of the input image.

Parameters
  • projections (list of 1-D array) – a list of projection. Projections are first selected so that the first value is the first reliable section and the last the last reliable section

  • xs (list of 1-D array) – Contains arrays indicating wich are indexes of the ‘projections’ input. projections are usually filtered and trimmed, so an ix array is kept to keep track of it. Its values[i] usually gets filtered and some samples are missing. e.g. [ array([20,21,22,24,25,27,30,…]), array([10,11,12,15,16,17,18,…]), array([3,4,7,9,10,11,12,…]) ]

  • design_matrix (2-D array) – as calculated by the fucntion build_Design_Matrix

  • mask (2-D boolean array:) – a boolean masked indicating which pixel to reconstruct

  • first_points (list of int) – for every proj-angle it indicates how shifted it is from the theoretical one

  • projs_len (list of int) – the expected number of slices that should be taken in account starting from list_of_points[i]

  • verbose (bool) – prints min(xi), max(xi), max(xi)-min(xi)+1, n_points, len(xi), len(p)

Returns

  • D (2-D array) – The design matrix ready to be given as as a input of a regression problem

  • final_proj (1-D array) – The projections ready to be given as an imput of a regression problem

Notes

The function includes a mixed cubic-linear-zero interpolation to fill in the missing values. This is necessary because if one instead omits the equations for the missing projection (as it would be intuitive) the regularized problem will have less constrained and the respective pixel will be set to zero or very low numbers. Input image given to Design Matrix function has to be symmetrical.

tomography.core.slicing_parameters(angle: float, extremity_points: Tuple[numpy.ndarray, numpy.ndarray]) → Tuple[float, Tuple[float, float]]

Calculates all the parameters necessary for slicing.

Parameters
  • angle (float) – angle in radiants

  • extremity_points (touple of 1-D array) – (first_point, last_point) as it is given from find_extremities()

Returns

  • m, qs (floats) – Parameters for the two tangent lines: y = m*x + qs[0] and y = m*x + qs[1]

  • TO DO (Cases here r // x and r // y)

tomography.core.sum_project(image: numpy.ndarray, angle: float, slice_width: float, notation_inverse: bool = True) → numpy.ndarray

Utility to get the projection of a greayscale image in a direction angle

Parameters
  • image (2-D array) – Grayscale image. Reppresenting a mask or a signal

  • angle (float) – In radiants, the angle of slicing

  • slice_width (float) – The width of a slice

  • notation_inverse (bool) – the angle is ment from the point of view of the image

Returns

  • projection (1-D array) – the sum of all the pixels

  • TODO (add suppport with precomputed projection matrix)

Optimization

class tomography.optimization.ReconstructorCVXPY(alpha: float = 1, beta: float = 0.01, config: tomography.defaults.ReconstructionConfig = <tomography.defaults.ReconstructionConfig object>, solver: str = 'SCS', solver_kwargs: dict = {})

Optimizer to solve constrained regularized least squares problem

Parameters
  • alpha (float, default 1.0) – Controls instensity of L1 regularization

  • beta (float, default 0.01) – controls intensity of the TV filter

  • config (ReconstructionConfig) – object containing the details of the geometry of the reconstruction

  • solver (cvxpy solver obejct, default cvxpy.SCS) – Can be changed but most performant in the task seem to be SCS

change_par(alpha: float = None, beta: float = None, A: numpy.array = None, b: numpy.array = None, b_n: float = None) → None

Change a parameter without reformulating the model.

fit(b: numpy.ndarray = None, A: numpy.ndarray = None, warm_start: bool = True) → Any

Fit method Defines the model and fit it to the data.

Parameters
  • b (np.array, dtype float) –

  • A (np.ndarray, dtype float) –

  • mask (np.ndarray, dtype int | bool) – mask_gray > 0.1

Returns

reconstructor – The object after fit. To get the data access the attribute ‘x’ otherwise call fit_predict

Return type

Recontructor

Note

b, A and mask are not required if the proble has been previously formulated (e.g. if self.warmstart = True)

fit_predict(b: numpy.ndarray = None, A: numpy.ndarray = None, warm_start: bool = True) → numpy.ndarray

Run the oprimization and return the reconstructed image. The same as self.fit but also returns the reshaped result.

optimize(b: numpy.ndarray, total_eval: int = 31, initial_grid_n: int = 4, acquisition_par: float = 2, max_time: int = None, domain: List[Dict] = None) → None

Optimize the alpha and beta parameter based on crossvalidation.

Parameters
  • b (np.array) – It should be already normalized (e.g. b/b.max())

  • total_eval (int) – total number the fucntion will be evaluated. In log2 space

  • initial_grid_n (int) – the number of parameters inside domain to be evaluated before starting the bayessian optimization

  • max_time (int) – max acceptable time to run the optimization, after this time optimization will be terminated NOTE: the use of this parameter is not reccomanded becouse it might result in output variability on different machines

  • domain (List of dict) – describing the domain to search, deafault is: [(-6, 1.8), (-4, 2.5)]

  • NOTE

  • this function the values of alpha and beta will be changed (Running) –

  • is inplemented using bayessian optimization. (This) –

class tomography.optimization.ReconstructorFastTest(alpha: float = 1, beta: float = 0.01, config: tomography.defaults.ReconstructionConfig = <tomography.defaults.ReconstructionConfig object>, solver: str = 'SCS', solver_kwargs: dict = {}, ground_truth: numpy.ndarray = None)
tomography.optimization.sum_nb_ll(b, mu, r)

compute negative binomial loglikelihood and gradient mu is b_pred = Acsc.dot(x) and b is y x

Cross Validation

tomography.crossvalidation.bool_from_interval(intervals_ixs: List[int], boundaries: numpy.ndarray, simmetry: bool = True) → numpy.ndarray

Given interval to include and boundaries returns an array that can be used for bool indexing.

Parameters
  • intervals_ixs (list) – A list of integers of which interval include, for example if intervals_ixs = [0,3] you want to include only data with ix so that boundaries[0] <= ix < boundaries[1] & boundaries[3] <= ix < boundaries[4]

  • boundaries (np.ndarray) – an array indicating the borders of the boundaries

  • simmetry (bool) – if True will adapt the result to the simmetery constrained problem

Returns

bool_filter – a boolean array that can be used for filtering

Return type

np.ndarray of bool

tomography.crossvalidation.corr_objective(y_test: numpy.ndarray, y_predicted: numpy.ndarray) → float

Return the negative correlation (to minimize)

tomography.crossvalidation.cross_validate(A: numpy.ndarray, b: numpy.ndarray, mask: numpy.ndarray, boundaries: numpy.ndarray, alpha_beta_grid: List[List[float]], score_f: Callable, reconstructor_class: Callable) → List[List[float]]

Slow but exhaustive crossvalidation by naive grid search and no optimization warmstart

Parameters
  • A (np.ndarray) – the design matrix (as returned by a variant of tomography.prepare_regression function)

  • b (np.ndarray) – the observation vector (as returned by a variant of tomography.prepare_regression function)

  • mask (np.ndarray) – grayscale mask

  • boundaries (np.ndarray) – array constaining the borders of the intervals of b corresponding to different projections (starting from 0)

  • alpha_beta_grid (List[List[float]]) – a list of list containing the alpha, beta values to try

  • score_f (Callable) – function taking two arguments (b_test, b_train) and returing the score to be calulated

  • reconstructor_class (class default(ReconstructorFast)) – should be either Reconstructor or ReconstructorFast Note: class not instance

Returns

all_scores – the result of calling score_f for every possible split for every element of the grid

Return type

List[List[float]]

tomography.crossvalidation.nb_loglik(y, mu, r)

Continuous Negative binomial loglikelihood function. Numerically stable implementation.

Parameters
  • y (float or np.ndarray) – The values to evaluate the loglikehood on

  • mu (float or np.ndarray) – The mean parameter of the negative binomial distribution, is y_predicted

  • psi (float or np.ndarray) – The psi parameter of the NB distribution It corresponds to (VAR[x] - E[x])/ E[x]**2 For a constant overdispersion r set it to r/mu

Returns

Return type

The Negative binomial LogLikelihood

Note

For more information on Continuous negative binomial likelihood function: - Robinson and Smyth, Biostatistics 2007 Stability to high/low input values has been tested manually but there are no theoretical guarantees

tomography.crossvalidation.poisson_log_lik_objective(y_test: numpy.ndarray, y_predicted: numpy.ndarray) → float

Minus Log likelihood $l(lambda;x)=sumlimits^n_{i=1}x_i ext{ log }lambda-nlambda$

tomography.crossvalidation.rmse_objective(y_test: numpy.ndarray, y_predicted: numpy.ndarray) → float

Return the residual sum of squares

tomography.crossvalidation.rss_objective(y_test: numpy.ndarray, y_predicted: numpy.ndarray) → float

Return the residual sum of squares

tomography.crossvalidation.split_list(lista: List, split_size: Tuple[int, int]) → Iterator[Sequence[Any]]

Split a list in two groups of defined size in all possible permutations of combinations

Parameters
  • lista (list) – list ot be split

  • split_size (Tuple[int, int]) – a tuple of two integers , their sum neews to be len(lista)

Returns

combinations – iterators of the possible splits for example ((1,2,3), (4,5)), ((1,2,4), (3,5)), …

Return type

Itarator[Tuple[List, List]]

Grid Search

tomography.gridsearch.equilibrate(a: numpy.ndarray, b: numpy.ndarray) → Tuple[numpy.ndarray, numpy.ndarray]

Expand the shorter of two array by zero padding

tomography.gridsearch.gridsearch_allignment(grid_angles: Iterable, grid_wids: Iterable, grid_zeroshifts: Iterable, expvalues: numpy.ndarray, xi: numpy.ndarray, reference_img: numpy.ndarray, mask: numpy.ndarray, objective: Callable = None) → pandas.core.frame.DataFrame

Perform a grid search of the optimal allignment exploring all the possible combinations of angles, widths and shifts. It uses a reference image and the expression values and compares them using a objective function

Parameters
  • grid_angles (Iterable) – Iterable of the angles to try. The angle of projection in radiant

  • grid_wids (Iterable) – Iterable of the widths to try. Width is the thikness of the strips in pixels

  • grid_zeroshifts (Iterable) – Terable of the zeroshifts to try

  • expvalues (np.ndarray) – The variable to allign

  • xi (np.ndarray) – The indexes of the variable to allign

  • reference_img (np.ndarray) – The reference image that is projected and alligned with `` expvalues

  • mask (np.ndarray) – The mask used build the design matrix

  • objective (Callable) – Function to minimize. Should take 3 arguments f(s,X,Y) where X is the measured variable Y is the reference after projecting reference_img

Returns

results – The results organized in a pandas Dataframe. with columns: “objective”, “angle”, “width”, “scaling”, “shift”

Return type

pd.Dataframe

tomography.gridsearch.mixed_interpolator(xi: numpy.ndarray, p: numpy.ndarray) → numpy.ndarray

Perform a mixed cubic/linear interporlation

Parameters
  • xi (np.ndarray) – the x position mesured/not-excluded

  • p (np.ndarray) – the detection level

Returns

interpolated_p – the interpoleted values of p

Return type

np.ndarray

tomography.gridsearch.objective_afterscaling(angle: float, width: float, experimental: numpy.ndarray, reference_img: numpy.ndarray, mask: numpy.ndarray, objective: Callable = None) → Tuple[float, float]

Find the optimal vertical scaling given the angle and the width. This solves the problem of different scaling between the reference mask (arbitrary) and the mesured expression levels (reads/UMI).

Parameters
  • angle (float) – The angle of projection in radiant

  • width (float) – The thikness of the strips in pixels

  • experimental (np.ndarray) – The mesured value for the variable

  • reference_img (np.ndarray 2d) – The reference image that prejected should give experimental

  • mask (np.ndarray) – The mask for the tissue area

  • objective (Callable default None) – objecting function that takes a scalar attribute if None: Residual Sum of Squares will be used.

Returns

  • obj_val (float) – The value of the objective function at the minimum

  • scaling (float) – The

tomography.gridsearch.zero_shifting(A: numpy.ndarray, n: int) → numpy.ndarray

Zero pad (from the left) or index A to make it fir a certain lenght

Defaults

class tomography.defaults.ReconstructionConfig(angles_names: List[str] = None, mask: numpy.ndarray = None, mask_thrs: float = 0.2, reference_mask: numpy.ndarray = None, symmetry: bool = False, masked_formulation: bool = True, angles_values: numpy.ndarray = None, first_points: List[int] = None, widths: List[float] = None)

Class to construct config files for Tomographer

Parameters
  • angle_names (List[str]) – list containing names of angles

  • mask (2D array) – contains mask of tissue

  • mask_thrs (Float, default 0.2) – threshold for reference image used to create mask

  • reference_mask (2D array) – reference image

  • symmetry (Boolean, default False) – value indicating whether design matrix should be built assuming symmetry in tissue

  • masked_formulation (Boolean, default True) – value indicating whether design matrix should be built using the mask

  • angles_values (Array) – array containing angles in radians

  • first_points (List[int]) – list specifying the starting point to begin considering values within projection for each angle

  • widths (List[float]) – list indicating the estimated widths of the secondary slices in each angle

angles_names
mask_g
mask_thrs
reference_mask
symmetry
masked_formulation
angles_values
first_points
widths
proj_len
A
to_file()

Gaussian Processes

tomography.gaussprocess.fit_heteroscedastic(y: numpy.ndarray, X: numpy.ndarray, verbose: bool = True) → sklearn.gaussian_process.gpr.GaussianProcessRegressor

Fit a Gaussian Process with RBF kernel and heteroscedastic noise level

Parameters
  • y (np.ndarray) – The target variable

  • X (np.ndarray) – The independent variables

Returns

gp_heteroscedastic – Model, already fit and ready to predict

Return type

GaussianProcessRegressor

Note

To get the average function and the predicted noise do: y_mean, y_std = gp_heteroscedastic.predict(X_new, return_std=True)

tomography.gaussprocess.normalization_factors(mu1: float, mu2: float, std1: float, std2: float, q: float = 0.15) → Tuple[float, float]

Given the average value of a gaussian process at its extremities it returns the normalization factors that the two sides needs to be multiplied to so that the probablity of the smaller extremity generating a realization bigger of the one of the bigger extremity is at least q (default 15%)

Parameters
  • mu1 (float) – estimate average of the first extremity point

  • mu2 (float) – estimate average of the second extremity point

  • std1 (float) – extimation of the standard deviation around the first extremity point

  • std2 (float) – extimation of the standard deviation around the second extremity point

Returns

  • factor1 (float) – number that should be mutliplied by the first extremity side of the function

  • factor2 (float) – number that should be mutliplied by the second extremity side of the function

tomography.gaussprocess.predict_gp_heteroscedastic(y: numpy.ndarray, X: numpy.ndarray, X_new: numpy.ndarray, verbose: bool = True) → Tuple[numpy.ndarray, numpy.ndarray]

Fits and predict an heteroscedastic-kernel gaussian process model, taking care of the normalizations

Parameters
  • y (np.ndarray) – the measurements y to fit by the gaussian process

  • X (np.ndarray) – The points at whic the measurments y are available

  • X_new (np.ndarray) – The points where to predict from the model

  • verbose (bool default True) – if you want to print info on the fit results

Returns

  • y_mean (np.ndarray) – The mean funtion of the gp at each point X_new

  • y_std (np.ndarray) – The deviation from the mean at every point X_new

tomography.gaussprocess.use_gp_to_adjust_plates_diff(data: Dict[str, pandas.core.frame.DataFrame], q: float = 0.15) → Tuple[Dict[str, Any], Dict[str, Any]]

Fits a GP to the left and right plate and corrects for different depth. It does this by matching the extremities in a conservative way uses a statistical condition.

Parameters
  • data (Dict[str, pd.DataFrame]) – Dictionary on anfles datasets

  • q (float) – Value used to decide how much to adjust the plate difference The probability that a lower margin rv is bigger than the higher margin rv 0 -> no adjustment 0.5 -> exact matching of mean functions

Returns

  • coordinates_dict (Tuple[Dict[str, Any], Dict[str, Any]]) – To pass to the plotting fucntion. It contains the values x, y, X_news, y_means, y_stds, f Where each of them is a list of two elements containin the arrays corresponding to each of the plates

  • normalizations (Dict[str, Tuple[float, float]]) – keys are angles and values are tuple containing the normalizations factors that should be multiplied with that two plates respectivelly

tomography.gaussprocess.use_gp_to_adjust_spikes_diff(spikes: Dict[str, pandas.core.frame.DataFrame], q: float = 0.15) → Tuple[Dict[str, Any], Dict[str, Any]]

Fits a GP to the left and right plate spikes and corrects to the avareage

Parameters
  • spikes (Dict[str, pd.DataFrame]) – Dictionary on anfles datasets

  • q (float) – Value used to decide how much to adjust the plate difference The probability that a lower margin rv is bigger than the higher margin rv 0 -> no adjustment 0.5 -> exact matching of mean functions

Returns

  • coordinates_dict (Tuple[Dict[str, Any], Dict[str, Any]]) – To pass to the plotting fucntion. It contains the values x, y, X_news, y_means, y_stds, adj Where each of them is a list of two elements containin the arrays corresponding to each of the plates

  • adjustments (Dict[str, Tuple[float, float]]) – the number that needs to be multiplied by each point to obtain the normalization

Utils

tomography.utils.IPT_2d(mask: numpy.ndarray, X: numpy.ndarray, Y: numpy.ndarray, numIter: int = 100) → numpy.ndarray

Iterative proportional fitting used by Junker et al.

tomography.utils.IPT_angular(angular_projs: List[numpy.ndarray], D: numpy.ndarray, proj_len: List[int], sigma: float = None, numIter: int = 100) → numpy.ndarray

Iterative proportional fitting adapted for angular cutting

tomography.utils.IPT_junker(mask: numpy.ndarray, X: numpy.ndarray, Y: numpy.ndarray, Z: numpy.ndarray, numIter: int = 100) → numpy.ndarray

Iterative proportional fitting used by Junker et al.

tomography.utils.colorize(image: numpy.ndarray, hue: float, saturation: float = 1) → numpy.ndarray

Add color of the given hue to an RGB image.

By default, set the saturation to 1 so that the colors pop!

Parameters
  • image (np.ndarray) – origina image in RGB

  • hue (float) – the hue we want to set

  • saturation (float) – the saturation we want to set

Returns

the new image after the transformation

Return type

new_image

tomography.utils.get_plate(df: pandas.core.frame.DataFrame) → numpy.ndarray

Extract for each well the 96 well plate it comes from

Parameters

df (pd.Dataframe) – Pandas dataframe that contains as columns formatted as follow angle{degrees}_{plate_num}_{well_num}_x{pos}

Returns

plate_num – the plate id of each sample

Return type

np.ndarray

tomography.utils.get_x(df: pandas.core.frame.DataFrame) → numpy.ndarray
Parameters

df (pd.Dataframe) – Pandas dataframe that contains as columns formatted as follow angle{degrees}_{plate_num}_{well_num}_x{pos}

Returns

pos – the position on the projection axis ( orthogonal to the cutting angle) where the orientation of the axis reflects the order the strips have been cut

Return type

np.ndarray

tomography.utils.mixed_interpolator2(xi: numpy.ndarray, p: numpy.ndarray) → Tuple[numpy.ndarray, numpy.ndarray, numpy.ndarray]

Perform a mixed cubic/linear interporlation

Parameters
  • xi (np.ndarray) – the x position mesured/not-excluded

  • p (np.ndarray) – the detection level

Returns

  • interpolated_p (np.ndarray) – the interpoleted values of p

  • not_provided (np.ndarray) – boolean array with the values that have been fill markerd as True

tomography.utils.normalize_AUC(data: Dict[str, pandas.core.frame.DataFrame], return_scaling_factors: bool = False) → Union[Tuple, Dict]

Normalize the datasets so that the area under the curve is the same in all projections and equal to their average.

Parameters

data (Dict[str, pd.Dataframe]) – A dictionary containing the projections datasets to normalize. Every dataframe is assumed to be not filterd (e.g. containing all the genes) and of shape (NGenes,NProjections)

Returns

  • data (Dict[str, pd.Dataframe]) – the dictionary containing the different projections normalized

  • sclaing_factors (optional) – if return_scaling_factors is True it returns the scaling factors that have been used for the normalization

Notes

It uses sympson integration instead of just summing to avoid error because of the removed bad samples. This assumes that dx is the same, even if later this will be slightly changed to best fit the projections.

Visualize

tomography.visualize.plot_plate_adjustment(coordinates_dict: Dict[str, Tuple]) → None

Plot the results of plate normalization using gaussian processes

Parameters

coordinates_dict (Dict[Tuple]) – keys should be the angles the values tuple x, y, X_news, y_means, y_stds, f where each of them is a list of two elements containin the arrays corresponding to each of the plates

tomography.visualize.plot_raw_data_sum(data: Dict[str, pandas.core.frame.DataFrame], spikes: Dict[str, pandas.core.frame.DataFrame]) → None

Plot a first diagnostic graf of the total molecules detected per projection, including spikeins

tomography.visualize.plot_spikes_adjustment(coordinates_dict: Dict[str, Tuple]) → None

Plot the results of spikes normalization using gaussian processes

Parameters

coordinates_dict (Dict[Tuple]) – keys should be the angles the values tuple x, y, X_news, y_means, y_stds, f where each of them is a list of two elements containin the arrays corresponding to each of the plates