Concepts: Linear Compensation (Unmixing)
Fluorescent proteins used in synthetic biology often have broad emission spectra. This means the light emitted by one protein (e.g., EYFP) might be detected not only in its primary channel (e.g., FITC-A) but also spill over into channels intended for other proteins (e.g., PE-A). This phenomenon is called spectral overlap or bleed-through.
Linear compensation (often called unmixing) is the process of computationally correcting for this overlap to estimate the true signal originating from each individual fluorophore.
The Linear Model
The core assumption of linear compensation is that the total signal measured in a particular detector channel (Yi) is a linear combination of:
- Autofluorescence (Ai): Background signal from the cell itself in that channel, independent of the expressed fluorescent proteins.
- Primary Signal (fi): The signal from the fluorophore primarily intended for that channel.
- Bleed-through: Contributions from all other fluorophores (fj) spilling into channel i.
This is typically expressed mathematically as:
Yi = Ai + S_ii * fi + Σ(S_ij * fj) for j ≠ i
Where:
- Yi: Measured fluorescence intensity in channel i.
- Ai: Autofluorescence component in channel i.
- fi: True, unobserved fluorescence intensity originating only from the protein primarily associated with channel i.
- fj: True, unobserved fluorescence intensity originating only from the protein primarily associated with channel j.
- S_ij: The spillover coefficient, representing the fraction of signal from protein j that is detected in channel i. S_ii is typically normalized to 1 (or represents the primary signal efficiency).
We usually use a matrix representation for this model, where:
- Y is the vector of observed fluorescence intensities across all channels.
- A is the vector of autofluorescence intensities across all channels.
- S is the spillover matrix, where S_ij is the contribution of protein j's signal to channel i's measurement.
- X is the vector of true underlying fluorescence signals (the quantities of each protein, i.e., the fi of the per-channel equation).
Which gives us the equation:
Y = S * X + A
The goal of unmixing is to estimate X (the quantities of each protein) from Y (the observations in each channel) and from estimates of A and S:
X = S⁻¹ * (Y - A)
Where S⁻¹ is the inverse of the spillover matrix, which exists when S is square (one channel per protein) and non-singular. This equation allows us to estimate the true abundances of each protein in the sample.
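As an illustration of the two equations above, here is a minimal NumPy sketch with two channels and two proteins. All numbers are made up for the example, and the variable names are not part of calibrie. It builds observations with the forward model and then recovers the abundances, solving the linear system directly rather than forming S⁻¹ explicitly.

```python
import numpy as np

# Illustrative values only (assumptions, not measured data).
# S[i, j]: fraction of protein j's signal detected in channel i.
# Rows: channels (e.g., FITC-A, PE-A); columns: proteins (e.g., EYFP, a second reporter).
S = np.array([[1.0, 0.05],
              [0.2, 1.0]])
A = np.array([100.0, 50.0])  # autofluorescence per channel

# True abundances for three cells, one row per cell, one column per protein.
X_true = np.array([[1000.0, 0.0],
                   [0.0, 2000.0],
                   [500.0, 500.0]])

# Forward model applied to every cell (row): Y = S X + A.
Y = X_true @ S.T + A

# Unmixing: solve S x = (y - A) for every cell.
X_est = np.linalg.solve(S, (Y - A).T).T

print(Y[0])   # first cell expresses only protein 0, yet channel 1 reads above background
print(X_est)  # recovered abundances
```

The first cell contains only the first protein, but its second channel still reads 250 (50 of autofluorescence plus 20% bleed-through of 1000). After unmixing, the second abundance for that cell returns to zero.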
How LinearCompensation Works
The calibrie.LinearCompensation task implements this process:
- Requires Context: It needs controls_values, controls_masks, channel_names, protein_names, and reference_channels (usually from LoadControls). It also benefits from channel quality metrics (channel_signal_to_noise, channel_specificities) if available.
- Estimate Autofluorescence (A): It calculates the median signal in each channel from the 'blank' control samples.
- Estimate Spillover Matrix (S): This is the crucial step. It uses the single-color controls. For each control (e.g., cells expressing only EYFP):
  - It subtracts the estimated autofluorescence (A) from the measurements.
  - It assumes that the signal in the reference channel for that protein (e.g., FITC-A for EYFP, defined in LoadControls) is proportional to the true amount of that protein (X_EYFP).
  - It calculates how much signal appears in other channels (e.g., PE-A) relative to the signal in the reference channel. This ratio gives the spillover coefficient (e.g., S_EYFP_to_PE-A).
  - This is done using least-squares regression. The default mode='weighted' option uses channel quality metrics (SNR, specificity) provided by LoadControls to give more importance to reliable channels when calculating these coefficients, making the spillover estimation more robust.
- Produces Context:
  - autofluorescence: The estimated autofluorescence vector A.
  - spillover_matrix: The calculated spillover matrix S.
  - channel_weights: The weights used if mode='weighted'.
  - controls_abundances_AU: If unmix_controls=True, the control data itself, unmixed.
  - F_mat: A matrix representation suitable for potential downstream non-linear steps (though not typically used if only linear compensation is performed).
- Processing Samples: In its process method, it takes raw sample observations (observations_raw), subtracts the stored autofluorescence, and solves the linear system (X = S⁻¹ * Y', where Y' = Y - A) using the stored spillover_matrix to produce abundances_AU (abundances in Arbitrary Units).
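The estimation steps above can be sketched in plain NumPy on simulated controls. This is not calibrie's implementation: it is an unweighted simplification (no SNR or specificity weights), it does not use the calibrie API, and every name in it (simulate, reference_channel, S_true, and so on) is hypothetical. It assumes a median over the blank control for A and, for each single-color control, a least-squares slope through the origin of each channel against that protein's reference channel.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical ground truth, used only to simulate control samples.
S_true = np.array([[1.0, 0.05],
                   [0.2, 1.0]])    # S[i, j]: protein j -> channel i
A_true = np.array([100.0, 50.0])   # autofluorescence per channel
reference_channel = [0, 1]         # reference channel index for each protein

def simulate(X, noise=5.0):
    """Simulate observations (n_cells, n_channels) from abundances X."""
    return X @ S_true.T + A_true + rng.normal(0.0, noise, size=(X.shape[0], 2))

n = 2000
blank = simulate(np.zeros((n, 2)))  # cells expressing no reporter
single_color = []
for j in range(2):
    X = np.zeros((n, 2))
    X[:, j] = rng.uniform(100.0, 10000.0, size=n)  # only protein j expressed
    single_color.append(simulate(X))

# Step 1: autofluorescence as the per-channel median of the blank control.
A_est = np.median(blank, axis=0)

# Step 2: one spillover column per single-color control.
S_est = np.zeros((2, 2))
for j, obs in enumerate(single_color):
    corrected = obs - A_est
    ref = corrected[:, reference_channel[j]]
    # Least-squares slope through the origin: channel ~ slope * ref.
    # The reference channel regressed on itself gives exactly 1.
    S_est[:, j] = corrected.T @ ref / (ref @ ref)

# Step 3: unmix a sample with the stored estimates.
X_sample = np.array([[3000.0, 1000.0]])
Y_sample = simulate(X_sample, noise=0.0)
X_unmixed = np.linalg.solve(S_est, (Y_sample - A_est).T).T

print(np.round(S_est, 3))
print(np.round(X_unmixed, 1))
```

Because each column is normalized by its own reference channel, the diagonal of the estimated matrix is 1 by construction, matching the convention that S_ii is normalized to 1. A weighted variant would scale each channel's contribution by a quality weight, but how those weights enter the fit is not specified here.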
Linear compensation is a fundamental step to disentangle mixed fluorescence signals, providing a more accurate estimate of the contribution of each individual fluorescent reporter before attempting any further normalization or calibration. The main end result is a set of abundances in Arbitrary Units (AU) for each protein in the sample, which can be used for downstream analysis or visualization.