Getting Started: A Calibration Example
This tutorial walks through a typical Calibrie workflow using YAML configuration files and the command-line runner. We'll calibrate a sample experiment involving multiple fluorescent proteins.
Prerequisites
- Install Calibrie: Follow the instructions on the Overview page.
- Example Files: You'll need experiment data (FCS or CSV files for samples and controls), bead data (FCS), and configuration files. We'll refer to the example structure provided:
    calibrie/
    ├── CalibrationTemplates/
    │   └── WeissFortessa/
    │       ├── 4color_controls.yaml
    │       ├── beads_urcp-100-2H.yaml
    │       ├── ebfp2_mapping.yaml
    │       └── simple_linear.yaml
    ├── experiments/
    │   └── myxp/
    │       ├── calibration/
    │       │   ├── gating/
    │       │   │   └── gating_task.yaml
    │       │   └── regular.yaml
    │       ├── data/ # <= IMPORTANT: Assumed location for data files
    │       │   ├── raw_data/
    │       │   │   ├── 2023-10-01_Cascades_CCv4_1.fcs # Sample
    │       │   │   ├── ... (other samples) ...
    │       │   │   ├── 2023-10-01_Cascades_CCv4_21.fcs # 1xirfp720 Control
    │       │   │   ├── ... (other controls) ...
    │       │   │   ├── 2023-10-01_Cascades_CCv4_27.fcs # CNTL Control
    │       │   │   └── 2023-10-01_Cascades_CCv4_beads.fcs # Beads
    │       └── experiment.json5
    ├── calibrie/
    ├── ... (other project files) ...

Note: You'll need to create the data/raw_data/ directory and place your FCS files there, matching the names in experiment.json5.
1. Experiment Metadata (experiment.json5)
Calibrie uses a JSON5 file (JSON with comments and trailing commas) to store metadata about the experiment and links to the raw data files.
File: experiments/myxp/experiment.json5
Key Sections:
- name: A descriptive name for the experiment.
- samples: A list of all runs in the experiment. Each entry has:
    - name: Unique name for the sample/run.
    - recipe: (Optional) Experimental condition identifier.
    - control: 1 if this is a control file (single color, blank, all), 0 otherwise.
    - file: The relative path (from the datapath specified in the run command, default is ./data/raw_data/) to the FCS or CSV file.
    - notes: (Optional) Any relevant notes.
- beads_file: The relative path to the bead calibration file.
- Other fields: Metadata like dates, operators, machine, cell line, etc. (Currently not used directly by the core Calibrie tasks, but good practice for record-keeping.)
This file tells the calibrie-run script which files are samples and which are controls, essential for tasks like LoadControls.
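To make the role of the control flag concrete, here is a minimal Python sketch of how a runner could split the samples list into controls and regular samples and resolve each file against the datapath. The metadata dict mirrors the fields described above, but the run names, the dict literal, and the split_samples helper are hypothetical; Calibrie's own loader may work differently.

```python
from pathlib import Path

# Hypothetical metadata mirroring the fields described above
# (in practice this would be parsed from experiment.json5).
experiment = {
    "name": "myxp",
    "samples": [
        {"name": "sample_1", "control": 0, "file": "2023-10-01_Cascades_CCv4_1.fcs"},
        {"name": "1xirfp720", "control": 1, "file": "2023-10-01_Cascades_CCv4_21.fcs"},
        {"name": "CNTL", "control": 1, "file": "2023-10-01_Cascades_CCv4_27.fcs"},
    ],
    "beads_file": "2023-10-01_Cascades_CCv4_beads.fcs",
}


def split_samples(xp, datapath="./data/raw_data/"):
    """Return (controls, samples) as dicts mapping run name -> resolved path."""
    base = Path(datapath)
    controls, samples = {}, {}
    for entry in xp["samples"]:
        # control == 1 marks a control run; anything else is a regular sample.
        target = controls if entry.get("control", 0) == 1 else samples
        target[entry["name"]] = base / entry["file"]
    return controls, samples


datapath = "experiments/myxp/data/raw_data/"
controls, samples = split_samples(experiment, datapath)
bead_file = Path(datapath) / experiment["beads_file"]

print("controls:", sorted(controls))
print("samples:", sorted(samples))
print("beads:", bead_file.as_posix())
```

The two dictionaries play the same role as the control and sample file lists the runner derives from the metadata: controls feed the initialization of the tasks, while samples are the files that get calibrated.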
2. Pipeline Definition (regular.yaml)
The pipeline defines the sequence of analysis steps (Tasks). Calibrie uses YAML files, often leveraging the dracon library for features like includes (*file:) and variable substitution (${...}).
File: experiments/myxp/calibration/regular.yaml
Explanation:
- name: FINAL: Sets a name used in the output directory structure.
- tasks:: Defines the dictionary of tasks to be executed. The keys (gating, controls, lincomp, etc.) are arbitrary names for the tasks within this pipeline.
- *file:$DIR/gating/gating_task.yaml: Includes the gating definition from another file, located under the pipeline file's own directory ($DIR). The *file: syntax is specific to dracon.
- *file:$CALIBRATION_TEMPLATE_DIR/...: Includes predefined task configurations from the CalibrationTemplates directory. $CALIBRATION_TEMPLATE_DIR is likely a variable set by the environment or dracon configuration. This promotes reuse of standard task settings.
- Task Order: The priority field within each included task file (e.g., in 4color_controls.yaml) determines the execution order (lower priority values run first). The sequence here is roughly:
    - gating (Priority 0.0): Filters cells based on scatter properties.
    - beads (Priority 5): Pre-calculates bead information needed for MEF conversion, but doesn't apply it yet (apply_on_load: false).
    - controls (Priority 10): Loads control files specified in experiment.json5, calculates metrics.
    - lincomp (Priority 20): Performs linear compensation based on controls.
    - protmap (Priority 40): Maps protein abundances to the reference protein's scale (EBFP2 in this case).
    - export (Priority 1000): Exports the final calibrated data (abundances_MEF, produced implicitly by ProteinMapping when apply_mef_transform: true and the beads task has run).
- Overriding: You can override parameters from included files directly in this main file (demonstrated by the commented-out lincomp override example).
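The priority rule can be illustrated with a few lines of Python. This is only a sketch of the ordering logic: the priority values are the ones listed above, but the dictionary layout and the execution_order helper are assumptions made for illustration, not Calibrie's internal scheduler.

```python
# Priorities as listed above, deliberately given out of order.
# The dict layout is an assumption for illustration only.
task_priorities = {
    "export": 1000,
    "protmap": 40,
    "gating": 0.0,
    "lincomp": 20,
    "controls": 10,
    "beads": 5,
}


def execution_order(priorities):
    """Return task names sorted so that lower priority values run first."""
    return sorted(priorities, key=priorities.get)


order = execution_order(task_priorities)
print(" -> ".join(order))
```

Because the order comes from the priority values rather than from the position of the keys in the tasks dictionary, tasks can be listed in any order in the YAML file.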
3. Running the Pipeline
Execute the pipeline from the project root directory using the calibrie-run script:
calibrie-run --pipeline experiments/myxp/calibration/regular.yaml --xpfile experiments/myxp/experiment.json5
Command Breakdown:
- calibrie-run: The command-line script installed with the package.
- --pipeline experiments/myxp/calibration/regular.yaml: Specifies the main pipeline definition file.
- --xpfile experiments/myxp/experiment.json5: Specifies the experiment metadata file. calibrie-run will parse this and make $CONTROL_FILES (a dictionary mapping control names like 'EBFP2' to their full paths) and $BEAD_FILE (the full path to the beads file) available for substitution within the YAML pipeline definition.
- --datapath experiments/myxp/data/raw_data/ (Optional): Specifies the base directory for relative file paths in experiment.json5. If omitted, it defaults to ./data/raw_data/.
- --outputdir output/myxp_calibrated (Optional): Specifies where to save results. Defaults to a directory based on the experiment and pipeline names.
- --diagnostics (Optional, default: True): Generate diagnostic plots. Use --no-diagnostics to disable.
- --diagnostics_output_dir output/myxp_diagnostics (Optional): Specify where to save diagnostic plots. Defaults to a subdirectory within the main output directory.
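If you launch runs from a script rather than typing the command by hand, the options above can be assembled programmatically. The following Python sketch only builds and prints the argument list; the build_calibrie_command helper is hypothetical, and the flags are limited to those described in this section.

```python
import shlex


def build_calibrie_command(pipeline, xpfile, datapath=None, outputdir=None,
                           diagnostics=True):
    """Assemble a calibrie-run argument list from the options described above."""
    argv = ["calibrie-run", "--pipeline", pipeline, "--xpfile", xpfile]
    if datapath is not None:
        argv += ["--datapath", datapath]
    if outputdir is not None:
        argv += ["--outputdir", outputdir]
    if not diagnostics:
        # Diagnostics are on by default, so only the opt-out flag is added.
        argv.append("--no-diagnostics")
    return argv


argv = build_calibrie_command(
    pipeline="experiments/myxp/calibration/regular.yaml",
    xpfile="experiments/myxp/experiment.json5",
    datapath="experiments/myxp/data/raw_data/",
    outputdir="output/myxp_calibrated",
)
print(shlex.join(argv))
```

Assuming calibrie-run is installed and on your PATH, the resulting list can be passed to subprocess.run(argv, check=True) to execute the pipeline.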
4. Expected Output
After running, Calibrie will:
- Initialize Tasks: Load controls, calculate spillover matrix, fit mapping functions, process beads, etc.
- Process Samples: For each non-control file listed in experiment.json5:
    - Load the data.
    - Apply gating.
    - Apply linear compensation.
    - Apply protein mapping.
    - Apply MEF calibration (implicitly done by ProteinMapping using the info from the beads task).
    - Export the final abundances_MEF data to a file (e.g., Parquet or CSV) in the specified output directory.
- Generate Diagnostics: If enabled, create plots for each task (spillover matrix, mapping functions, bead calibration plots, etc.) and save them, typically as a combined PDF or individual files in the diagnostics output directory.
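For readers unfamiliar with the linear compensation step, the idea can be shown on a toy two-channel case: single-color controls yield a spillover matrix, and each sample is unmixed by multiplying with its inverse. The sketch below uses plain Python and made-up numbers; the spillover values and helper names are hypothetical, and Calibrie's actual estimator and data layout may differ.

```python
# Hypothetical spillover matrix: row i describes how one unit of protein i
# is distributed across the two detector channels.
spillover = [[1.0, 0.2],
             [0.1, 1.0]]


def invert_2x2(m):
    """Invert a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("spillover matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


def apply_matrix(rows, m):
    """Right-multiply each 2-element row by the 2x2 matrix m."""
    return [[r[0] * m[0][j] + r[1] * m[1][j] for j in range(2)] for r in rows]


# Made-up per-cell protein abundances and the signals a detector would record.
true_abundance = [[100.0, 0.0], [0.0, 50.0], [40.0, 60.0]]
observed = apply_matrix(true_abundance, spillover)

# Compensation: undo the mixing by applying the inverse spillover matrix.
compensated = apply_matrix(observed, invert_2x2(spillover))

for obs, comp in zip(observed, compensated):
    print(obs, "->", [round(v, 6) for v in comp])
```

In this noise-free toy case the compensated values recover the original abundances exactly (up to floating-point error); with real data the spillover matrix is estimated from the controls and the result is only approximate.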
You now have calibrated fluorescence data in MEF units (specifically, "MEF units relative to EBFP2 on the Pacific Blue channel" in this example) ready for downstream analysis!