Batch processing (the AGExperiment object)

The main point of AliGater is to batch process many samples using the same strategy. This is mainly orchestrated through the AGExperiment object.

An AGExperiment has some settings that merit explanation.

[1]:

import aligater as ag

AliGater started in Jupyter mode

At its core, an AGExperiment takes either a folder path or a list of complete file paths. The two ways of initializing an AGExperiment shown below are equivalent.

[2]:

#Initializing with a folder path
exp=ag.AGExperiment(ag.AGConfig.ag_home+"/tutorial/data")

No experiment name specified, generated name: AGexperiment_2022-12-05_14_28_20.051722
Collected 3 files, 0 files did not pass filter(s) and mask(s).

[3]:

#Initializing with a file list
sample_list=[ag.AGConfig.ag_home+"/tutorial/data/Compensated.fcs",
             ag.AGConfig.ag_home+"/tutorial/data/Uncompensated.fcs",
             ag.AGConfig.ag_home+"/tutorial/data/example1.fcs"]
exp=ag.AGExperiment(sample_list)

No experiment name specified, generated name: AGexperiment_2022-12-05_14_28_20.075521
Experiment initialised with file list. Checking entries...
All file paths exists.
Collected 3 files, 0 files did not pass filter(s) and mask(s).
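
When the files of interest sit in one folder, the file list can also be built programmatically instead of being typed out. The sketch below uses only the standard library. `list_fcs_files` is a hypothetical helper and not part of AliGater, and it assumes the samples use the `.fcs` extension.

```python
import os

def list_fcs_files(folder):
    """Return the sorted full paths of all .fcs files directly inside folder.

    Hypothetical helper for illustration, not an AliGater function.
    """
    return sorted(
        os.path.join(folder, name)
        for name in os.listdir(folder)
        if name.endswith(".fcs")
    )

# The resulting list could then be passed on as in cell [3], for example:
# exp = ag.AGExperiment(list_fcs_files(ag.AGConfig.ag_home + "/tutorial/data"))
```

Building the list yourself gives full control over which files enter the experiment, at the cost of bypassing the folder-based collection shown in cell [2].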

Naming an experiment

To organise output from the experiment, it’s useful to name the experiment, which is done through the experiment_name option. Output will be placed in a folder with this name inside the path defined in AGConfig.ag_out.

[4]:

#Initializing with a folder path
exp=ag.AGExperiment(ag.AGConfig.ag_home+"/tutorial/data", experiment_name="tutorial")

Collected 3 files, 0 files did not pass filter(s) and mask(s).

If such a folder already exists, AliGater will print a warning. AliGater writes its output to this folder, and any existing content with the same name is overwritten without confirmation.
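
Because existing content is overwritten without confirmation, it can be worth checking the output folder before starting a run. The following is a minimal sketch using only the standard library. `output_folder_in_use` is a hypothetical helper and not part of AliGater, and it assumes the output location is simply the output root joined with the experiment name, as described above.

```python
import os

def output_folder_in_use(out_root, experiment_name):
    """Return True if out_root/experiment_name exists and already contains files.

    Hypothetical helper for illustration, not an AliGater function.
    """
    target = os.path.join(out_root, experiment_name)
    return os.path.isdir(target) and len(os.listdir(target)) > 0

# Example use (assumes AGConfig.ag_out holds the output root):
# if output_folder_in_use(ag.AGConfig.ag_out, "tutorial"):
#     print("Output folder already has content, consider another experiment_name")
```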

Filters & masks

When collecting files by specifying a folder, it can be useful to apply filters to guide the selection, for example to separate case or sample files from controls. This is done by supplying two lists. Note that filters are case sensitive:

filters - A list-like containing strings. If any part of the file path is matched by one or more of the filters, the file is collected.

mask - A list-like containing strings. If any part of the file path is matched by one or more of the masks, the file is discarded.

[5]:

#Single filter 'compensated'
exp=ag.AGExperiment(ag.AGConfig.ag_home+"/tutorial/data",
                    filters=['compensated'],
                    experiment_name="tutorial")

1 filter(s) defined
Collected 1 files, 2 files did not pass filter(s) and mask(s).

[6]:

#Single filter 'ompensated' yields another file
exp=ag.AGExperiment(ag.AGConfig.ag_home+"/tutorial/data",
                    filters=['ompensated'],
                    experiment_name="tutorial")

1 filter(s) defined
Collected 2 files, 1 files did not pass filter(s) and mask(s).

[7]:

#Adding a mask will reduce the number of collected files again
exp=ag.AGExperiment(ag.AGConfig.ag_home+"/tutorial/data",
                    filters=['ompensated'],
                    mask=['Uncompensated'],
                    experiment_name="tutorial")

1 filter(s) defined
1 mask(s) defined
Collected 1 files, 2 files did not pass filter(s) and mask(s).
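
The selection rule described above can be mimicked in plain Python to see why each call collects the files it does. This is an illustrative re-implementation based on the description in this section, not AliGater's own code. `select_files` is a hypothetical name, and treating an empty filter list as "collect everything" is an assumption based on the unfiltered examples earlier.

```python
def select_files(paths, filters=None, mask=None):
    """Illustrative version of the filter/mask rule, not AliGater's implementation."""
    selected = []
    for path in paths:
        # Keep the path if no filters are given, or if any filter is a substring of it
        passes_filter = not filters or any(f in path for f in filters)
        # Discard the path if any mask is a substring of it
        is_masked = bool(mask) and any(m in path for m in mask)
        if passes_filter and not is_masked:
            selected.append(path)
    return selected

paths = ["tutorial/data/Compensated.fcs",
         "tutorial/data/Uncompensated.fcs",
         "tutorial/data/example1.fcs"]

# Case-sensitive substring matching: 'compensated' does not match 'Compensated.fcs'
print(select_files(paths, filters=["compensated"]))
# 'ompensated' matches both Compensated.fcs and Uncompensated.fcs
print(select_files(paths, filters=["ompensated"]))
# The mask then removes Uncompensated.fcs again
print(select_files(paths, filters=["ompensated"], mask=["Uncompensated"]))
```

The three calls select one, two and one file respectively, matching the counts reported in cells [5] to [7].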

Useful flags & options

There are several additional flags and options that can be supplied to the AGExperiment; see the function’s documentation for details.

Running batch analysis

After developing a pattern recognition strategy, it’s recommended to put it in its own Python script. You can then import these gating functions in a batch processing script. We’ll do the following:

set up our aligater experiment object
import the gating strategy from a separate python script
batch analyse all samples collected in the experiment object with the imported strategy
output our results to file

[8]:

exp=ag.AGExperiment(ag.AGConfig.ag_home+"/tutorial/data",
                    filters=['example'],
                    mask=['Uncompensated'],
                    experiment_name="tutorial",
                    flourochrome_area_filter=True)

1 filter(s) defined
1 mask(s) defined
Collected 1 files, 2 files did not pass filter(s) and mask(s).

[9]:

from example_strategy import example_gating_strategy

[10]:

exp.apply(example_gating_strategy, n_processes=2)

Loading sample 0 to 1
Opening file example1 from folder /tutorial/data
Applying strategy to sample 0 to 1
Sample gating done
Complete, no samples had populations with invalid flags

[11]:

# Results can then be written to file through printExperiment
exp.printExperiment(ag.AGConfig.ag_home+"/out/example_output.txt")

[12]:

# We can take a peek at the results manually by inspecting the resultMatrix member of the experiment object
exp.resultMatrix

[12]:

[['tutorial/data/example1',
  427330.0,
  0.9905449353166021,
  422823.0,
  0.989453115858938,
  276069.0,
  0.6529185971434855,
  146754.0,
  0.34708140285651445,
  139989.0,
  0.5070797518011801,
  122765.0,
  0.44468955224961876,
  5705.0,
  0.02066512357417892,
  7610.0,
  0.027565572375022187]]
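
Judging from the output above, each row of the result matrix starts with a sample identifier followed by numeric values. For a quick look, the rows can be printed in a more compact form. The sketch below works on any nested list of that shape. `format_result_rows` is a hypothetical helper and not part of AliGater, and the example input is only the first few entries of the row shown above.

```python
def format_result_rows(result_matrix, ndigits=4):
    """Return one tab-separated line per sample, rounding the numeric entries.

    Hypothetical helper for illustration, not an AliGater function.
    """
    lines = []
    for row in result_matrix:
        sample, values = row[0], row[1:]
        lines.append("\t".join([sample] + [str(round(v, ndigits)) for v in values]))
    return lines

# First entries of the row shown above, used as example input
rows = [["tutorial/data/example1",
         427330.0, 0.9905449353166021,
         422823.0, 0.989453115858938]]

for line in format_result_rows(rows):
    print(line)
```

With an experiment object at hand, the same helper could be called as `format_result_rows(exp.resultMatrix)`.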