{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Batch processing (the AGExperiment object)\n",
    "The main point of AliGater is to batch proccess many samples using the same strategy. This is mainly orchestrated through the AGExperiment object.\n",
    "\n",
    "An AGExperiment has some settings that merits some explanation"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "AliGater started in Jupyter mode\n"
     ]
    }
   ],
   "source": [
    "import aligater as ag"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "At its core an AGExperiment takes a _path_ or a list of complete filepaths. The two below ways of initializing an AGExperiment are equal."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "No experiment name specified, generated name: AGexperiment_2022-12-05_14_28_20.051722\n",
      "Collected 3 files, 0 files did not pass filter(s) and mask(s).\n"
     ]
    }
   ],
   "source": [
    "#Initializing with a folder path\n",
    "exp=ag.AGExperiment(ag.AGConfig.ag_home+\"/tutorial/data\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "No experiment name specified, generated name: AGexperiment_2022-12-05_14_28_20.075521\n",
      "Experiment initialised with file list. Checking entries...\n",
      "All file paths exists.\n",
      "Collected 3 files, 0 files did not pass filter(s) and mask(s).\n"
     ]
    }
   ],
   "source": [
    "#Initializing with a file list\n",
    "sample_list=[ag.AGConfig.ag_home+\"/tutorial/data/Compensated.fcs\",\n",
    "             ag.AGConfig.ag_home+\"/tutorial/data/Uncompensated.fcs\",\n",
    "             ag.AGConfig.ag_home+\"/tutorial/data/example1.fcs\"]\n",
    "exp=ag.AGExperiment(sample_list)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Naming an experiment**\n",
    "\n",
    "To organise output from the experiment it's useful to name Experiment, which is done through the experiment_name option. Output will be placed in a folder with this name inside the path defined in AGConfig.ag_out.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Collected 3 files, 0 files did not pass filter(s) and mask(s).\n"
     ]
    }
   ],
   "source": [
    "#Initializing with a folder path\n",
    "exp=ag.AGExperiment(ag.AGConfig.ag_home+\"/tutorial/data\", experiment_name=\"tutorial\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As seen above, if such a folder already exists, aligater will print a warning. Aligater will print output to this folder, and if content is already present with the same name, that content will be overwritten without confirmation."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Filters & masks**\n",
    "When collecting files by specifying a folder it might be useful to apply filters to guide the selection, such as case or sample vs control. This can be done by supplying two lists. Note that filters are _case sensitive_ :\n",
    "\n",
    "**filters** - Filters should be a list-like containing strings, if _any_ part of the file path is matched by one or more of the filters, the file is collected.\n",
    "\n",
    "**masks** - mask should be list-like containing strings, if _any_ party of the file path is matched by one or more of the filters, the file is discarded."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "1 filter(s) defined\n",
      "Collected 1 files, 2 files did not pass filter(s) and mask(s).\n"
     ]
    }
   ],
   "source": [
    "#Single filter 'compensated'\n",
    "exp=ag.AGExperiment(ag.AGConfig.ag_home+\"/tutorial/data\",\n",
    "                    filters=['compensated'],\n",
    "                    experiment_name=\"tutorial\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "1 filter(s) defined\n",
      "Collected 2 files, 1 files did not pass filter(s) and mask(s).\n"
     ]
    }
   ],
   "source": [
    "#Single filter 'ompensated' yields another file\n",
    "exp=ag.AGExperiment(ag.AGConfig.ag_home+\"/tutorial/data\",\n",
    "                    filters=['ompensated'],\n",
    "                    experiment_name=\"tutorial\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "1 filter(s) defined\n",
      "1 mask(s) defined\n",
      "Collected 1 files, 2 files did not pass filter(s) and mask(s).\n"
     ]
    }
   ],
   "source": [
    "#Adding a mask will reduce the number of collected files again\n",
    "exp=ag.AGExperiment(ag.AGConfig.ag_home+\"/tutorial/data\",\n",
    "                    filters=['ompensated'],\n",
    "                    mask=['Uncompensated'],\n",
    "                    experiment_name=\"tutorial\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Useful flags & options**\n",
    "\n",
    "There are several additional flags and options that can be supplied to the AGExperiment, see the functions documentation information."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Running batch analysis\n",
    "After developing a pattern recognition strategy, it's recommended to put it in it's own python script.\n",
    "You could then import these gating functions in a batch processing script\n",
    "We'll do the following:\n",
    "\n",
    "1) set up our aligater experiment object\n",
    "\n",
    "2) import the gating strategy from a separate python script\n",
    "\n",
    "3) batch analyse all samples collected in the experiment object with the imported strategy\n",
    "\n",
    "4) output our results to file"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "1 filter(s) defined\n",
      "1 mask(s) defined\n",
      "Collected 1 files, 2 files did not pass filter(s) and mask(s).\n"
     ]
    }
   ],
   "source": [
    "exp=ag.AGExperiment(ag.AGConfig.ag_home+\"/tutorial/data\",\n",
    "                    filters=['example'],\n",
    "                    mask=['Uncompensated'],\n",
    "                    experiment_name=\"tutorial\",\n",
    "                    flourochrome_area_filter=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [],
   "source": [
    "from example_strategy import example_gating_strategy"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Loading sample 0 to 1\n",
      "Opening file example1 from folder /tutorial/data\n",
      "Applying strategy to sample 0 to 1\n",
      "Sample gating done\n",
      "Complete, no samples had populations with invalid flags\n"
     ]
    }
   ],
   "source": [
    "exp.apply(example_gating_strategy, n_processes=2)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [],
   "source": [
    "#results can then be output through printexperiment\n",
    "exp.printExperiment(ag.AGConfig.ag_home+\"/out/example_output.txt\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[['tutorial/data/example1',\n",
       "  427330.0,\n",
       "  0.9905449353166021,\n",
       "  422823.0,\n",
       "  0.989453115858938,\n",
       "  276069.0,\n",
       "  0.6529185971434855,\n",
       "  146754.0,\n",
       "  0.34708140285651445,\n",
       "  139989.0,\n",
       "  0.5070797518011801,\n",
       "  122765.0,\n",
       "  0.44468955224961876,\n",
       "  5705.0,\n",
       "  0.02066512357417892,\n",
       "  7610.0,\n",
       "  0.027565572375022187]]"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# We can take a peek at the results manually by inspecting the resultMatrix member of the experiment object\n",
    "exp.resultMatrix"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}