User's Guide
============

The ``skeleton_keys`` package supports the skeletal analysis of neuron morphologies.
It is used to perform a series of analysis steps in a consistent and scalable manner.
It can be used to align a series of morphologies from the isocortex to a set
of reference layer depths/thicknesses, create layer-aligned depth profile histograms,
and calculate morphological features using the `neuron_morphology package `_.

The main inputs for ``skeleton_keys`` are neuron morphologies in the form of SWC files,
layer drawings for the cells of interest, and reference information such as a
reference set of layer depths/thicknesses.

The package contains several command line scripts that are typically used in
sequence to process a set of morphologies and end up with a comparable set of
morphological features for the entire set.

Aligning a morphology to a reference set of layers
--------------------------------------------------

The depths and thicknesses of cortical layers can vary from location to location
across the isocortex. However, some cells send processes into specific layers
regardless of their location. Analyses that use the cells only at their original
size may blur precise distinctions that follow layer boundaries; therefore,
several features calculated by this package rely on first aligning morphologies
to a consistent set of layer thicknesses/depths. These layer-aligned morphologies
can also be used for other purposes, like visualization.

Since the isocortex is a curved structure, the depth is not always straightforward
to calculate. The `neuron_morphology package`_ (used by ``skeleton_keys`` to perform
the layer alignment) takes the approach used in the
`Allen Common Coordinate Framework for the mouse brain `_, where paths traveling
through isopotential surfaces between the pia and white matter define the depth
dimension of the cortex.

Therefore, to align to a common set of layers, we need information about where
the pia, white matter, and layers are with respect to the cell of interest. At
the Allen Institute for Brain Science, these are drawn on 20x images of the
morphology, using a DAPI stain to identify layers. These lines and polygons
become inputs into the ``skeleton_keys`` scripts. These layer drawings match
the orientation of the reconstructed cell taken directly from the image (i.e.,
before any rotation to place pia upward is done).

As an example, we will use an Sst+ inhibitory neuron from the
`Gouwens et al. (2020) study `_ where we have also saved the layer drawings in
the ``example_data`` directory as a JSON file in the format expected by
``skeleton_keys``.

In its original orientation, the morphology looks like this:

.. figure:: /images/guide_example_sst_original.png
    :width: 600

    Sst neuron in original orientation. Dendrite is red, axon is blue, soma is black dot.

Our layer drawings (with a matched orientation) look like:

.. figure:: /images/guide_example_layer_drawings.png
    :width: 600

    Pia, white matter, soma outline, and layer drawings

And together (after aligning the soma locations) they look like:

.. figure:: /images/guide_example_layer_drawings_with_morph.png
    :width: 600

    Morphology and layer drawings in original orientation

**Note:** The layer-alignment script (``skelekeys-layer-aligned-swc``) also has
functionality to correct the morphology for shrinkage (which can happen because
the fixed tissue that is imaged dries out and becomes flatter than the original)
and slice angle tilt (which happens when the cutting angle for brain slices does
not match the curvature of that part of the brain). However, these features are
currently written to require access to an internal Allen Institute database and
do not yet have alternative input formats. Therefore, we will not use those
functions in this guide.

We will supply the script with the following inputs:

* *specimen_id* - an integer identifier for the cell.
  In this script, it is primarily used to access internal database information
  (which we aren't doing), but in other scripts it is used to associate this
  cell with its features. The specimen ID of our example is ``740135032``.
* *swc_path* - a file path to the SWC file in its original orientation.
  Here, our example cell's SWC file is
  ``Sst-IRES-Cre_Ai14-408415.03.02.02_864876723_m.swc``.
* *surface_and_layers_file* - a JSON file with the layer drawings.
  Here, our example uses the file ``740135032_surfaces_and_layers.json``.
* *layer_depths_file* - a JSON file with the set of layer depths we're aligning
  the cell to. In this case we'll use an average set of depths included as a
  test file, ``avg_layer_depths.json``.

Therefore, our command will be:

.. code:: shell

    skelekeys-layer-aligned-swc --specimen_id 740135032 \
        --swc_path Sst-IRES-Cre_Ai14-408415.03.02.02_864876723_m.swc \
        --surface_and_layers_file 740135032_surfaces_and_layers.json \
        --layer_depths_file avg_layer_depths.json \
        --correct_for_shrinkage False \
        --correct_for_slice_angle False \
        --output_file layer_aligned_740135032.swc

This creates a new SWC file (``layer_aligned_740135032.swc``) that is
(1) uprighted and (2) stretched or squished to align each of its points to the
reference set of layers.

.. figure:: /images/guide_example_sst_layeraligned.png
    :width: 600

    Layer-aligned morphology

Uprighting a morphology without layer alignment
-----------------------------------------------

If you only want to orient a morphology so that the pia is up and the white
matter is down, without making any layer thickness adjustments, you can use the
command-line utility ``skelekeys-upright-corrected-swc``. It still requires
layer drawings, though, to know where the pia and white matter are relative to
the originally reconstructed morphology.

**Note:** This script, too, can correct the morphology for shrinkage and
slice angle tilt, but we are again skipping that since it currently can only
use information from an internal database.

It takes a similar set of arguments as before (but notably without the
``--layer_depths_file`` argument).

.. code:: shell

    skelekeys-upright-corrected-swc --specimen_id 740135032 \
        --swc_path Sst-IRES-Cre_Ai14-408415.03.02.02_864876723_m.swc \
        --surface_and_layers_file 740135032_surfaces_and_layers.json \
        --correct_for_shrinkage False \
        --correct_for_slice_angle False \
        --output_file upright_only_740135032.swc

The output looks like:

.. figure:: /images/guide_example_sst_upright.png
    :width: 600

    Upright (but not layer-aligned) morphology

Calculating depth profiles
--------------------------

A relevant aspect for distinguishing morphologies is the depth profile, which
is a 1D histogram of the number of nodes across a set of depth bins, divided
by compartment type (i.e., axon, basal dendrite, apical dendrite). These can be
used to calculate reduced-dimension representations of those profiles,
determine the overlap of different compartment types, etc.

The command line utility ``skelekeys-profiles-from-swcs`` will create a CSV file
of depth profiles from a list of layer-aligned SWC files.

The script expects the layer-aligned SWC files to be in a single directory
(``--swc_dir``) and be named as ``{specimen_id}.swc``. For this example, we have
moved the layer-aligned Sst cell's SWC file (740135032) into another directory
and renamed it; we have also layer-aligned a Pvalb cell (606271263) and a Vip
cell (694146546).

.. figure:: /images/guide_three_layeraligned_cells.png
    :width: 600

    Layer-aligned Sst cell, Pvalb cell, and Vip cell

To run the script for our example, we give it the following inputs:

* *specimen_id_file* - a text file with specimen IDs.
  The text file has one integer ID per line. Here we're using
  ``example_specimen_ids.txt``.
* *swc_dir* - the directory of layer-aligned SWC files.
  Here our directory is ``layer_aligned_swcs``.
* *layer_depths_file* - a JSON file with the reference set of layer depths.
  This is so that the script knows where the white matter begins (so that it
  can determine how far past that point to include).
* *output_hist_file* - an output CSV file path.
  This CSV file contains the depth histograms: the columns are depth bins, and
  the rows are cells.
* *output_soma_file* - an output CSV file path.
  The script also saves the layer-aligned soma depths, which are used by other
  command line scripts in the package.

Our command will then be:

.. code:: shell

    skelekeys-profiles-from-swcs --specimen_id_file example_specimen_ids.txt \
        --swc_dir layer_aligned_swcs \
        --layer_depths_file avg_layer_depths.json \
        --output_hist_file aligned_depth_profiles.csv \
        --output_soma_file aligned_soma_depths.csv

This produces the following depth profiles:

.. figure:: /images/guide_depth_profiles.png
    :width: 600

    Layer-aligned depth profiles of the three example cells

PCA on depth profiles
---------------------

A reduced-dimension representation of the depth profiles can provide useful
features for distinguishing morphologies. However, analyses like PCA will
produce loadings that vary depending on the data set. The command line utility
``skelekeys-calc-histogram-loadings`` can be used to generate a fixed loading
file from a set of morphologies that can be used to analyze other morphologies
with other scripts.

If you simply want to use the PCA loadings for a new set of cells and not
transfer those loadings to another set, you do not need to use this utility
(you can use the ``skelekeys-morph-features`` script by itself). That script
will also let you save the loadings, but this one allows you to calculate
loadings without having to calculate all the other morphological features.

In this example, we will calculate and save the PCA loadings for the axonal
compartments. The command to do so is:

.. code:: shell

    skelekeys-calc-histogram-loadings --specimen_id_file example_specimen_ids.txt \
        --aligned_depth_profile_file aligned_depth_profiles.csv \
        --analyze_axon True \
        --save_axon_depth_profile_loadings_file axon_loadings.csv

Calculating morphological features
----------------------------------

Once we have the layer depths files (and optionally a set of pre-calculated
loadings), we can calculate a set of morphological features for each cell using
the command line utility ``skelekeys-morph-features``.

The main inputs are the set of upright (but *not* layer-aligned) SWC files and
the depth profile CSV file. We also specify which compartments we want to
analyze (for example, if we have a set of excitatory neurons that don't have
reconstructed local axons, we would not want to analyze axonal compartments).

To continue our example, we will analyze the features using the following command:

.. code:: shell

    skelekeys-morph-features --specimen_id_file example_specimen_ids.txt \
        --swc_dir upright_swcs \
        --aligned_depth_profile_file aligned_depth_profiles.csv \
        --aligned_soma_file aligned_soma_depths.csv \
        --analyze_axon True \
        --analyze_basal_dendrite True \
        --analyze_apical_dendrite False \
        --output_file example_features_long.csv

This produces a long-form feature data file.

.. csv-table:: Beginning of example long-form feature file
    :file: guide_example_features_long_excerpt.csv
    :header-rows: 1

Post-processing morphological features
--------------------------------------

The long-form file can be used for many purposes, but it is also useful to
convert the data to a wide format where the features are the columns and the
rows are cells. At the same time, it can also be useful to normalize the
different features for analyses like classification and clustering.

The command line utility ``skelekeys-postprocess-features`` is used to perform
these operations.

.. code:: shell

    skelekeys-postprocess-features \
        --input_files "['example_features_long.csv']" \
        --wide_normalized_output_file example_features_wide_normalized.csv \
        --wide_unnormalized_output_file example_features_wide_unnormalized.csv

Note the `syntax for passing a list of files `_ for the `argschema `_ command
line argument. If you passed more than one file (for example, to normalize
features calculated from two sets of cells to the same scale), you would
separate each argument with a comma (as in ``"['file_one.csv','file_two.csv']"``).

The wide-form feature file output looks like:

.. csv-table:: Example wide-form feature file
    :file: guide_example_features_wide.csv
    :header-rows: 1

Working with general coordinates
--------------------------------

The ``skeleton_keys`` package was originally built around processing SWC files,
but it can also be used to align arbitrary sets of coordinates, if provided
with the appropriate layer drawings.

The command line utility ``skelekeys-layer-aligned-coords`` can take a CSV and
adjust the specified coordinates to make them layer-aligned. It takes the
following inputs:

* *coordinate_file* - a CSV with coordinates.
  The coordinate columns must contain "x", "y", and "z" in their names, but can
  have prefixes and/or suffixes (see below). We'll use an example file named
  ``coord_example.csv``.
* *layer_depths_file* - a JSON file with the set of layer depths we're aligning
  the coordinates to. Here again we'll use an average set of depths included as
  a test file, ``avg_layer_depths.json``.
* *surface_and_layers_file* - a JSON file with the layer drawings.
  Here, our example uses the file ``coord_layer_drawings.json``.
* *coordinate_column_prefix (and/or _suffix)* - strings of common prefixes/suffixes.
  This allows the coordinate columns to have names other than the default
  ``x``, ``y``, and ``z``, but in this example we do not need to use them.

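
Before running the alignment on your own coordinate file, it can help to
confirm that the header row contains the expected coordinate columns. The
following is a minimal sketch, not part of ``skeleton_keys``. It assumes that a
coordinate column name is formed as prefix + axis + suffix (our reading of the
description above, not a confirmed detail of the script), and the file
``sample_coords.csv`` and its ``post_`` / ``_um`` naming are made up for
illustration.

.. code:: shell

    # Hypothetical sample file, made up for this sketch (not the guide's coord_example.csv).
    printf '%s\n' \
        'cell_id,target_cell_type,post_x_um,post_y_um,post_z_um' \
        '1,typeA,10.0,20.0,30.0' \
        '2,typeB,11.5,21.5,31.5' > sample_coords.csv

    prefix="post_"   # assumed column prefix (hypothetical)
    suffix="_um"     # assumed column suffix (hypothetical)

    # Put each header field on its own line so that each axis can be matched exactly.
    # This simple split assumes the header has no quoted fields containing commas.
    header_fields=$(head -n 1 sample_coords.csv | tr -d '\r' | tr ',' '\n')

    # Collect the names of any expected coordinate columns that are absent.
    missing=""
    for axis in x y z; do
        if ! printf '%s\n' "$header_fields" | grep -qxF "${prefix}${axis}${suffix}"; then
            missing="${missing} ${prefix}${axis}${suffix}"
        fi
    done

    if [ -z "$missing" ]; then
        echo "all coordinate columns found"
    else
        echo "missing columns:${missing}"
    fi

For a file that uses the default ``x``, ``y``, and ``z`` column names, set both
``prefix`` and ``suffix`` to empty strings.
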
Using these inputs, we can take a starting example file (the columns
``cell_id`` and ``target_cell_type`` contain extra metadata about the
coordinates):

.. csv-table:: Example coordinate file
    :file: guide_coord_example.csv
    :header-rows: 1

Use the command:

.. code:: shell

    skelekeys-layer-aligned-coords \
        --coordinate_file coord_example.csv \
        --surface_and_layers_file coord_layer_drawings.json \
        --layer_depths_file avg_layer_depths.json \
        --output_file aligned_coord_example.csv

And obtain:

.. csv-table:: Example layer-aligned coordinate file
    :file: guide_aligned_coord_example.csv
    :header-rows: 1

Note that the ``x`` values have also changed because we have rotated the
coordinates to an upright orientation (with pia at the top).

This aligned coordinate file can be used to generate histograms with the
``skelekeys-profiles-from-coords`` command line utility. You need to specify
which column contains the depth values (here, the column labeled ``y``) with
the ``--depth_label`` argument. You can use other columns in the CSV file to
split the histograms across rows using the ``--index_label`` argument, and/or
you can create multiple histograms per row (as in the compartment type
histograms above) with the ``--hist_split_label`` argument.

For example:

.. code:: shell

    skelekeys-profiles-from-coords \
        --coordinate_file aligned_coord_example.csv \
        --layer_depths_file avg_layer_depths.json \
        --depth_label y \
        --index_label cell_id \
        --hist_split_label target_cell_type \
        --output_hist_file aligned_coord_hist.csv

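
To preview how the coordinates will be grouped by the ``--index_label`` and
``--hist_split_label`` columns, you can count the rows for each pair of values
before building histograms. This is a rough sketch that relies only on standard
shell tools and is not part of ``skeleton_keys``. The file
``sample_aligned_coords.csv`` and its values are made up for illustration, and
the simple comma splitting assumes that no field contains a quoted comma.

.. code:: shell

    # Hypothetical stand-in for aligned_coord_example.csv; the values are made up.
    printf '%s\n' \
        'cell_id,target_cell_type,x,y,z' \
        '101,typeA,1.0,110.5,3.0' \
        '101,typeB,2.0,250.0,4.0' \
        '101,typeA,1.5,130.0,3.5' \
        '202,typeA,0.5,480.2,2.0' > sample_aligned_coords.csv

    # Count coordinates for each (index column, split column) pair.
    # Column positions are looked up by name from the header row.
    group_counts=$(awk -F',' -v index_label="cell_id" -v split_label="target_cell_type" '
        NR == 1 {
            for (i = 1; i <= NF; i++) {
                if ($i == index_label) index_col = i
                if ($i == split_label) split_col = i
            }
            next
        }
        { counts[$index_col "," $split_col]++ }
        END { for (key in counts) print key "," counts[key] }
    ' sample_aligned_coords.csv | sort)

    echo "cell_id,target_cell_type,n_coords"
    echo "$group_counts"

In this made-up file, cell ``101`` has coordinates for two target cell types
and cell ``202`` has coordinates for one, so there are three groups in total.
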