User’s Guide#

The skeleton_keys package supports the skeletal analysis of neuron morphologies. It is used to perform a series of analysis steps in a consistent and scalable manner. It can be used to align a series of morphologies from the isocortex to a set of reference layer depths/thicknesses, create layer-aligned depth profile histograms, and calculate morphological features using the neuron_morphology package.

The main inputs for skeleton_keys are neuron morphologies in the form of SWC files, layer drawings for the cells of interest, and reference information such as a reference set of layer depths/thicknesses. The package contains several command line scripts that are typically used in sequence to process a set of morphologies and end up with a comparable set of morphological features for the entire set.

Aligning a morphology to a reference set of layers#

The depths and thicknesses of cortical layers can vary from location to location across the isocortex. However, some cells send processes into specific layers regardless of their location. Analyses that use the cells only at their original size may blur precise distinctions that follow layer boundaries; therefore, several features calculated by this package rely on first aligning morphologies to a consistent set of layer thicknesses/depths. These layer-aligned morphologies can also be used for other purposes, like visualization.

Since the isocortex is a curved structure, the depth is not always straightforward to calculate. The neuron_morphology package (used by skeleton_keys to perform the layer alignment) takes the approach used in the Allen Common Coordinate Framework for the mouse brain, where paths traveling through isopotential surfaces between the pia and white matter define the depth dimension of the cortex.

Therefore, to align to a common set of layers, we need information about where the pia, white matter, and layers are with respect to the cell of interest. At the Allen Institute for Brain Science, these are drawn on 20x images of the morphology, using a DAPI stain to identify layers. These lines and polygons become inputs into the skeleton_keys scripts.

These layer drawings match the orientation of the reconstructed cell taken directly from the image (i.e., before any rotation to place pia upward is done).

As an example, we will use an Sst+ inhibitory neuron from the Gouwens et al. (2020) study. Its layer drawings are saved in the example_data directory as a JSON file in the format expected by skeleton_keys.

In its original orientation, the morphology looks like this:

_images/guide_example_sst_original.png

Sst neuron in original orientation. Dendrite is red, axon is blue, soma is black dot.#

Our layer drawings (with a matched orientation) look like:

_images/guide_example_layer_drawings.png

Pia, white matter, soma outline, and layer drawings#

And together (after aligning the soma locations) they look like:

_images/guide_example_layer_drawings_with_morph.png

Morphology and layer drawings in original orientation#

Note: The layer-alignment script used below (skelekeys-layer-aligned-swc) can also correct the morphology for shrinkage (which can happen because the fixed tissue that is imaged dries out and becomes flatter than the original) and slice angle tilt (which happens when the cutting angle for brain slices does not match the curvature of that part of the brain). However, these features currently require access to an internal Allen Institute database and do not yet have alternative input formats. Therefore, we will not use those functions in this guide.

We will supply the script with the following inputs:

  • specimen_id - an integer identifier for the cell.

    In this script, it is primarily used to access internal database information (which we aren’t doing), but in other scripts it is used to associate this cell with its features. The specimen ID of our example is 740135032.

  • swc_path - a file path to the SWC file in its original orientation.

    Here, our example cell’s SWC file is Sst-IRES-Cre_Ai14-408415.03.02.02_864876723_m.swc.

  • surface_and_layers_file - a JSON file with the layer drawings.

    Here, our example uses the file 740135032_surfaces_and_layers.json.

  • layer_depths_file - a JSON file with the set of layer depths we’re aligning the cell to.

    In this case we’ll use an average set of depths included as a test file, avg_layer_depths.json.

Therefore, our command will be:

skelekeys-layer-aligned-swc --specimen_id 740135032 \
--swc_path Sst-IRES-Cre_Ai14-408415.03.02.02_864876723_m.swc \
--surface_and_layers_file 740135032_surfaces_and_layers.json \
--layer_depths_file avg_layer_depths.json \
--correct_for_shrinkage False \
--correct_for_slice_angle False \
--output_file layer_aligned_740135032.swc

This creates a new SWC file (layer_aligned_740135032.swc) that is (1) uprighted and (2) stretched or squished to align each of its points to the reference set of layers.

_images/guide_example_sst_layeraligned.png

Layer-aligned morphology#
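The stretching and squishing step can be pictured as a piecewise-linear remapping of depth. The sketch below is a simplified one-dimensional illustration under that assumption, not the neuron_morphology implementation (which measures depth along curved paths between pia and white matter). The function name align_depth and all boundary values are made up for the example.

```python
from bisect import bisect_right


def align_depth(depth, cell_boundaries, ref_boundaries):
    """Map a depth from cell-specific layers onto reference layers.

    Both boundary lists hold depths from pia (first entry) to white matter
    (last entry), one entry per layer border, in increasing order. The
    fractional position of the point inside its layer is preserved.
    """
    if len(cell_boundaries) != len(ref_boundaries):
        raise ValueError("boundary lists must have the same length")
    # Index of the layer containing `depth`, clamped so that points beyond
    # the pia or white matter reuse the scaling of the nearest layer.
    i = bisect_right(cell_boundaries, depth) - 1
    i = max(0, min(i, len(cell_boundaries) - 2))
    top, bottom = cell_boundaries[i], cell_boundaries[i + 1]
    frac = (depth - top) / (bottom - top)
    ref_top, ref_bottom = ref_boundaries[i], ref_boundaries[i + 1]
    return ref_top + frac * (ref_bottom - ref_top)


# Hypothetical layer borders (micrometers from pia), not real measurements.
cell_layers = [0.0, 100.0, 300.0, 400.0]
reference_layers = [0.0, 120.0, 360.0, 500.0]

# A point halfway through the second layer stays halfway through it.
aligned = align_depth(200.0, cell_layers, reference_layers)
print(aligned)
```

A point that sits halfway through a layer in the original cell ends up halfway through the same layer in the reference set, which is what makes depths comparable across cells.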

Uprighting a morphology without layer alignment#

If you only want to orient a morphology so that the pia is up and the white matter is down, without making any layer thickness adjustments, you can use the command-line utility skelekeys-upright-corrected-swc. It still requires layer drawings, though, to determine where the pia and white matter lie relative to the originally reconstructed morphology.

Note: This script, too, can correct the morphology for shrinkage and slice angle tilt, but we are again skipping those corrections because they currently rely on information from the internal database.

It takes a similar set of arguments as before (but notably without the --layer_depths_file argument).

skelekeys-upright-corrected-swc --specimen_id 740135032 \
--swc_path Sst-IRES-Cre_Ai14-408415.03.02.02_864876723_m.swc \
--surface_and_layers_file 740135032_surfaces_and_layers.json \
--correct_for_shrinkage False \
--correct_for_slice_angle False \
--output_file upright_only_740135032.swc

The output looks like:

_images/guide_example_sst_upright.png

Upright (but not layer-aligned) morphology#
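When several cells need uprighting, the same command can be assembled programmatically. The sketch below only builds the argument list from the flags shown above; the helper name upright_command is an assumption for illustration, and actually running the command requires the skeleton_keys scripts to be installed.

```python
def upright_command(specimen_id, swc_path, drawings_path, output_path):
    """Return the argument list for one skelekeys-upright-corrected-swc run."""
    return [
        "skelekeys-upright-corrected-swc",
        "--specimen_id", str(specimen_id),
        "--swc_path", swc_path,
        "--surface_and_layers_file", drawings_path,
        "--correct_for_shrinkage", "False",
        "--correct_for_slice_angle", "False",
        "--output_file", output_path,
    ]


cmd = upright_command(
    740135032,
    "Sst-IRES-Cre_Ai14-408415.03.02.02_864876723_m.swc",
    "740135032_surfaces_and_layers.json",
    "upright_only_740135032.swc",
)
print(" ".join(cmd))

# To execute the command (requires skeleton_keys to be installed):
#     import subprocess
#     subprocess.run(cmd, check=True)
```

Looping over a table of specimen IDs and file paths and calling the helper once per cell would produce one command per morphology.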

Calculating depth profiles#

A relevant aspect for distinguishing morphologies is the depth profile, which is a 1D histogram of the number of nodes across a set of depth bins, divided by compartment type (i.e., axon, basal dendrite, apical dendrite). These can be used to calculate reduced-dimension representations of those profiles, determine the overlap of different compartment types, etc.
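As a rough illustration of what such a profile contains, the sketch below counts SWC nodes per compartment type and depth bin. It relies on the standard SWC column order (id, type, x, y, z, radius, parent) and type codes (2 = axon, 3 = basal dendrite, 4 = apical dendrite). The bin width, the convention that depth equals -y with pia at y = 0, and the node values are assumptions made for the example, and the function is not the one used by skeleton_keys.

```python
from collections import Counter

# Standard SWC type codes for the compartments named above.
COMPARTMENTS = {2: "axon", 3: "basal_dendrite", 4: "apical_dendrite"}


def depth_profile(swc_lines, bin_width=5.0):
    """Count nodes per (compartment, depth bin index) from SWC text lines.

    Assumes an upright morphology with pia at y = 0 and depth increasing
    downward, so depth = -y. The bin width is an arbitrary choice here.
    """
    counts = Counter()
    for line in swc_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and header comments
        fields = line.split()
        node_type, y = int(fields[1]), float(fields[3])
        if node_type not in COMPARTMENTS:
            continue  # soma and other types are not part of the profile
        bin_index = int(-y // bin_width)
        counts[(COMPARTMENTS[node_type], bin_index)] += 1
    return counts


# Tiny made-up SWC fragment: id, type, x, y, z, radius, parent.
example_swc = """\
# hypothetical nodes, not from the example cell
1 1 0.0 -100.0 0.0 5.0 -1
2 3 2.0 -102.0 0.0 0.5 1
3 3 4.0 -103.5 0.0 0.5 2
4 2 1.0 -96.0 0.0 0.3 1
5 2 1.5 -91.0 0.0 0.3 4
""".splitlines()

profile = depth_profile(example_swc)
for (compartment, bin_index), n in sorted(profile.items()):
    print(compartment, bin_index * 5.0, n)
```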

The command line utility skelekeys-profiles-from-swcs will create a CSV file of depth profiles from a list of layer-aligned SWC files.

The script expects the layer-aligned SWC files to be in a single directory (--swc_dir) and be named as {specimen_id}.swc. For this example, we have moved the layer-aligned Sst cell’s SWC file (740135032) into another directory and renamed it; we have also layer aligned a Pvalb cell (606271263) and a Vip cell (694146546).

_images/guide_three_layeraligned_cells.png

Layer-aligned Sst cell, Pvalb cell, and Vip cell#
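Preparing the directory and the specimen ID file can be scripted. The sketch below assumes the aligned files were written as layer_aligned_{specimen_id}.swc, as in the earlier example; the helper name collect_aligned_swcs is hypothetical, and the demonstration uses empty placeholder files in a temporary directory.

```python
import shutil
import tempfile
from pathlib import Path


def collect_aligned_swcs(source_dir, swc_dir, id_file, prefix="layer_aligned_"):
    """Copy `<prefix><specimen_id>.swc` files to `<swc_dir>/<specimen_id>.swc`.

    Also writes the specimen IDs, one integer per line, to `id_file`.
    Returns the sorted list of IDs found.
    """
    source_dir, swc_dir = Path(source_dir), Path(swc_dir)
    swc_dir.mkdir(parents=True, exist_ok=True)
    specimen_ids = []
    for path in source_dir.glob(f"{prefix}*.swc"):
        specimen_id = int(path.stem[len(prefix):])
        shutil.copy(path, swc_dir / f"{specimen_id}.swc")
        specimen_ids.append(specimen_id)
    specimen_ids.sort()
    Path(id_file).write_text("".join(f"{i}\n" for i in specimen_ids))
    return specimen_ids


# Demonstration in a temporary directory with empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    for specimen_id in (740135032, 606271263, 694146546):
        (tmp / f"layer_aligned_{specimen_id}.swc").touch()
    ids = collect_aligned_swcs(
        tmp, tmp / "layer_aligned_swcs", tmp / "example_specimen_ids.txt"
    )
    copied = sorted(p.name for p in (tmp / "layer_aligned_swcs").iterdir())
    id_text = (tmp / "example_specimen_ids.txt").read_text()
    print(ids)
    print(copied)
```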

To run the script for our example, we give it the following inputs:

  • specimen_id_file - a text file with specimen IDs

    The text file has one integer ID per line. Here we’re using example_specimen_ids.txt.

  • swc_dir - the directory of layer-aligned SWC files

    Here our directory is layer_aligned_swcs

  • layer_depths_file - a JSON file with the reference set of layer depths

    This tells the script where the white matter begins (so that it can determine how far past the white matter to include).

  • output_hist_file - an output CSV file path

    This CSV file contains the depth histograms - the columns are depth bins, and the rows are cells

  • output_soma_file - an output CSV file path

    The script also saves the layer-aligned soma depths, which are used by other command line scripts in the package.

Our command will then be:

skelekeys-profiles-from-swcs --specimen_id_file example_specimen_ids.txt \
--swc_dir layer_aligned_swcs \
--layer_depths_file avg_layer_depths.json \
--output_hist_file aligned_depth_profiles.csv \
--output_soma_file aligned_soma_depths.csv

This produces the following depth profiles:

_images/guide_depth_profiles.png

Layer-aligned depth profiles of the three example cells#

PCA on depth profiles#

A reduced-dimension representation of the depth profiles can provide useful features for distinguishing morphologies. However, analyses like PCA produce loadings that vary depending on the data set. The command line utility skelekeys-calc-histogram-loadings generates a fixed loading file from one set of morphologies; other scripts can then apply those loadings to other morphologies.

If you simply want to use the PCA loadings for a new set of cells and not transfer those loadings to another set, you do not need to use this utility (you can use the skelekeys-morph-features script by itself). That script will also let you save the loadings. But this script will allow you to calculate loadings without having to calculate all the other morphological features.

In this example, we will calculate and save the PCA loadings for the axonal compartments. The command to do so is:

skelekeys-calc-histogram-loadings --specimen_id_file example_specimen_ids.txt \
--aligned_depth_profile_file aligned_depth_profiles.csv \
--analyze_axon True \
--save_axon_depth_profile_loadings_file axon_loadings.csv
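To illustrate why fixed loadings matter, the sketch below derives the first principal component from a small reference set and then projects a new cell onto those stored loadings instead of refitting. It is a minimal pure-Python illustration using power iteration; it is not the skeleton_keys implementation, the function names are hypothetical, and the depth profiles are made-up counts.

```python
import math


def first_pc_loadings(profiles, n_iter=200):
    """Return (column means, unit loading vector) of the first principal
    component of `profiles`, a list of equal-length rows (cells x bins).

    Uses power iteration; fine for a small illustration, but a linear
    algebra library is preferable in practice.
    """
    n_rows, n_cols = len(profiles), len(profiles[0])
    means = [sum(row[j] for row in profiles) / n_rows for j in range(n_cols)]
    centered = [[row[j] - means[j] for j in range(n_cols)] for row in profiles]
    # Deterministic, asymmetric start vector for the power iteration.
    vec = [float(j + 1) for j in range(n_cols)]
    norm = math.sqrt(sum(v * v for v in vec))
    vec = [v / norm for v in vec]
    for _ in range(n_iter):
        # Multiply by X^T X without forming the covariance matrix.
        scores = [sum(r[j] * vec[j] for j in range(n_cols)) for r in centered]
        new = [sum(scores[i] * centered[i][j] for i in range(n_rows))
               for j in range(n_cols)]
        norm = math.sqrt(sum(v * v for v in new))
        if norm == 0.0:
            break  # no variance along the current direction
        vec = [v / norm for v in new]
    return means, vec


def project(profiles, means, loadings):
    """Project rows onto fixed loadings, centering with the stored means."""
    return [sum((row[j] - means[j]) * loadings[j] for j in range(len(means)))
            for row in profiles]


# Made-up depth profiles: rows are cells, columns are depth bins.
reference_profiles = [[10, 0, 0], [8, 2, 0], [0, 2, 8], [0, 0, 10]]
new_profiles = [[9, 1, 0]]

means, loadings = first_pc_loadings(reference_profiles)
reference_scores = project(reference_profiles, means, loadings)
new_scores = project(new_profiles, means, loadings)  # reuses stored loadings
print(loadings)
print(new_scores)
```

Because the new cell is centered and projected with the reference means and loadings, its score lives in the same coordinate system as the reference scores. The sign of a principal component is arbitrary, so only relative positions along it are meaningful.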

Calculating morphological features#

Once we have the depth profile file (and optionally a set of pre-calculated loadings), we can calculate a set of morphological features for each cell using the command line utility skelekeys-morph-features. The main inputs are the set of upright (but not layer-aligned) SWC files and the depth profile CSV file. We also specify which compartments we want to analyze (for example, if we have a set of excitatory neurons that don’t have reconstructed local axons, we would not want to analyze axonal compartments).

To continue our example, we will analyze the features using the following command:

skelekeys-morph-features --specimen_id_file example_specimen_ids.txt \
--swc_dir upright_swcs \
--aligned_depth_profile_file aligned_depth_profiles.csv \
--aligned_soma_file aligned_soma_depths.csv \
--analyze_axon True \
--analyze_basal_dendrite True \
--analyze_apical_dendrite False \
--output_file example_features_long.csv

This produces a long-form feature data file.

Beginning of example long-form feature file#

    specimen_id  feature                        compartment_type  dimension  value
 0  740135032    aligned_dist_from_pia          soma              none       488.95489922056464
 1  606271263    aligned_dist_from_pia          soma              none       185.35510042590795
 2  694146546    aligned_dist_from_pia          soma              none       363.1829539564272
 3  740135032    depth_pc_0                     axon              none       -591.3768052788041
 4  740135032    depth_pc_1                     axon              none       771.0611928161894
 5  606271263    depth_pc_0                     axon              none       1557.1882340570776
 6  606271263    depth_pc_1                     axon              none       -114.4320436027042
 7  694146546    depth_pc_0                     axon              none       -965.8114287782732
 8  694146546    depth_pc_1                     axon              none       -656.6291492134859
 9  740135032    emd_with_basal_dendrite        axon              none       53.842009790570955
10  606271263    emd_with_basal_dendrite        axon              none       20.02095840833445
11  694146546    emd_with_basal_dendrite        axon              none       50.38248981181829
12  740135032    frac_above_basal_dendrite      axon              none       0.5537837260625045
13  740135032    frac_intersect_basal_dendrite  axon              none       0.4462162739374956
14  740135032    frac_below_basal_dendrite      axon              none       0.0
15  606271263    frac_above_basal_dendrite      axon              none       0.0
16  606271263    frac_intersect_basal_dendrite  axon              none       0.7648047321368555
17  606271263    frac_below_basal_dendrite      axon              none       0.23519526786314446
18  694146546    frac_above_basal_dendrite      axon              none       0.0
19  694146546    frac_intersect_basal_dendrite  axon              none       0.9692556634304207
20  694146546    frac_below_basal_dendrite      axon              none       0.030744336569579287
21  740135032    frac_above_axon                basal_dendrite    none       0.0
22  740135032    frac_intersect_axon            basal_dendrite    none       0.946916890080429
23  740135032    frac_below_axon                basal_dendrite    none       0.05308310991957105
24  606271263    frac_above_axon                basal_dendrite    none       0.37383177570093457
25  606271263    frac_intersect_axon            basal_dendrite    none       0.6261682242990654
26  606271263    frac_below_axon                basal_dendrite    none       0.0
27  694146546    frac_above_axon                basal_dendrite    none       0.47685185185185186
28  694146546    frac_intersect_axon            basal_dendrite    none       0.5231481481481481
29  694146546    frac_below_axon                basal_dendrite    none       0.0
30  740135032    extent                         basal_dendrite    x          259.0984945349686
31  740135032    extent                         basal_dendrite    y          360.15791867251323
32  740135032    bias                           basal_dendrite    x          105.9123765215528
33  740135032    bias                           basal_dendrite    y          -5.017693772805842

Post-processing morphological features#

The long-form file can be used for many purposes, but it is also useful to convert the data to a wide format where the features are the columns and the rows are cells. At the same time, it can also be useful to normalize the different features for analyses like classification and clustering.

The command line utility skelekeys-postprocess-features is used to perform these operations.

skelekeys-postprocess-features \
--input_files "['example_features_long.csv']" \
--wide_normalized_output_file example_features_wide_normalized.csv \
--wide_unnormalized_output_file example_features_wide_unnormalized.csv

Note the syntax for passing a list of files as an argschema command line argument. To pass more than one file (for example, to normalize features calculated from two sets of cells to the same scale), separate the file names with commas (as in "['file_one.csv','file_two.csv']").
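The reshaping itself can be sketched in a few lines. The example below pivots long-form rows into one record per cell, building column names as compartment_type, feature, and (when it is not "none") dimension, which matches the naming visible in the example files. The CSV layout, the function names, and the z-score normalization are assumptions for illustration; skelekeys-postprocess-features may normalize differently. The values are taken from the example tables in this guide.

```python
import csv
import io
import statistics

# A few long-form rows in an assumed CSV layout.
long_csv = """\
specimen_id,feature,compartment_type,dimension,value
740135032,aligned_dist_from_pia,soma,none,488.95489922056464
606271263,aligned_dist_from_pia,soma,none,185.35510042590795
694146546,aligned_dist_from_pia,soma,none,363.1829539564272
740135032,extent,basal_dendrite,x,259.0984945349686
606271263,extent,basal_dendrite,x,292.8979665049092
694146546,extent,basal_dendrite,x,158.97060160570243
"""


def long_to_wide(rows):
    """Pivot long-form rows into {specimen_id: {column_name: value}}."""
    wide = {}
    for row in rows:
        name = f"{row['compartment_type']}_{row['feature']}"
        if row["dimension"] != "none":
            name += f"_{row['dimension']}"
        wide.setdefault(row["specimen_id"], {})[name] = float(row["value"])
    return wide


def zscore_columns(wide):
    """Z-score each column across cells (one possible normalization)."""
    columns = sorted({c for feats in wide.values() for c in feats})
    normalized = {sid: {} for sid in wide}
    for col in columns:
        values = [feats[col] for feats in wide.values() if col in feats]
        mean = statistics.mean(values)
        sd = statistics.pstdev(values)
        for sid, feats in wide.items():
            if col in feats:
                normalized[sid][col] = (feats[col] - mean) / sd if sd else 0.0
    return normalized


rows = list(csv.DictReader(io.StringIO(long_csv)))
wide = long_to_wide(rows)
normalized = zscore_columns(wide)
print(sorted(wide["740135032"]))
```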

The wide form feature file output looks like:

Example wide-form feature file#

Shown transposed so that it fits the page: each row below is one column of the file, and each column below is one cell (one row of the file).

specimen_id                               606271263            694146546            740135032
axon_bias_x                               56.22872989248452    20.979837762279885   5.169444355631811
axon_bias_y                               -121.2103061622121   -337.62296125250555  297.6111166818944
axon_depth_pc_0                           1557.1882340570776   -965.8114287782732   -591.3768052788041
axon_depth_pc_1                           -114.4320436027042   -656.6291492134859   771.0611928161894
axon_emd_with_basal_dendrite              20.02095840833445    50.38248981181829    53.84200979057096
axon_exit_distance                        0.0                  8.953753130391712    20.438908268544992
axon_exit_theta                           0.6246170133542251   0.8624144236833117   0.5210092861882935
axon_extent_x                             400.4118094004192    456.4507565320748    424.4629900629744
axon_extent_y                             262.2553175261422    329.4824306809105    587.2419117540196
axon_frac_above_basal_dendrite            0.0                  0.0                  0.5537837260625045
axon_frac_below_basal_dendrite            0.2351952678631444   0.0307443365695792   0.0
axon_frac_intersect_basal_dendrite        0.7648047321368555   0.9692556634304208   0.4462162739374956
axon_max_branch_order                     25.0                 14.0                 24.0
axon_max_euclidean_distance               240.6846227360817    369.0094227941884    465.7975422331148
axon_max_path_distance                    902.0221232091972    537.3808743109979    712.7996048944084
axon_mean_contraction                     0.8194136640160532   0.8804317016691116   0.8629205591762148
axon_num_branches                         665.0                117.0                475.0
axon_num_outer_bifurcations               1.845098040014257    1.462397997898956    2.05307844348342
axon_soma_percentile_x                    0.4889426631713383   0.3268608414239482   0.3885527158823948
axon_soma_percentile_y                    0.7447065940713854   1.0                  0.1986189221897914
axon_total_length                         16935.227964908227   3856.927925804772    17053.86846302751
basal_dendrite_bias_x                     0.1618829328459696   4.519987715724085    105.9123765215528
basal_dendrite_bias_y                     107.3874873605935    45.38310515181024    -5.017693772805842
basal_dendrite_calculate_number_of_stems  7.0                  3.0                  4.0
basal_dendrite_extent_x                   292.8979665049092    158.97060160570243   259.0984945349686
basal_dendrite_extent_y                   227.9213594244781    640.1652005614803    360.15791867251323
basal_dendrite_frac_above_axon            0.3738317757009345   0.4768518518518518   0.0
basal_dendrite_frac_below_axon            0.0                  0.0                  0.053083109919571
basal_dendrite_frac_intersect_axon        0.6261682242990654   0.5231481481481481   0.946916890080429
basal_dendrite_max_branch_order           5.0                  9.0                  9.0
basal_dendrite_max_euclidean_distance     221.48224685084813   353.0402965262322    196.6200075357029
basal_dendrite_max_path_distance          261.57785404864563   434.5470290069107    228.6181419435221
basal_dendrite_mean_contraction           0.8777370205959968   0.896216086452986    0.8927636513643621
basal_dendrite_mean_diameter              0.5797508521165475   0.6016030092592592   0.4823025201072386
basal_dendrite_num_branches               35.0                 69.0                 44.0
basal_dendrite_num_outer_bifurcations     0.3010299956639812   1.1760912590556813   0.7781512503836436
basal_dendrite_soma_percentile_x          0.4870808136338647   0.1689814814814815   0.3179624664879356
basal_dendrite_soma_percentile_y          0.1423859263331501   0.5262345679012346   0.8273458445040215
basal_dendrite_stem_exit_down             0.4285714285714285   0.3333333333333333   0.0
basal_dendrite_stem_exit_side             0.5714285714285714   0.0                  1.0
basal_dendrite_stem_exit_up               0.0                  0.6666666666666666   0.0
basal_dendrite_total_length               2057.344684556989    3185.736311407383    2204.244429816128
basal_dendrite_total_surface_area         3746.221348808403    5983.0167307564      3338.8595673807986
none_early_branch_path                    0.5122275147209207   0.7175151344782787   0.9062144295279364
soma_aligned_dist_from_pia                185.35510042590795   363.1829539564272    488.95489922056464
soma_surface_area                         334.48341751636667   240.61616017724623   201.46425475133609

Working with general coordinates#

The skeleton_keys package was originally built around processing SWC files, but it can also be used to align arbitrary sets of coordinates, if provided with the appropriate layer drawings.

The command line utility skelekeys-layer-aligned-coords can take a CSV and adjust the specified coordinates to make them layer-aligned. It takes the following inputs:

  • coordinate_file - a CSV with coordinates

    The coordinate columns must contain “x”, “y”, and “z” in their names, but can have prefixes and/or suffixes (see below). We’ll use an example file named coord_example.csv.

  • layer_depths_file - a JSON file with the set of layer depths we’re aligning the cell to.

    Here again we’ll use an average set of depths included as a test file, avg_layer_depths.json.

  • surface_and_layers_file - a JSON file with the layer drawings.

    Here, our example uses the file coord_layer_drawings.json.

  • coordinate_column_prefix (and/or _suffix) - strings of common prefixes/suffixes

    This allows the coordinate columns to have names other than the default x, y, and z, but in this example we do not need to use them.

Using this, we can take a starting example file (the columns cell_id and target_cell_type contain extra metadata about the coordinates):

Example coordinate file#

cell_id  target_cell_type  x        y        z
1        4P                838.184  691.792  821.64
1        4P                850.392  676.904  796.56
2        4P                741.704  649.608  737.76
3        5P-ET             739.656  680.464  950.92
3        5P-IT             762.28   748.584  826.44

Use the command:

skelekeys-layer-aligned-coords \
--coordinate_file coord_example.csv \
--surface_and_layers_file coord_layer_drawings.json \
--layer_depths_file avg_layer_depths.json \
--output_file aligned_coord_example.csv

And obtain:

Example coordinate file#

cell_id  target_cell_type  x                    y                    z
1        4P                -92.0060469028378    -421.0797644340709   821.64
1        4P                -105.11025186647323  -399.30901042554444  796.56
2        4P                1.6844004295792985   -335.2109182277234   737.76
3        5P-ET             5.634215690400225    -382.48922026392205  950.92
3        5P-IT             -12.739372718728555  -508.88802416730545  826.44

Note that the x values have also changed because we have rotated the coordinates to an upright orientation (with pia at the top).
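The effect of the rotation on x can be seen in a minimal sketch. The code below covers only the rotation part of the transformation (no translation and no layer stretching), and the function name, the pia direction, and the points are all hypothetical.

```python
import math


def rotate_to_upright(points, pia_direction):
    """Rotate 2D points about the origin so `pia_direction` points to +y.

    `points` is a list of (x, y) pairs; `pia_direction` is a vector from
    the origin toward the pia in the original orientation.
    """
    dx, dy = pia_direction
    # Angle needed to turn the pia direction onto the +y axis.
    theta = math.pi / 2 - math.atan2(dy, dx)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    return [(x * cos_t - y * sin_t, x * sin_t + y * cos_t) for x, y in points]


# Hypothetical case: the pia lies toward +x, so the rotation is 90 degrees
# counterclockwise and both coordinates of each point change.
rotated = rotate_to_upright([(10.0, 0.0), (0.0, 5.0)], pia_direction=(1.0, 0.0))
print(rotated)
```

Unless the original image already had the pia at the top, a rotation like this mixes the original x and y values, which is why both columns differ after alignment while z is untouched.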

This aligned coordinate file can be used to generate histograms with the skelekeys-profiles-from-coords command line utility. You need to specify which column contains the depth values (here, the column labeled y) with the --depth_label argument.

You can use other columns in the CSV file to split the histograms across rows using the --index_label argument, and/or you can create multiple histograms per row (as in the compartment type histograms above) with the --hist_split_label argument.

For example:

skelekeys-profiles-from-coords \
--coordinate_file aligned_coord_example.csv \
--layer_depths_file avg_layer_depths.json \
--depth_label y \
--index_label cell_id \
--hist_split_label target_cell_type \
--output_hist_file aligned_coord_hist.csv
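To make the roles of the index and split labels concrete, the sketch below groups the aligned example coordinates by cell_id (one row per value) and by target_cell_type (one histogram per value within a row). It is only an illustration of the grouping: the function name, the 100-unit bin width, and the convention that depth is the negated y value are assumptions, and the layout of the real output CSV may differ.

```python
import csv
import io
from collections import defaultdict

# Rows from the aligned coordinate example above (x and z omitted).
aligned_csv = """\
cell_id,target_cell_type,y
1,4P,-421.0797644340709
1,4P,-399.30901042554444
2,4P,-335.2109182277234
3,5P-ET,-382.48922026392205
3,5P-IT,-508.88802416730545
"""


def split_histograms(rows, depth_label, index_label, hist_split_label,
                     bin_width=100.0):
    """Return {index value: {split value: {bin start depth: count}}}.

    Depth is taken as the negated value in `depth_label`, assuming pia at
    0 and more negative values deeper. The bin width is arbitrary.
    """
    hists = defaultdict(lambda: defaultdict(lambda: defaultdict(int)))
    for row in rows:
        depth = -float(row[depth_label])
        bin_start = (depth // bin_width) * bin_width
        hists[row[index_label]][row[hist_split_label]][bin_start] += 1
    return hists


rows = list(csv.DictReader(io.StringIO(aligned_csv)))
hists = split_histograms(rows, "y", "cell_id", "target_cell_type")
for cell_id in sorted(hists):
    for cell_type in sorted(hists[cell_id]):
        print(cell_id, cell_type, dict(hists[cell_id][cell_type]))
```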