Full Analysis Example

This notebook outlines the essential steps in the analysis with eggplant, using a data set of the Mouse Olfactory Bulb (MOB).

Load necessary packages

[1]:
import matplotlib.pyplot as plt
import scanpy as sc

import os.path as osp
import os
from PIL import Image

import anndata as ad
import pandas as pd
import sys

import eggplant as eg


Set plotting parameters

[2]:
from matplotlib import rcParams
rcParams["figure.facecolor"] = "None"
rcParams["axes.facecolor"] = "None"

Load and process Reference Information

Define paths to data and reference images, instructions for how to access this data can be found at the - to this package - associated github repository

[3]:
DATA_DIR = "../../../data/mob/"
REF_IMG_PTH = "../../../data/mob/references/reference.png"
REF_LMK_PTH = "../../../data/mob/references/landmarks.tsv"

Load reference image

[4]:
ref_img = Image.open(REF_IMG_PTH)
ref_lmk = pd.read_csv(REF_LMK_PTH,sep="\t",header = 0,index_col =0)

Inspect the reference image to make sure we loaded the correct image. Also plot the landmark coordinates to ensure that these match the reference.

[5]:
plt.figure(figsize = (5,5))
lmk_cmap = eg.constants.LANDMARK_CMAP
cmap = eg.pl.ColorMapper(lmk_cmap)

plt.imshow(ref_img,alpha = 0.7)
plt.scatter(ref_lmk.values[:,0],
            ref_lmk.values[:,1],
            marker = "*",
            edgecolor = "white",
            c = cmap(ref_lmk),
            s = 400,
           )
plt.axis("off")
plt.show()
../_images/notebooks_mouse-olfactory-bulb-analysis_11_0.png

Everything seems fine with the reference and landmark coordinates. We therefore proceed to convert the image to a grid with associated meta data (which region each grid point belongs to). The number of points in the grid will be close to n_approx_points, the more points, the higher the resolution. The argument n_regionsspecifies how many regions we have defined, by different colors, in our reference.

[6]:
grid_crd,mta = eg.pp.reference_to_grid(ref_img,
                                    n_approx_points=1000,
                                    n_regions=1,
                                   )

We also plot the newly created grid to make sure it looks as we’d expect it to. We can again overlay the landmarks.

[7]:
plt.scatter(grid_crd[:,0],
            grid_crd[:,1],
            c= mta,
            cmap = plt.cm.binary_r,
           )

plt.scatter(ref_lmk.values[:,0],
            ref_lmk.values[:,1],
            marker = "*",
            c = "red",
            s = 200,
           )

plt.axis("equal")
plt.axis("off")
plt.show()
../_images/notebooks_mouse-olfactory-bulb-analysis_15_0.png

From the grid and landmarks we may now create a Reference object, which will be used in the transfer process.

[8]:
ref = eg.m.Reference(grid_crd,
                     landmarks = ref_lmk.values,
                     meta = dict(region = mta),
                    )

Load and process Expression Data

First, we load the anndata files from the designated data directory DATA_DIR

[9]:
adatas = {p.split(".")[0]:ad.read_h5ad(osp.join(DATA_DIR,"curated",p)) for p in os.listdir(osp.join(DATA_DIR,"curated"))}
adatas = {f"Rep{k}_MOB" : adatas[f"Rep{k}_MOB"] for k \
          in sorted([ int(a.lstrip("Rep_").rstrip("_MOB")) for a in adatas.keys()])}

Next, we execute some standard pre-processing steps such as filter, normalize, log-transform, and scaling. However, we also match the scales between the observed data and the reference (match_scales), after which we compute the distance for every spot to each landmark (get_landmark_distance). We supply the aforementioned function with the reference object, to conduct TPS warping and thus account for non-homogenous disruptions in the morphology.

[10]:
for adata in adatas.values():
    eg.pp.default_normalization(adata,
                                min_cells = 0.1,
                                total_counts = 1e4,
                                exclude_highly_expressed=False)

    eg.pp.match_scales(adata,ref)
    eg.pp.get_landmark_distance(adata,
                                reference=ref)

    eg.pp.spatial_smoothing(adata)

As is proper, we then inspect our processed data to see that everything looks as expected.

[11]:
eg.pl.visualize_observed(adatas,
                         features = "Nrgn",
                         n_rows = 2,
                         include_title = True,
                         fontsize = 20,
                         marker_size =[50] * 5 + [80]*7,
                         share_colorscale = False,
                         separate_colorbar = False,
                         colorbar_fontsize = 10,
                         side_size = 4,
                         show_landmarks = False,
                         quantile_scaling = True,
                        )
../_images/notebooks_mouse-olfactory-bulb-analysis_24_0.png

We now transfer our data to the reference using the transfer_to_reference function. Here we use the gene Nrgn as our target of interest, but this may be exchanged for any of the other two genes presented in the supplementary (or any gene of the users liking). We use \(1000\) epochs and cuda acceleration, if possible. We also set verbose=True in order to be able to follow the progress.

[12]:
losses = eg.fun.transfer_to_reference(adatas,
                                      reference=ref,
                                      layer = "smoothed",
                                      features = ["Nrgn","Apoe","Omp"],
                                      n_epochs=1000,
                                      device ="cuda",
                                      verbose=True,
                                     )
[Processing] ::  Model : Rep1_MOB | Feature : Nrgn | Transfer : 1/36
/home/alma.andersson/miniconda3/envs/eggplant2/lib/python3.8/site-packages/eggplant-0.1-py3.8.egg/eggplant/models.py:69: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
/home/alma.andersson/miniconda3/envs/eggplant2/lib/python3.8/site-packages/eggplant-0.1-py3.8.egg/eggplant/models.py:70: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1000/1000 [00:09<00:00, 100.69it/s]
[Processing] ::  Model : Rep1_MOB | Feature : Apoe | Transfer : 2/36
/home/alma.andersson/miniconda3/envs/eggplant2/lib/python3.8/site-packages/anndata/_core/anndata.py:120: ImplicitModificationWarning: Transforming to str index.
  warnings.warn("Transforming to str index.", ImplicitModificationWarning)
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1000/1000 [00:12<00:00, 77.18it/s]
[Processing] ::  Model : Rep1_MOB | Feature : Omp | Transfer : 3/36
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1000/1000 [00:13<00:00, 74.77it/s]
[Processing] ::  Model : Rep2_MOB | Feature : Nrgn | Transfer : 4/36
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1000/1000 [00:13<00:00, 76.27it/s]
[Processing] ::  Model : Rep2_MOB | Feature : Apoe | Transfer : 5/36
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1000/1000 [00:05<00:00, 183.44it/s]
[Processing] ::  Model : Rep2_MOB | Feature : Omp | Transfer : 6/36
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1000/1000 [00:05<00:00, 193.89it/s]
[Processing] ::  Model : Rep3_MOB | Feature : Nrgn | Transfer : 7/36
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1000/1000 [00:06<00:00, 146.23it/s]
[Processing] ::  Model : Rep3_MOB | Feature : Apoe | Transfer : 8/36
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1000/1000 [00:12<00:00, 77.32it/s]
[Processing] ::  Model : Rep3_MOB | Feature : Omp | Transfer : 9/36
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1000/1000 [00:12<00:00, 79.00it/s]
[Processing] ::  Model : Rep4_MOB | Feature : Nrgn | Transfer : 10/36
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1000/1000 [00:11<00:00, 88.35it/s]
[Processing] ::  Model : Rep4_MOB | Feature : Apoe | Transfer : 11/36
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1000/1000 [00:08<00:00, 113.84it/s]
[Processing] ::  Model : Rep4_MOB | Feature : Omp | Transfer : 12/36
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1000/1000 [00:11<00:00, 90.56it/s]
[Processing] ::  Model : Rep5_MOB | Feature : Nrgn | Transfer : 13/36

100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1000/1000 [00:12<00:00, 77.43it/s]
[Processing] ::  Model : Rep5_MOB | Feature : Apoe | Transfer : 14/36
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1000/1000 [00:12<00:00, 79.67it/s]
[Processing] ::  Model : Rep5_MOB | Feature : Omp | Transfer : 15/36
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1000/1000 [00:12<00:00, 81.19it/s]
[Processing] ::  Model : Rep6_MOB | Feature : Nrgn | Transfer : 16/36
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1000/1000 [00:11<00:00, 86.08it/s]
[Processing] ::  Model : Rep6_MOB | Feature : Apoe | Transfer : 17/36
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1000/1000 [00:12<00:00, 78.66it/s]
[Processing] ::  Model : Rep6_MOB | Feature : Omp | Transfer : 18/36
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1000/1000 [00:12<00:00, 80.73it/s]
[Processing] ::  Model : Rep7_MOB | Feature : Nrgn | Transfer : 19/36
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1000/1000 [00:12<00:00, 80.81it/s]
[Processing] ::  Model : Rep7_MOB | Feature : Apoe | Transfer : 20/36
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1000/1000 [00:12<00:00, 78.10it/s]
[Processing] ::  Model : Rep7_MOB | Feature : Omp | Transfer : 21/36
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1000/1000 [00:11<00:00, 87.55it/s]
[Processing] ::  Model : Rep8_MOB | Feature : Nrgn | Transfer : 22/36
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1000/1000 [00:11<00:00, 89.89it/s]
[Processing] ::  Model : Rep8_MOB | Feature : Apoe | Transfer : 23/36
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1000/1000 [00:11<00:00, 89.07it/s]
[Processing] ::  Model : Rep8_MOB | Feature : Omp | Transfer : 24/36
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1000/1000 [00:12<00:00, 77.77it/s]
[Processing] ::  Model : Rep9_MOB | Feature : Nrgn | Transfer : 25/36
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1000/1000 [00:11<00:00, 84.39it/s]
[Processing] ::  Model : Rep9_MOB | Feature : Apoe | Transfer : 26/36
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1000/1000 [00:12<00:00, 81.57it/s]
[Processing] ::  Model : Rep9_MOB | Feature : Omp | Transfer : 27/36
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1000/1000 [00:10<00:00, 94.29it/s]
[Processing] ::  Model : Rep10_MOB | Feature : Nrgn | Transfer : 28/36
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1000/1000 [00:10<00:00, 94.86it/s]
[Processing] ::  Model : Rep10_MOB | Feature : Apoe | Transfer : 29/36
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1000/1000 [00:11<00:00, 89.54it/s]
[Processing] ::  Model : Rep10_MOB | Feature : Omp | Transfer : 30/36
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1000/1000 [00:12<00:00, 78.62it/s]
[Processing] ::  Model : Rep11_MOB | Feature : Nrgn | Transfer : 31/36
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1000/1000 [00:10<00:00, 91.25it/s]
[Processing] ::  Model : Rep11_MOB | Feature : Apoe | Transfer : 32/36
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1000/1000 [00:13<00:00, 75.92it/s]
[Processing] ::  Model : Rep11_MOB | Feature : Omp | Transfer : 33/36
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1000/1000 [00:11<00:00, 86.27it/s]
[Processing] ::  Model : Rep12_MOB | Feature : Nrgn | Transfer : 34/36
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1000/1000 [00:13<00:00, 74.41it/s]
[Processing] ::  Model : Rep12_MOB | Feature : Apoe | Transfer : 35/36
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1000/1000 [00:12<00:00, 82.72it/s]
[Processing] ::  Model : Rep12_MOB | Feature : Omp | Transfer : 36/36
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1000/1000 [00:11<00:00, 85.15it/s]

Having fitted our models, we now inspect their respective losses - to control for any unexpected abberrations - and find that all our curves look as we would expect them to as well as seeming to have converged.

[19]:
eg.pl.model_diagnostics(losses = losses)
../_images/notebooks_mouse-olfactory-bulb-analysis_28_0.png

Finally, we visualize our results

[20]:
eg.pl.visualize_transfer(ref,
                         n_cols = 6,
                         attributes = "Nrgn",
                         include_title = True,
                         fontsize = 15,
                         marker_size =35,
                         share_colorscale = True,
                         separate_colorbar = False,
                         colorbar_fontsize = 20,
                         side_size = 4,
                         show_landmarks = False,
                         flip_y = True,
                         quantile_scaling = False,
                        )
../_images/notebooks_mouse-olfactory-bulb-analysis_30_0.png

Next, we conduct a spatial differential gene expression analysis (SDEA), with respect to the gene expression. If the interval of \([\mu - 2\sigma, \mu + 2\sigma]\) between any two features overalap at a spatial location, we consider their expression not significantly differentially expressed.

[21]:
res = eg.sdea.sdea(ref.adata,
                   group_col="feature",
                   n_std=2,
                  )

We visualize the results from our SDEA, gray areas indicate locations with no significant differential expression, color regions is the difference between the two compared features’ expression

[22]:
fig,ax = eg.pl.visualize_sdea_results(ref,
                                     res,
                                     n_cols = 3,
                                     marker_size = 50,
                                     side_size = 6,
                                     colorbar_orientation ="horizontal",
                                     title_fontsize = 20,
                                     colorbar_fontsize = 20,
                                     no_sig_color = "darkgray",
                                     reorder_axes = [0,2,1],
                                    )
../_images/notebooks_mouse-olfactory-bulb-analysis_34_0.png

Finally, we generate composite representations and visualize them

[23]:
ref.composite_representation(by = "feature")
[24]:
eg.pl.visualize_transfer(ref,
                         attributes = "composite",
                         n_cols = 3,
                         include_title = True,
                         fontsize = 20,
                         colorbar_fontsize = 20,
                         marker_size =62,
                         share_colorscale = True,
                         separate_colorbar = False,
                         side_size = 6,
                         show_landmarks = False,
                         flip_y = True,
                         quantile_scaling = False,
                        )
../_images/notebooks_mouse-olfactory-bulb-analysis_37_0.png