Tutorial: scCAMEL-SWAPLINE Mouse Dentate Gyrus to Human Glioblastoma

This tutorial demonstrates the SWAPLINE workflow with the published PyPI scCAMEL package. It trains a reference neural-network classifier on mouse dentate gyrus clusters, saves/reloads the model, and predicts human glioblastoma lineage-like cluster signals.

Original Article: "Neural network learning defines glioblastoma features to be of neural crest perivascular or radial glia lineages", Sci. Adv., 2022

Package: scCAMEL from PyPI, tested here with version 0.47b0.

Method: scCAMEL-SWAPLINE.v1

Author: Yizhou Hu, Research Group: Ernfors lab

Links to the datasets: Mouse Dentate Gyrus, Human glioblastoma. Dataset references: Hochgerner and Zeisel, et al.; Couturier, et al.

Resource gene lists: cell cycle genes, Homologene-HumanMouse

Training

[1]:
import datetime
today=f"{datetime.datetime.now():%Y-%m-%d}"
today
[1]:
'2026-06-09'
[2]:
import torch
import torch.nn as nn
from torch.autograd import Variable
import torch.utils.data as Data
import torchvision
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
import torch.utils.data as data_utils
from matplotlib import cm
import numpy as np
import pandas as pd
import pickle
from scipy.spatial.distance import cdist, pdist, squareform
from sklearn.linear_model import LogisticRegression, LogisticRegressionCV
from sklearn.model_selection import StratifiedShuffleSplit
from collections import defaultdict
from sklearn import preprocessing
import matplotlib.patches as mpatches
import torch.nn.functional as F
import math
#import gpytorch

import urllib.request
import os.path
import os
import sys
import importlib.metadata as importlib_metadata
from pathlib import Path
from scipy.io import loadmat
from math import floor
import anndata

PROJECT_ROOT = Path("/mnt/e/YZstudio/OneDrive/Research/Dataset/Brain_Adult_mouse")
PUBLIC_DATASET = Path("/mnt/f/Dropbox/data/proj/PE_HYZ/PublicDataSet")
VERSION_ROOT = Path("/mnt/e/Loal_Temp/Vicuna_Example/scCAMEL_VICUNA_updated_20260605")
OUTPUT_DIR = VERSION_ROOT / "outputs" / "swapline_pypi047b0_tutorial"
OUTPUT_DIR.mkdir(parents=True, exist_ok=True)
os.chdir(PROJECT_ROOT)

# Make plots inline
%pylab inline

%pylab is deprecated, use %matplotlib inline and import the required libraries.
Populating the interactive namespace from numpy and matplotlib
/home/huyiz/anaconda3/envs/py310/lib/python3.10/site-packages/IPython/core/magics/pylab.py:162: UserWarning: pylab import has clobbered these variables: ['floor']
`%matplotlib` prevents importing * from pylab and numpy
  warn("pylab import has clobbered these variables: %s"  % clobbered +
[3]:
torch.manual_seed(1)    # reproducible
[3]:
<torch._C.Generator at 0x7b5b057d6410>
[4]:
import scCAMEL as scm
from scCAMEL import CamelPrefiltering
from scCAMEL import CamelSwapline
from scCAMEL import CamelEvo

SC_CAMEL_VERSION = importlib_metadata.version("scCAMEL")
print("Using installed scCAMEL:", SC_CAMEL_VERSION)
print("scCAMEL import path:", Path(scm.__file__).parent)
print("Tutorial outputs:", OUTPUT_DIR)

Using installed scCAMEL: 0.47b0
scCAMEL import path: /home/huyiz/anaconda3/envs/py310/lib/python3.10/site-packages/scCAMEL
Tutorial outputs: /mnt/e/Loal_Temp/Vicuna_Example/scCAMEL_VICUNA_updated_20260605/outputs/swapline_pypi047b0_tutorial
[5]:
os.chdir(PROJECT_ROOT)
Path.cwd()

[5]:
PosixPath('/mnt/e/YZstudio/OneDrive/Research/Dataset/Brain_Adult_mouse')
[6]:
scref=anndata.read_h5ad(PROJECT_ROOT / "ZeiselDentateGyrus_Ref2023-05-27.h5ad")
scref

[6]:
AnnData object with n_obs × n_vars = 5454 × 14545
    obs: 'Cluster', 'Color'
[7]:
set(scref.obs["Cluster"])
[7]:
{'Astrocytes',
 'Cajal-Retzius',
 'Cck-Tox',
 'Endo',
 'GABA',
 'Granule',
 'Microglia',
 'Mossy',
 'NFOL',
 'Neuroblast',
 'OLIG',
 'OPC',
 'PVM',
 'Peri/VLMC',
 'nIPC/Rgl'}
[8]:
scref.obs.groupby(["Cluster"]).count()
/tmp/ipykernel_44346/1119954423.py:1: FutureWarning: The default of observed=False is deprecated and will be changed to True in a future version of pandas. Pass observed=False to retain current behavior or observed=True to adopt the future default and silence this warning.
  scref.obs.groupby(["Cluster"]).count()
[8]:
Color
Cluster
Astrocytes 335
Cajal-Retzius 93
Cck-Tox 27
Endo 165
GABA 99
Granule 3045
Microglia 169
Mossy 137
NFOL 35
Neuroblast 874
OLIG 79
OPC 66
PVM 18
Peri/VLMC 59
nIPC/Rgl 253
[9]:
# If the matrix is sparse, convert it to a dense matrix:
#scref.X=scref.X.todense()
[10]:

# Apply the function to balance clusters and label up-sampled items
scref= scm.CamelPrefiltering.balance_clusters_adata(scref, 'Cluster',5)
scref
/home/huyiz/anaconda3/envs/py310/lib/python3.10/site-packages/anndata/_core/anndata.py:1756: UserWarning: Observation names are not unique. To make them unique, call `.obs_names_make_unique`.
  utils.warn_names_duplicates("obs")
[10]:
AnnData object with n_obs × n_vars = 13371 × 14545
    obs: 'Cluster', 'Color', 'upsampled'
[11]:
scref.obs.groupby(["Cluster"]).count()
/tmp/ipykernel_44346/1119954423.py:1: FutureWarning: The default of observed=False is deprecated and will be changed to True in a future version of pandas. Pass observed=False to retain current behavior or observed=True to adopt the future default and silence this warning.
  scref.obs.groupby(["Cluster"]).count()
[11]:
Color upsampled
Cluster
Astrocytes 944 944
Cajal-Retzius 702 702
Cck-Tox 636 636
Endo 774 774
GABA 708 708
Granule 3045 3045
Microglia 778 778
Mossy 746 746
NFOL 644 644
Neuroblast 874 874
OLIG 688 688
OPC 675 675
PVM 627 627
Peri/VLMC 668 668
nIPC/Rgl 862 862
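
The balancing step can be pictured with a small pandas sketch. It is not the scCAMEL implementation: the helper name `upsample_to_minimum` and its `min_cells` target are assumptions made for illustration, and the meaning of the third argument (`5`) of `balance_clusters_adata` is not documented in this tutorial. The sketch resamples small clusters with replacement and flags the added rows, which would also explain why duplicated observation names are reported after balancing.

```python
import numpy as np
import pandas as pd

def upsample_to_minimum(obs, cluster_col, min_cells, seed=0):
    """Hypothetical helper: resample small clusters with replacement up to min_cells."""
    rng = np.random.default_rng(seed)
    parts = []
    for _, group in obs.groupby(cluster_col, observed=True):
        # Original cells are always kept and flagged as not up-sampled.
        parts.append(group.assign(upsampled=False))
        missing = min_cells - len(group)
        if missing > 0:
            # Draw the missing number of cells from the same cluster, with replacement.
            picks = rng.choice(len(group), size=missing, replace=True)
            parts.append(group.iloc[picks].assign(upsampled=True))
    return pd.concat(parts)

toy_obs = pd.DataFrame(
    {"Cluster": ["PVM"] * 3 + ["Granule"] * 10},
    index=[f"cell{i}" for i in range(13)],
)
balanced = upsample_to_minimum(toy_obs, "Cluster", min_cells=6)
print(balanced.groupby("Cluster")["upsampled"].agg(["size", "sum"]))
```

Large clusters are left untouched in this sketch, in line with the Granule cluster keeping its 3045 cells in the table above.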

Prefiltering_and_SelectFeatures

[12]:
# if needed scref.X=scref.X.todense()
scref.X=scref.X.todense()
scref=scm.CamelPrefiltering.DataScaling(scref)
/home/huyiz/anaconda3/envs/py310/lib/python3.10/site-packages/anndata/_core/storage.py:39: ImplicitModificationWarning: X should not be a np.matrix, use np.ndarray instead.
  warnings.warn(msg, ImplicitModificationWarning)
[13]:
dfpdt=pd.DataFrame(scref.X.T,index=scref.var.index,columns=scref.obs.index)
dfpdt.shape
[13]:
(14545, 13371)
[14]:
path=str(PUBLIC_DATASET) + "/"
dictfilename1="Homologene_mouse2human_dict2.pickle"
dfpdt= scm.CamelPrefiltering.TransSpeciesGeneName(dfm=dfpdt, dictfilename=dictfilename1, path=path)
samegene=set(dfpdt.index)
len(samegene)

[14]:
12516
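
A minimal sketch of the gene-name translation step, assuming the pickle file stores a plain dictionary from mouse gene symbols to human gene symbols. The toy dictionary and matrix below are invented for illustration; how `TransSpeciesGeneName` treats unmapped or duplicated genes is an assumption, although the drop from 14545 to 12516 genes is consistent with genes being discarded at this step.

```python
import pandas as pd

# Toy mouse-to-human mapping; the real dictionary is loaded from the pickle file.
mouse2human = {"Gfap": "GFAP", "Sox2": "SOX2", "Aqp4": "AQP4"}

# Genes x cells matrix, indexed by mouse gene symbols.
expr = pd.DataFrame(
    [[1.0, 0.0], [2.0, 3.0], [0.5, 0.0], [4.0, 1.0]],
    index=["Gfap", "Sox2", "Gm1234", "Aqp4"],
    columns=["cell1", "cell2"],
)

# Keep only genes with a known human counterpart, then rename them.
has_ortholog = expr.index.isin(list(mouse2human))
translated = expr.loc[has_ortholog].copy()
translated.index = translated.index.map(mouse2human)

# If two mouse genes map to the same human symbol, keep the first one (assumption).
translated = translated[~translated.index.duplicated(keep="first")]
print(translated)
```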
[15]:
dfpdt
[15]:
10X46_1_ACTCTATGGTACGT-1 10X43_1_AGGGACGATCTCCG-1 10X46_1_GACAACTGTTGACG-1 10X46_1_CTGTAACTGGTCTA-1 10X43_1_TGCAAGTGTCTCCG-1 10X43_1_GTAACGTGCCTAAG-1 10X43_1_ATCACGGAGGAGTG-1 10X43_1_AATTGTGAGTTGCA-1 10X43_1_TGGTCAGACTATTC-1 10X43_1_ACAGTCGAGGGACA-1 ... 10X43_1_CTTTCAGAAACCAC-1 10X43_1_ATAACCCTCCTCAC-1 10X43_1_GCAGGCACACCCAA-1 10X43_1_CTTTCAGAAACCAC-1 10X43_1_CAACGAACCAGTCA-1 10X46_1_TGAACCGAATGCCA-1 10X43_1_GCAGGCACACCCAA-1 10X46_1_CTCGCATGTATCTC-1 10X43_1_GTATTCACCTAGTG-1 10X46_1_AAGACAGACAGTTG-1
A2M 0.0 0.000000 0.000000 0.000000 0.000000 0.000000 0.0 0.0 0.0 0.0 ... 0.00000 0.000000 0.0 0.00000 0.000000 0.0 0.0 0.0 0.000000 0.0
AAAS 0.0 0.000000 0.000000 0.000000 0.000000 0.000000 0.0 0.0 0.0 0.0 ... 0.00000 0.000000 0.0 0.00000 0.000000 0.0 0.0 0.0 0.000000 0.0
AACS 0.0 0.000000 0.000000 0.000000 0.000000 0.000000 0.0 0.0 0.0 0.0 ... 0.00000 0.000000 0.0 0.00000 0.000000 0.0 0.0 0.0 0.000000 0.0
AAED1 0.0 0.000000 0.000000 0.000000 0.000000 0.000000 0.0 0.0 0.0 0.0 ... 0.57404 0.000000 0.0 0.57404 0.000000 0.0 0.0 0.0 0.000000 0.0
AAGAB 0.0 0.000000 0.700669 0.826938 0.000000 1.424036 0.0 0.0 0.0 0.0 ... 0.00000 1.042694 0.0 0.00000 0.000000 0.0 0.0 0.0 1.374609 0.0
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
ZXDC 0.0 0.000000 0.000000 0.000000 0.000000 0.000000 0.0 0.0 0.0 0.0 ... 0.00000 0.000000 0.0 0.00000 0.000000 0.0 0.0 0.0 0.000000 0.0
ZYG11B 0.0 0.000000 1.401339 0.826938 1.417607 0.000000 0.0 0.0 0.0 0.0 ... 0.00000 0.000000 0.0 0.00000 0.000000 0.0 0.0 0.0 0.000000 0.0
ZYX 0.0 0.000000 0.000000 0.000000 0.708804 0.000000 0.0 0.0 0.0 0.0 ... 0.00000 1.042694 0.0 0.00000 1.837025 0.0 0.0 0.0 0.000000 0.0
ZZEF1 0.0 0.696231 0.000000 0.000000 0.000000 0.000000 0.0 0.0 0.0 0.0 ... 0.57404 0.000000 0.0 0.57404 0.000000 0.0 0.0 0.0 0.000000 0.0
ZZZ3 0.0 0.000000 0.000000 0.000000 0.000000 0.000000 0.0 0.0 0.0 0.0 ... 0.00000 0.000000 0.0 0.00000 0.000000 0.0 0.0 0.0 0.000000 0.0

12516 rows × 13371 columns

[16]:
scref2= anndata.AnnData(dfpdt.T.astype(float))
scref2
/home/huyiz/anaconda3/envs/py310/lib/python3.10/site-packages/anndata/_core/anndata.py:1756: UserWarning: Observation names are not unique. To make them unique, call `.obs_names_make_unique`.
  utils.warn_names_duplicates("obs")
[16]:
AnnData object with n_obs × n_vars = 13371 × 12516
[17]:
scref2.obs=scref.obs
[18]:
scref=scref2.copy()
/home/huyiz/anaconda3/envs/py310/lib/python3.10/site-packages/anndata/_core/anndata.py:1756: UserWarning: Observation names are not unique. To make them unique, call `.obs_names_make_unique`.
  utils.warn_names_duplicates("obs")
[19]:
path=str(PUBLIC_DATASET) + '/'
filename='PANTHER_cell_cycle_genes.txt'
scref= scm.CamelPrefiltering.prefilter(datax=scref,filename=filename, path=path)

CamelRunning_Prefilter......
CamelRunning_Prefilter......Finished
[20]:
scref=scm.CamelPrefiltering.DataScaling(scref)
[21]:
scref.obs
[21]:
Cluster Color upsampled
10X46_1_ACTCTATGGTACGT-1 Granule #b48c82 False
10X43_1_AGGGACGATCTCCG-1 Granule #b48c82 False
10X46_1_GACAACTGTTGACG-1 Granule #b48c82 False
10X46_1_CTGTAACTGGTCTA-1 Granule #b48c82 False
10X43_1_TGCAAGTGTCTCCG-1 Granule #b48c82 False
... ... ... ...
10X46_1_TGAACCGAATGCCA-1 PVM #8b6564 True
10X43_1_GCAGGCACACCCAA-1 PVM #8b6564 True
10X46_1_CTCGCATGTATCTC-1 PVM #8b6564 True
10X43_1_GTATTCACCTAGTG-1 PVM #8b6564 True
10X46_1_AAGACAGACAGTTG-1 PVM #8b6564 True

13371 rows × 3 columns

[22]:
scref=scm.CamelPrefiltering.SelectFeatures(datax=scref, clustername='Cluster',methodname='Enrichment_shortcut', numbergenes=50, folderchange=1.5)
Camel...Running: clusteringValue1...
Camel...Running: clusteringValue2...
[Processing]-0%--6%--13%--20%--26%--33%--40%--46%--53%--60%--66%--73%--80%--86%--93%-Camel...Running: CrossChecking...
Camel...Running: output genelist...
[23]:
len(scref.var.index[scref.var["MVgene"]])
[23]:
2246
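
The `Enrichment_shortcut` method is not described in this tutorial, so the following is only a sketch of what the `numbergenes` and `folderchange` arguments could mean: for each cluster, rank genes by the ratio of mean expression inside the cluster to mean expression in all other cells, and keep the top genes above the fold-change threshold. The function name `select_markers` and the ranking rule are assumptions, not the scCAMEL algorithm.

```python
import pandas as pd

def select_markers(expr, clusters, n_genes=2, fold_change=1.5, eps=1e-9):
    """Hypothetical fold-change selection over a cells x genes matrix."""
    selected = set()
    for name in pd.unique(clusters):
        inside = expr.loc[clusters == name].mean(axis=0)
        outside = expr.loc[clusters != name].mean(axis=0)
        ratio = (inside + eps) / (outside + eps)
        # Keep genes above the threshold, strongest enrichment first.
        enriched = ratio[ratio >= fold_change].sort_values(ascending=False)
        selected.update(enriched.index[:n_genes])
    return sorted(selected)

expr = pd.DataFrame(
    {"GeneA": [5, 6, 0, 0], "GeneB": [0, 0, 4, 5], "GeneC": [1, 1, 1, 1]},
    index=["c1", "c2", "c3", "c4"], dtype=float,
)
clusters = pd.Series(["Astro", "Astro", "Neuron", "Neuron"], index=expr.index)
print(select_markers(expr, clusters))
```

Because the per-cluster lists are merged into one set, the final number of selected genes can be smaller than the number of clusters times `numbergenes`.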
[24]:
scref2=scref.copy()
/home/huyiz/anaconda3/envs/py310/lib/python3.10/site-packages/anndata/_core/anndata.py:1756: UserWarning: Observation names are not unique. To make them unique, call `.obs_names_make_unique`.
  utils.warn_names_duplicates("obs")
[25]:
########################################################
# Remember to change the file path in tftable
########################################################
scref = scm.CamelPrefiltering.LabelGene_Scaling(
    datax=scref2,
    TPTT=100000,
    mprotogruop=scref2.obs["Cluster"].values,
    commongene=None,
    sharedMVgenes=None,
    std_scaling=True,
    tftable=str(PUBLIC_DATASET / "FantomTF2CLUSTER_human_official.txt"),
    learninggroup="train",
)

CamelRunning---GenesScaling......
CamelRunning---TrainingGenesScaling......Finished
[26]:
scref
[26]:
AnnData object with n_obs × n_vars = 13371 × 12516
    obs: 'Cluster', 'Color', 'upsampled', 'mtrain_index'
    var: 'Filter1', 'MVgene', 'RefGeneList'
    uns: 'train_set_gene', 'mclasses_names'
    obsm: 'train_set_values'

Neural-Network learning

[27]:
net=scm.CamelPrefiltering.NNclassifer(
   datax=scref,
    epochNum=150,
    learningRate=0.005,
    verbose=0,
    optimizerMmentum=0.8,
    dropout=0.3,
    #imizer__nesterov=True,
    )
CamelRunning---NNclasffier_in_cpu.......
CamelRunning---NNclasffier_in_cpu.......Finished
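
To make the training arguments concrete, here is a minimal PyTorch sketch of a dropout classifier trained with SGD and momentum, reusing the values passed above (150 epochs, learning rate 0.005, momentum 0.8, dropout 0.3). The architecture, layer sizes, and toy data are assumptions for illustration; the network that `NNclassifer` actually builds may differ.

```python
import torch
import torch.nn as nn

torch.manual_seed(1)

# Toy data: 3 well-separated classes, 20 features, 60 cells per class.
n_classes, n_features, n_per_class = 3, 20, 60
centers = torch.randn(n_classes, n_features) * 3
X = torch.cat([c + torch.randn(n_per_class, n_features) for c in centers])
y = torch.arange(n_classes).repeat_interleave(n_per_class)

# Hypothetical architecture; the layer sizes are not taken from scCAMEL.
model = nn.Sequential(
    nn.Linear(n_features, 32),
    nn.ReLU(),
    nn.Dropout(p=0.3),
    nn.Linear(32, n_classes),
)
optimizer = torch.optim.SGD(model.parameters(), lr=0.005, momentum=0.8)
loss_fn = nn.CrossEntropyLoss()

losses = []
model.train()
for epoch in range(150):
    optimizer.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

# Dropout is switched off for prediction; softmax turns logits into class scores.
model.eval()
with torch.no_grad():
    probs = torch.softmax(model(X), dim=1)
accuracy = (probs.argmax(dim=1) == y).float().mean().item()
print(f"final loss {losses[-1]:.3f}, training accuracy {accuracy:.2f}")
```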

Accuracy plot: the overall clustering accuracy is ~95%.

[28]:
ax=scm.CamelPrefiltering.AccuracyPlot( nnModel=net, accCutoff=0.95,
                 Xlow=-1, Ylow=0.0, Yhigh=1,
               )
../_images/Tutorials_scCAMEL_SWAPLINEv2_Tutorial_scCAMEL-SWAPLINE_mouseDentateGyrus_humanGlioblastoma_37_0.png

Save and Reload the Trained Model

Note: The published scCAMEL 0.47b0 package includes CamelSwapline.save_camel_model and CamelSwapline.load_camel_model, so the tutorial uses those package helpers instead of redefining model serialization inside the notebook.

[29]:
Path.cwd()

[29]:
PosixPath('/mnt/e/YZstudio/OneDrive/Research/Dataset/Brain_Adult_mouse')

Tutorial note: The checkpoint files are written to OUTPUT_DIR / "camel_checkpoints2" so they stay with the rest of the tutorial outputs.

[31]:
# Save the trained network (net) together with the reference AnnData metadata
paths = scm.CamelSwapline.save_camel_model(scref, net, out_dir=str(OUTPUT_DIR / "camel_checkpoints2"), prefix="camel_nn_mouseDG")
print("Saved files:", paths)

Saved files: {'meta': '/mnt/e/Loal_Temp/Vicuna_Example/scCAMEL_VICUNA_updated_20260605/outputs/swapline_pypi047b0_tutorial/camel_checkpoints2/camel_nn_mouseDG_meta.json', 'weights': '/mnt/e/Loal_Temp/Vicuna_Example/scCAMEL_VICUNA_updated_20260605/outputs/swapline_pypi047b0_tutorial/camel_checkpoints2/camel_nn_mouseDG_weights.pt', 'history': '/mnt/e/Loal_Temp/Vicuna_Example/scCAMEL_VICUNA_updated_20260605/outputs/swapline_pypi047b0_tutorial/camel_checkpoints2/camel_nn_mouseDG_history.json'}
[32]:
net3= scm.CamelSwapline.load_camel_model(checkpoint_dir=str(OUTPUT_DIR / "camel_checkpoints2"), prefix="camel_nn_mouseDG",dropoutVal=0.3, device="cpu")
[33]:
ax=scm.CamelPrefiltering.AccuracyPlot( nnModel=net3, accCutoff=0.95,
                 Xlow=-1, Ylow=0.0, Yhigh=1,
               )
../_images/Tutorials_scCAMEL_SWAPLINEv2_Tutorial_scCAMEL-SWAPLINE_mouseDentateGyrus_humanGlioblastoma_45_0.png

Make prediction and visualize in radar plot

[34]:
net=scm.CamelPrefiltering.NNclassifer(
   datax=scref,
    epochNum=40,
    learningRate=0.005,
    verbose=0,
    optimizerMmentum=0.8,
    dropout=0.3,
    #imizer__nesterov=True,
    )
CamelRunning---NNclasffier_in_cpu.......
CamelRunning---NNclasffier_in_cpu.......Finished
[35]:
scref
[35]:
AnnData object with n_obs × n_vars = 13371 × 12516
    obs: 'Cluster', 'Color', 'upsampled', 'mtrain_index'
    var: 'Filter1', 'MVgene', 'RefGeneList'
    uns: 'train_set_gene', 'mclasses_names'
    obsm: 'train_set_values'
[36]:
#if color is not defined: scref.obs[ 'color']
predefined_colors = pd.Series({
'Astrocytes':   [190,  10,  10],'Cajal-Retzius': [225, 160,  30],'Cck-Tox':    [217, 215,   7],
             'Endo':    [170, 180, 170], 'GABA':   [130, 140, 140],'Granule':    [180, 140, 130],
             'Microglia':  [100, 100, 240],'Mossy': [ 80, 235, 255],'NFOL':[190, 235, 255],
              'Neuroblast':[210, 255, 215],'OLIG':[230, 140, 120], 'OPC':  [255, 195,  28],
              'PVM':  [139, 101, 100],'Pericytes':  [252, 183,  26],'Radial Glia-like':   [214, 194,  39],
              'VLMC':  [255, 120, 155],'nIPC': [250, 145,  45],'hRgl2a':  [250, 125,  25],
              'hDA0':    [190, 200, 190],'hOPC':   [255,  35, 155],'hRN':     [199, 121,  41],
              'hNbGaba': [ 40,  55, 130],'hGaba':  [  7,  121, 61],'hOMTN':   [ 95, 186,  70],
              'hSert':   [ 50, 180, 180],'nIPC/Rgl':   [245, 205, 170], 'Peri/VLMC':   [185, 245, 30],
              'eSCc':[205,205,220]
})
[37]:
#if color is not defined
#del scref.obs["color"]
#scref=scm.CamelSwapline.addcolor(datax=scref,clustername="Cluster", colorcode="color")
scref = scm.CamelSwapline.add_color2(scref, clustername="Cluster", colorcode="color",predef=predefined_colors)

[38]:
#if want to define your own order
#scref.uns["mwanted_order"] =[ 'Mossy', 'Cajal-Retzius', 'Cck-Tox', 'GABA',  'Endo', 'Peri/VLMC', 'PVM', 'Microglia', 'Astrocytes', 'OLIG',
 #'NFOL', 'OPC', 'nIPC/Rgl','Neuroblast','Granule']

scref.uns["mwanted_order"] =list(sort(list(set(scref.obs["Cluster"]))))


[39]:
#radar  plot
scref=scm.CamelSwapline.prediction(datax=scref, mcolor_dict=scref.uns["refcolor_dict"] ,net=net,learninggroup="train", radarplot=True,fontsizeValue=10,
                       ncolnm=3, bbValue=(1.5, 1.55)  )
#plt.savefig("upload_%s_RadarPlot_cluster.pdf"%today,bbox_inches='tight')
../_images/Tutorials_scCAMEL_SWAPLINEv2_Tutorial_scCAMEL-SWAPLINE_mouseDentateGyrus_humanGlioblastoma_53_0.png
[40]:
#radar  plot
scref=scm.CamelSwapline.prediction(datax=scref, mcolor_dict=scref.uns["refcolor_dict"] ,net=net3,learninggroup="train", radarplot=True,fontsizeValue=10,
                       ncolnm=3, bbValue=(1.5, 1.55)  )
#plt.savefig("upload_%s_RadarPlot_cluster.pdf"%today,bbox_inches='tight')
../_images/Tutorials_scCAMEL_SWAPLINEv2_Tutorial_scCAMEL-SWAPLINE_mouseDentateGyrus_humanGlioblastoma_54_0.png

Permutation control

[41]:
## The whole data matrix is randomized; the red X represents the 95% confidence level for each cell type
[42]:
dftest0, ratiodf=scm.CamelSwapline.permutationTest(datax=scref,net=net,num=20, plotshow=True)
/home/huyiz/anaconda3/envs/py310/lib/python3.10/site-packages/scCAMEL/CamelSwapline.py:1848: UserWarning:
The palette list has fewer values (1) than needed (15) and will cycle, which may produce an uninterpretable plot.
  ax = sns.violinplot(scale="width", bw=0.4, cut=2, gridsize=100, saturation=0.9, scale_hue=False,
/home/huyiz/anaconda3/envs/py310/lib/python3.10/site-packages/scCAMEL/CamelSwapline.py:1848: FutureWarning:

The `scale` parameter has been renamed and will be removed in v0.15.0. Pass `density_norm='width'` for the same effect.
  ax = sns.violinplot(scale="width", bw=0.4, cut=2, gridsize=100, saturation=0.9, scale_hue=False,
/home/huyiz/anaconda3/envs/py310/lib/python3.10/site-packages/scCAMEL/CamelSwapline.py:1848: FutureWarning:

The `scale_hue` parameter has been replaced and will be removed in v0.15.0. Pass `common_norm=True` for the same effect.
  ax = sns.violinplot(scale="width", bw=0.4, cut=2, gridsize=100, saturation=0.9, scale_hue=False,
/home/huyiz/anaconda3/envs/py310/lib/python3.10/site-packages/scCAMEL/CamelSwapline.py:1848: FutureWarning:

The `bw` parameter is deprecated in favor of `bw_method`/`bw_adjust`.
Setting `bw_method=0.4`, but please see docs for the new parameters
and update your code. This will become an error in seaborn v0.15.0.

  ax = sns.violinplot(scale="width", bw=0.4, cut=2, gridsize=100, saturation=0.9, scale_hue=False,
<Figure size 640x480 with 0 Axes>
../_images/Tutorials_scCAMEL_SWAPLINEv2_Tutorial_scCAMEL-SWAPLINE_mouseDentateGyrus_humanGlioblastoma_57_2.png
[43]:
dftest0, ratiodf=scm.CamelSwapline.permutationTest(datax=scref,net=net3,num=20, plotshow=True)
/home/huyiz/anaconda3/envs/py310/lib/python3.10/site-packages/scCAMEL/CamelSwapline.py:1848: UserWarning:
The palette list has fewer values (1) than needed (15) and will cycle, which may produce an uninterpretable plot.
  ax = sns.violinplot(scale="width", bw=0.4, cut=2, gridsize=100, saturation=0.9, scale_hue=False,
/home/huyiz/anaconda3/envs/py310/lib/python3.10/site-packages/scCAMEL/CamelSwapline.py:1848: FutureWarning:

The `scale` parameter has been renamed and will be removed in v0.15.0. Pass `density_norm='width'` for the same effect.
  ax = sns.violinplot(scale="width", bw=0.4, cut=2, gridsize=100, saturation=0.9, scale_hue=False,
/home/huyiz/anaconda3/envs/py310/lib/python3.10/site-packages/scCAMEL/CamelSwapline.py:1848: FutureWarning:

The `scale_hue` parameter has been replaced and will be removed in v0.15.0. Pass `common_norm=True` for the same effect.
  ax = sns.violinplot(scale="width", bw=0.4, cut=2, gridsize=100, saturation=0.9, scale_hue=False,
/home/huyiz/anaconda3/envs/py310/lib/python3.10/site-packages/scCAMEL/CamelSwapline.py:1848: FutureWarning:

The `bw` parameter is deprecated in favor of `bw_method`/`bw_adjust`.
Setting `bw_method=0.4`, but please see docs for the new parameters
and update your code. This will become an error in seaborn v0.15.0.

  ax = sns.violinplot(scale="width", bw=0.4, cut=2, gridsize=100, saturation=0.9, scale_hue=False,
<Figure size 640x480 with 0 Axes>
../_images/Tutorials_scCAMEL_SWAPLINEv2_Tutorial_scCAMEL-SWAPLINE_mouseDentateGyrus_humanGlioblastoma_58_2.png
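
The idea behind the permutation control can be sketched with NumPy alone: shuffle the expression matrix, score the shuffled cells with the classifier, repeat, and take the 95th percentile of the null scores per class as a threshold. How `permutationTest` shuffles the data and summarizes the null distribution is an assumption here, and the fixed random linear layer below is only a stand-in for the trained network.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Stand-in classifier: fixed random linear weights followed by softmax.
n_cells, n_genes, n_classes = 200, 30, 4
weights = rng.normal(size=(n_genes, n_classes))
X = rng.normal(size=(n_cells, n_genes))

def permutation_null(X, weights, n_perm=20):
    """Score shuffled matrices and return the 95th percentile score per class."""
    null_scores = []
    for _ in range(n_perm):
        # Shuffle every entry of the matrix, destroying gene and cell structure.
        shuffled = rng.permutation(X.ravel()).reshape(X.shape)
        null_scores.append(softmax(shuffled @ weights))
    return np.percentile(np.vstack(null_scores), 95, axis=0)

thresholds = permutation_null(X, weights)
print(thresholds)
```

A score on real data that exceeds the threshold of its class is then unlikely to arise from randomized input alone.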

Cell_Type Purity

[44]:
#The ratio of the purity entropy of each cluster, based on its learning scores, is used as a measure of purity.
#The function returns a pandas dataframe sorted by the purity score
[45]:
dfpurity1=scm.CamelSwapline.PurityEstimationLearningScore(datax=scref, clusterlist="Cluster", elbow=False, figureplot=True)
<Figure size 640x480 with 0 Axes>
../_images/Tutorials_scCAMEL_SWAPLINEv2_Tutorial_scCAMEL-SWAPLINE_mouseDentateGyrus_humanGlioblastoma_61_1.png
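
One way to turn learning scores into an entropy-based purity value is sketched below. The exact formula used by `PurityEstimationLearningScore` is not given in this tutorial, so the definition here (one minus the normalized Shannon entropy of the mean score profile of a cluster) and the name `entropy_purity` are assumptions for illustration only.

```python
import numpy as np

def entropy_purity(scores, eps=1e-12):
    """Hypothetical purity: 1 minus the normalized Shannon entropy of the mean score profile."""
    p = np.asarray(scores, dtype=float).mean(axis=0)
    p = p / p.sum()
    entropy = -(p * np.log(p + eps)).sum()
    # Dividing by log(number of classes) maps the entropy to the range [0, 1].
    return 1.0 - entropy / np.log(len(p))

# Rows are cells, columns are reference cell types (toy numbers).
pure_cluster = np.array([[0.97, 0.01, 0.02], [0.95, 0.03, 0.02]])
mixed_cluster = np.array([[0.4, 0.3, 0.3], [0.3, 0.4, 0.3]])
print(entropy_purity(pure_cluster), entropy_purity(mixed_cluster))
```

Under this definition a cluster whose scores concentrate on one reference type scores close to 1, and a cluster with evenly spread scores scores close to 0.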

Association between cell types

[46]:
scref
[46]:
AnnData object with n_obs × n_vars = 13371 × 12516
    obs: 'Cluster', 'Color', 'upsampled', 'mtrain_index', 'color'
    var: 'Filter1', 'MVgene', 'RefGeneList'
    uns: 'train_set_gene', 'mclasses_names', 'refcolor_dict', 'mwanted_order', 'Celltype_Score_RefCellType', 'Celltype_OrderNumber'
    obsm: 'train_set_values', 'Celltype_Score', 'CelltypeScoreCoordinates'
[47]:
# The heatmap with hierarchical clustering represents the cell-type similarity or association.
# Colors from dark purple to light yellow represent association from low to high.
# The number inside each square indicates the association value.
[48]:
scm.CamelSwapline.CellTypeSimilarity(datax=scref, labelnum=True,  metricvalue='correlation',methodvalue="complete")
/home/huyiz/anaconda3/envs/py310/lib/python3.10/site-packages/scCAMEL/CamelSwapline.py:2472: FutureWarning: The default of observed=False is deprecated and will be changed to True in a future version of pandas. Pass observed=False to retain current behavior or observed=True to adopt the future default and silence this warning.
  dfpb2 = dfprob.groupby(["Cluster"]).mean()
<Figure size 1500x1500 with 0 Axes>
../_images/Tutorials_scCAMEL_SWAPLINEv2_Tutorial_scCAMEL-SWAPLINE_mouseDentateGyrus_humanGlioblastoma_65_2.png
[49]:
scm.CamelSwapline.CellTypeSimilarity(datax=scref, labelnum=False,  metricvalue='correlation',methodvalue="complete")
/home/huyiz/anaconda3/envs/py310/lib/python3.10/site-packages/scCAMEL/CamelSwapline.py:2472: FutureWarning: The default of observed=False is deprecated and will be changed to True in a future version of pandas. Pass observed=False to retain current behavior or observed=True to adopt the future default and silence this warning.
  dfpb2 = dfprob.groupby(["Cluster"]).mean()
<Figure size 1500x1500 with 0 Axes>
../_images/Tutorials_scCAMEL_SWAPLINEv2_Tutorial_scCAMEL-SWAPLINE_mouseDentateGyrus_humanGlioblastoma_66_2.png
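
The warning above shows that the similarity is computed from per-cluster mean scores (`dfprob.groupby(["Cluster"]).mean()`). A minimal sketch of that idea follows, using toy scores: average the score profile of each cluster and correlate the profiles. With `metricvalue='correlation'`, the clustering distance would be one minus this Pearson correlation. The hierarchical clustering and plotting done by `CellTypeSimilarity` are not reproduced here.

```python
import numpy as np
import pandas as pd

# Toy learning scores: rows are cells, columns are reference cell types.
scores = pd.DataFrame(
    [[0.8, 0.1, 0.1], [0.7, 0.2, 0.1], [0.1, 0.8, 0.1],
     [0.2, 0.7, 0.1], [0.1, 0.1, 0.8], [0.1, 0.2, 0.7]],
    columns=["Astrocytes", "Granule", "Microglia"],
)
scores["Cluster"] = ["Astrocytes"] * 2 + ["Granule"] * 2 + ["Microglia"] * 2

# Mean score profile per cluster, then Pearson correlation between profiles.
mean_profiles = scores.groupby("Cluster", observed=True).mean()
similarity = pd.DataFrame(
    np.corrcoef(mean_profiles.values),
    index=mean_profiles.index, columns=mean_profiles.index,
)
print(similarity.round(2))
```

Passing `observed=True` to `groupby` also silences the pandas `FutureWarning` shown in the outputs when the grouping column is categorical.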

Save data

[50]:
scref
[50]:
AnnData object with n_obs × n_vars = 13371 × 12516
    obs: 'Cluster', 'Color', 'upsampled', 'mtrain_index', 'color'
    var: 'Filter1', 'MVgene', 'RefGeneList'
    uns: 'train_set_gene', 'mclasses_names', 'refcolor_dict', 'mwanted_order', 'Celltype_Score_RefCellType', 'Celltype_OrderNumber'
    obsm: 'train_set_values', 'Celltype_Score', 'CelltypeScoreCoordinates'
[51]:
Path.cwd()

[51]:
PosixPath('/mnt/e/YZstudio/OneDrive/Research/Dataset/Brain_Adult_mouse')
[52]:
scref.uns["refcolor_dict"]= predefined_colors
[53]:
work_dir=str(OUTPUT_DIR)
QueryName="ZeiselMouseDG"
TrainingName="ZeiselMouseDG"
filename="%s_%s_Ref%s_MergeCluster.h5ad"%(QueryName,TrainingName,today)

[54]:
os.path.join(work_dir,filename)
[54]:
'/mnt/e/Loal_Temp/Vicuna_Example/scCAMEL_VICUNA_updated_20260605/outputs/swapline_pypi047b0_tutorial/ZeiselMouseDG_ZeiselMouseDG_Ref2026-06-09_MergeCluster.h5ad'
[55]:
del scref.uns['refcolor_dict']
[56]:
CamelSwapline.write_data(adatax=scref,filename=filename,filepath=work_dir)
[57]:
#if color is not defined: scref.obs[ 'color']
scref.uns['refcolor_dict'] = pd.Series({
'Astrocytes':   [190,  10,  10],'Cajal-Retzius': [225, 160,  30],'Cck-Tox':    [217, 215,   7],
             'Endo':    [170, 180, 170], 'GABA':   [130, 140, 140],'Granule':    [180, 140, 130],
             'Microglia':  [100, 100, 240],'Mossy': [ 80, 235, 255],'NFOL':[190, 235, 255],
              'Neuroblast':[210, 255, 215],'OLIG':[230, 140, 120], 'OPC':  [255, 195,  28],
              'PVM':  [139, 101, 100],'Pericytes':  [252, 183,  26],'Radial Glia-like':   [214, 194,  39],
              'VLMC':  [255, 120, 155],'nIPC': [250, 145,  45],'hRgl2a':  [250, 125,  25],
              'hDA0':    [190, 200, 190],'hOPC':   [255,  35, 155],'hRN':     [199, 121,  41],
              'hNbGaba': [ 40,  55, 130],'hGaba':  [  7,  121, 61],'hOMTN':   [ 95, 186,  70],
              'hSert':   [ 50, 180, 180],'nIPC/Rgl':   [245, 205, 170], 'Peri/VLMC':   [185, 245, 30],
              'eSCc':[205,205,220]
})
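
The color table is deleted from `.uns` before writing and re-created afterwards, presumably because a `pd.Series` of RGB lists is not stored by the `.h5ad` writer. One alternative, sketched here as an assumption rather than a tested scCAMEL workflow, is to keep the colors in a small JSON file next to the `.h5ad` file and reload them into a Series. The file name `refcolor_dict.json` is hypothetical.

```python
import json
import tempfile
from pathlib import Path

import pandas as pd

colors = pd.Series({"Astro": [100, 100, 240], "Oligo": [255, 185, 5]})

with tempfile.TemporaryDirectory() as tmp:
    color_file = Path(tmp) / "refcolor_dict.json"  # hypothetical file name
    # Save the Series as a plain dictionary of RGB lists.
    color_file.write_text(json.dumps(colors.to_dict()))
    # Reload it into a Series, ready to be assigned to .uns['refcolor_dict'].
    restored = pd.Series(json.loads(color_file.read_text()))

print(restored)
```

A JSON file also avoids any problem with cluster names that contain a slash, such as `nIPC/Rgl`.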
[58]:
os.chdir(PROJECT_ROOT)
Path.cwd()

[58]:
PosixPath('/mnt/e/YZstudio/OneDrive/Research/Dataset/Brain_Adult_mouse')
[59]:
import scanpy as sc
[60]:
scref=sc.read(OUTPUT_DIR / filename)
scref

/home/huyiz/anaconda3/envs/py310/lib/python3.10/site-packages/anndata/_core/anndata.py:1756: UserWarning: Observation names are not unique. To make them unique, call `.obs_names_make_unique`.
  utils.warn_names_duplicates("obs")
[60]:
AnnData object with n_obs × n_vars = 13371 × 12516
    obs: 'Cluster', 'Color', 'upsampled', 'mtrain_index', 'color'
    var: 'Filter1', 'MVgene', 'RefGeneList'
    uns: 'Celltype_OrderNumber', 'Celltype_Score_RefCellType', 'mclasses_names', 'mwanted_order', 'train_set_gene'
    obsm: 'CelltypeScoreCoordinates', 'Celltype_Score', 'train_set_values'
[61]:
net3= scm.CamelSwapline.load_camel_model(checkpoint_dir=str(OUTPUT_DIR / "camel_checkpoints2"), prefix="camel_nn_mouseDG",dropoutVal=0.3, device="cpu")

Prediction

Couturier2020_humanGlioblastoma

[62]:
os.chdir(PROJECT_ROOT)
Path.cwd()

[62]:
PosixPath('/mnt/e/YZstudio/OneDrive/Research/Dataset/Brain_Adult_mouse')
[63]:
scpdt=anndata.read_h5ad(PROJECT_ROOT / "Couturier2020_DevGBM_Ref2023-05-27.h5ad")

[64]:
set(scpdt.obs["Cluster"])
[64]:
{'Astro', 'Mesenchymal', 'Neuronal', 'Oligo', 'Progenitor', 'Unassigned'}
[65]:
scpdt
[65]:
AnnData object with n_obs × n_vars = 18475 × 33660
    obs: 'Patient', 'Cluster', 'Color'
[66]:
scpdt.var.index
[66]:
Index(['A1BG', 'A1BG-AS1', 'A1CF', 'A2M', 'A2M-AS1', 'A2ML1', 'A2ML1-AS1',
       'A2ML1-AS2', 'A3GALT2', 'A4GALT',
       ...
       'ZXDC', 'ZYG11A', 'ZYG11B', 'ZYX', 'ZZEF1', 'ZZZ3', 'bP-21264C1.2',
       'bP-2171C21.3', 'bP-2189O9.3', 'hsa-mir-1253'],
      dtype='object', length=33660)
[67]:
scpdt.X=scpdt.X.todense()
/home/huyiz/anaconda3/envs/py310/lib/python3.10/site-packages/anndata/_core/storage.py:39: ImplicitModificationWarning: X should not be a np.matrix, use np.ndarray instead.
  warnings.warn(msg, ImplicitModificationWarning)
[68]:
scpdt2=scpdt.copy()
scpdt2=scm.CamelPrefiltering.DataScaling(scpdt2)
[69]:
########################################################
# Remember to change the file path in tftable
########################################################
scpdt = scm.CamelPrefiltering.MVgene_Scaling(
    datax=scpdt2,
    TPTT=0,
    commongene=scref.var.index.tolist(),
    sharedMVgenes=scref.uns['train_set_gene'].tolist(),
    std_scaling=True,
    score=None,
    thrs=None,
    mprotogruop=None,
    tftable=str(PUBLIC_DATASET / "FantomTF2CLUSTER_human_official.txt"),
    learninggroup="test",
)
scpdt.uns["mwanted_order"] =list(sort(list(set(scpdt.obs["Cluster"]))))

CamelRunning---GenesScaling......
CamelRunning---TestGenesScaling......Finished
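
The query data must be presented to the network with the same genes, in the same order, as the training set (`scref.uns['train_set_gene']`). A minimal pandas sketch of that alignment follows. Filling genes that are absent from the query with zeros and the per-gene standard scaling are assumptions about what `MVgene_Scaling` does with `std_scaling=True`; the toy gene names are for illustration only.

```python
import pandas as pd

train_genes = ["GFAP", "SOX2", "OLIG2", "PDGFRA"]  # stand-in for scref.uns['train_set_gene']

# Cells x genes query matrix with a different gene set and order.
query = pd.DataFrame(
    [[1.0, 2.0, 0.0], [3.0, 0.0, 5.0], [2.0, 1.0, 1.0]],
    index=["q1", "q2", "q3"],
    columns=["SOX2", "GFAP", "MKI67"],
)

# Reorder to the training genes; genes missing from the query are filled with 0.
aligned = query.reindex(columns=train_genes, fill_value=0.0)

# Per-gene standard scaling; constant genes keep a zero column.
std = aligned.std(axis=0, ddof=0).replace(0.0, 1.0)
scaled = (aligned - aligned.mean(axis=0)) / std
print(scaled.round(2))
```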
[70]:
scpdt
[70]:
AnnData object with n_obs × n_vars = 18475 × 33660
    obs: 'Patient', 'Cluster', 'Color', 'mtrain_index'
    var: 'RefGeneList'
    uns: 'train_set_gene', 'mclasses_names', 'mwanted_order'
    obsm: 'test_set_values'
[71]:
#del scpdt.obs["color"]
[72]:
# if color is not defined
#scpdt=scm.CamelSwapline.addcolor(datax=scpdt,clustername="Cluster", colorcode="color")
[73]:
scpdt.uns["refcolor_dict"] = pd.Series({'Astro': [100, 100, 240], 'Neuronal':   [ 0, 86,  255],
              'Mesenchymal':  [55, 120, 55], 'Oligo': [ 255,185, 5], 'Unassigned':  [192,192,192],
             'Progenitor':    [190, 0, 0]})
[74]:
test=scm.CamelSwapline.prediction(datax=scpdt, mcolor_dict=pd.Series(scpdt.uns["refcolor_dict"]),net=net3,
                                  learninggroup="test", radarplot=True, fontsizeValue=35,
                  datarefplot=scref,ncolnm=1, bbValue=(1.1, 1.05))
../_images/Tutorials_scCAMEL_SWAPLINEv2_Tutorial_scCAMEL-SWAPLINE_mouseDentateGyrus_humanGlioblastoma_97_0.png
[75]:
scpdt
[75]:
AnnData object with n_obs × n_vars = 18475 × 33660
    obs: 'Patient', 'Cluster', 'Color', 'mtrain_index'
    var: 'RefGeneList'
    uns: 'train_set_gene', 'mclasses_names', 'mwanted_order', 'refcolor_dict', 'Celltype_Score_RefCellType', 'Celltype_OrderNumber'
    obsm: 'test_set_values', 'Celltype_Score', 'CelltypeScoreCoordinates'
[76]:
genename=sort(list(set(scpdt.obs["Cluster"])))
name=sort(list(set(scref.obs["Cluster"])))
[77]:
dfprob=pd.DataFrame(scpdt.obsm['Celltype_Score'])
dfprob.columns=scpdt.uns['Celltype_Score_RefCellType']
dfprob.index=scpdt.obs.index
dfmk=dfprob.astype(float).join(scpdt.obs["Cluster"],how="inner").T
dfprob=CamelSwapline.CellTypeSimilarityViolinPlot(datax=scpdt, dataref=scref)
/home/huyiz/anaconda3/envs/py310/lib/python3.10/site-packages/scCAMEL/CamelSwapline.py:883: UserWarning: This figure includes Axes that are not compatible with tight_layout, so results might be incorrect.
  fig.tight_layout()
../_images/Tutorials_scCAMEL_SWAPLINEv2_Tutorial_scCAMEL-SWAPLINE_mouseDentateGyrus_humanGlioblastoma_100_1.png
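
The score table built in the cell above (`dfprob`, with reference cell types as columns and the query `Cluster` label joined in) can also be summarized numerically. The sketch below uses invented scores and neutral cluster names, so it shows only the mechanics: the mean score per query cluster and the highest-scoring reference type for each.

```python
import pandas as pd

# Toy stand-in for scpdt.obsm['Celltype_Score'] with reference cell types as columns.
dfprob = pd.DataFrame(
    [[0.7, 0.2, 0.1], [0.6, 0.3, 0.1], [0.1, 0.2, 0.7], [0.2, 0.1, 0.7]],
    index=["q1", "q2", "q3", "q4"],
    columns=["Astrocytes", "OPC", "Peri/VLMC"],
)
clusters = pd.Series(["QueryA", "QueryA", "QueryB", "QueryB"],
                     index=dfprob.index, name="Cluster")

# Mean score per query cluster, then the best-matching reference type.
mean_scores = dfprob.join(clusters).groupby("Cluster", observed=True).mean()
top_reference = mean_scores.idxmax(axis=1)
print(mean_scores.round(2))
print(top_reference)
```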

Save data

[78]:
Path.cwd()

[78]:
PosixPath('/mnt/e/YZstudio/OneDrive/Research/Dataset/Brain_Adult_mouse')
[79]:
scpdt
[79]:
AnnData object with n_obs × n_vars = 18475 × 33660
    obs: 'Patient', 'Cluster', 'Color', 'mtrain_index'
    var: 'RefGeneList'
    uns: 'train_set_gene', 'mclasses_names', 'mwanted_order', 'refcolor_dict', 'Celltype_Score_RefCellType', 'Celltype_OrderNumber'
    obsm: 'test_set_values', 'Celltype_Score', 'CelltypeScoreCoordinates'
[80]:
work_dir=str(OUTPUT_DIR)
QueryName="Couturier2020"
TrainingName="ZeiselMouseDG"
filename="%s_%s_Ref%s_MergeCluster.h5ad"%(QueryName,TrainingName,today)

[81]:
os.path.join(work_dir,filename)
[81]:
'/mnt/e/Loal_Temp/Vicuna_Example/scCAMEL_VICUNA_updated_20260605/outputs/swapline_pypi047b0_tutorial/Couturier2020_ZeiselMouseDG_Ref2026-06-09_MergeCluster.h5ad'
[82]:
del scpdt.uns["refcolor_dict"]
[83]:
CamelSwapline.write_data(adatax=scpdt,filename=filename,filepath=work_dir)