Tutorial: scCAMEL-SWAPLINE Mouse Dentate Gyrus to Human Glioblastoma

This tutorial demonstrates the SWAPLINE workflow with the published PyPI scCAMEL package. It trains a reference neural-network classifier on mouse dentate gyrus clusters, saves/reloads the model, and predicts human glioblastoma lineage-like cluster signals.

Original Article: Neural network learning defines glioblastoma features to be of neural crest perivascular or radial glia lineages,”Sci. Adv.”, 2022

Package: scCAMEL from PyPI, tested here with version 0.47b0.

Method: scCAMEL-SWAPLINE.v1

Author: Yizhou Hu, Research Group: Ernfors lab

Link of the datasets: Mouse Dentate Gyrus, Human glioblastoma, Dataset references: Hochgerner and Zeisel, et al., Couturier, et al.

Resource gene listcell cycle genes, Homologene-HumanMouse

Training

[1]:

import datetime
today=f"{datetime.datetime.now():%Y-%m-%d}"
today

[1]:

'2026-06-09'

[2]:

import torch
import torch.nn as nn
from torch.autograd import Variable
import torch.utils.data as Data
import torchvision
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
import torch.utils.data as data_utils
from matplotlib import cm
import numpy as np
import pandas as pd
import pickle as pickle
from scipy.spatial.distance import cdist, pdist, squareform
import pandas as pd
from sklearn.linear_model import LogisticRegression, LogisticRegressionCV
from sklearn.model_selection import StratifiedShuffleSplit
from collections import defaultdict
from sklearn import preprocessing
import matplotlib.patches as mpatches
import torch.nn.functional as F
import math
#import gpytorch

import urllib.request
import os.path
import os
import sys
import importlib.metadata as importlib_metadata
from pathlib import Path
from scipy.io import loadmat
from math import floor
import anndata

PROJECT_ROOT = Path("/mnt/e/YZstudio/OneDrive/Research/Dataset/Brain_Adult_mouse")
PUBLIC_DATASET = Path("/mnt/f/Dropbox/data/proj/PE_HYZ/PublicDataSet")
VERSION_ROOT = Path("/mnt/e/Loal_Temp/Vicuna_Example/scCAMEL_VICUNA_updated_20260605")
OUTPUT_DIR = VERSION_ROOT / "outputs" / "swapline_pypi047b0_tutorial"
OUTPUT_DIR.mkdir(parents=True, exist_ok=True)
os.chdir(PROJECT_ROOT)

# Make plots inline
%pylab inline

%pylab is deprecated, use %matplotlib inline and import the required libraries.
Populating the interactive namespace from numpy and matplotlib

/home/huyiz/anaconda3/envs/py310/lib/python3.10/site-packages/IPython/core/magics/pylab.py:162: UserWarning: pylab import has clobbered these variables: ['floor']
`%matplotlib` prevents importing * from pylab and numpy
  warn("pylab import has clobbered these variables: %s"  % clobbered +

[3]:

torch.manual_seed(1)    # reproducible

[3]:

<torch._C.Generator at 0x7b5b057d6410>

[4]:

import scCAMEL as scm
from scCAMEL import CamelPrefiltering
from scCAMEL import CamelSwapline
from scCAMEL import CamelEvo

SC_CAMEL_VERSION = importlib_metadata.version("scCAMEL")
print("Using installed scCAMEL:", SC_CAMEL_VERSION)
print("scCAMEL import path:", Path(scm.__file__).parent)
print("Tutorial outputs:", OUTPUT_DIR)

Using installed scCAMEL: 0.47b0
scCAMEL import path: /home/huyiz/anaconda3/envs/py310/lib/python3.10/site-packages/scCAMEL
Tutorial outputs: /mnt/e/Loal_Temp/Vicuna_Example/scCAMEL_VICUNA_updated_20260605/outputs/swapline_pypi047b0_tutorial

[5]:

os.chdir(PROJECT_ROOT)
Path.cwd()

[5]:

PosixPath('/mnt/e/YZstudio/OneDrive/Research/Dataset/Brain_Adult_mouse')

[6]:

scref=anndata.read(PROJECT_ROOT / "ZeiselDentateGyrus_Ref2023-05-27.h5ad")
scref

/home/huyiz/anaconda3/envs/py310/lib/python3.10/site-packages/anndata/__init__.py:42: FutureWarning: `anndata.read` is deprecated, use `anndata.read_h5ad` instead. `ad.read` will be removed in mid 2024.
  warnings.warn(

[6]:

AnnData object with n_obs × n_vars = 5454 × 14545
    obs: 'Cluster', 'Color'

[7]:

set(scref.obs["Cluster"])

[7]:

{'Astrocytes',
 'Cajal-Retzius',
 'Cck-Tox',
 'Endo',
 'GABA',
 'Granule',
 'Microglia',
 'Mossy',
 'NFOL',
 'Neuroblast',
 'OLIG',
 'OPC',
 'PVM',
 'Peri/VLMC',
 'nIPC/Rgl'}

[8]:

scref.obs.groupby(["Cluster"]).count()

/tmp/ipykernel_44346/1119954423.py:1: FutureWarning: The default of observed=False is deprecated and will be changed to True in a future version of pandas. Pass observed=False to retain current behavior or observed=True to adopt the future default and silence this warning.
  scref.obs.groupby(["Cluster"]).count()

[8]:

	Color
Cluster
Astrocytes	335
Cajal-Retzius	93
Cck-Tox	27
Endo	165
GABA	99
Granule	3045
Microglia	169
Mossy	137
NFOL	35
Neuroblast	874
OLIG	79
OPC	66
PVM	18
Peri/VLMC	59
nIPC/Rgl	253

[9]:

#if the matrix is sparse matrix
#screfall.X=screfall.X.todense()

[10]:

# Apply the function to balance clusters and label up-sampled items
scref= scm.CamelPrefiltering.balance_clusters_adata(scref, 'Cluster',5)
scref

/home/huyiz/anaconda3/envs/py310/lib/python3.10/site-packages/anndata/_core/anndata.py:1756: UserWarning: Observation names are not unique. To make them unique, call `.obs_names_make_unique`.
  utils.warn_names_duplicates("obs")

[10]:

AnnData object with n_obs × n_vars = 13371 × 14545
    obs: 'Cluster', 'Color', 'upsampled'

[11]:

scref.obs.groupby(["Cluster"]).count()

/tmp/ipykernel_44346/1119954423.py:1: FutureWarning: The default of observed=False is deprecated and will be changed to True in a future version of pandas. Pass observed=False to retain current behavior or observed=True to adopt the future default and silence this warning.
  scref.obs.groupby(["Cluster"]).count()

[11]:

	Color	upsampled
Cluster
Astrocytes	944	944
Cajal-Retzius	702	702
Cck-Tox	636	636
Endo	774	774
GABA	708	708
Granule	3045	3045
Microglia	778	778
Mossy	746	746
NFOL	644	644
Neuroblast	874	874
OLIG	688	688
OPC	675	675
PVM	627	627
Peri/VLMC	668	668
nIPC/Rgl	862	862

Prefiltering_and_SelectFeatures

[12]:

# if needed scref.X=scref.X.todense()
scref.X=scref.X.todense()
scref=scm.CamelPrefiltering.DataScaling(scref)

/home/huyiz/anaconda3/envs/py310/lib/python3.10/site-packages/anndata/_core/storage.py:39: ImplicitModificationWarning: X should not be a np.matrix, use np.ndarray instead.
  warnings.warn(msg, ImplicitModificationWarning)

[13]:

dfpdt=pd.DataFrame(scref.X.T,index=scref.var.index,columns=scref.obs.index)
dfpdt.shape

[13]:

(14545, 13371)

[14]:

path=str(PUBLIC_DATASET) + "/"
dictfilename1="Homologene_mouse2human_dict2.pickle"
dfpdt= scm.CamelPrefiltering.TransSpeciesGeneName(dfm=dfpdt, dictfilename=dictfilename1, path=path)
samegene=set(dfpdt.index)
len(samegene)

[14]:

[15]:

dfpdt

[15]:

	10X46_1_ACTCTATGGTACGT-1	10X43_1_AGGGACGATCTCCG-1	10X46_1_GACAACTGTTGACG-1	10X46_1_CTGTAACTGGTCTA-1	10X43_1_TGCAAGTGTCTCCG-1	10X43_1_GTAACGTGCCTAAG-1	10X43_1_ATCACGGAGGAGTG-1	10X43_1_AATTGTGAGTTGCA-1	10X43_1_TGGTCAGACTATTC-1	10X43_1_ACAGTCGAGGGACA-1	...	10X43_1_CTTTCAGAAACCAC-1	10X43_1_ATAACCCTCCTCAC-1	10X43_1_GCAGGCACACCCAA-1	10X43_1_CTTTCAGAAACCAC-1	10X43_1_CAACGAACCAGTCA-1	10X46_1_TGAACCGAATGCCA-1	10X43_1_GCAGGCACACCCAA-1	10X46_1_CTCGCATGTATCTC-1	10X43_1_GTATTCACCTAGTG-1	10X46_1_AAGACAGACAGTTG-1
A2M	0.0	0.000000	0.000000	0.000000	0.000000	0.000000	0.0	0.0	0.0	0.0	...	0.00000	0.000000	0.0	0.00000	0.000000	0.0	0.0	0.0	0.000000	0.0
AAAS	0.0	0.000000	0.000000	0.000000	0.000000	0.000000	0.0	0.0	0.0	0.0	...	0.00000	0.000000	0.0	0.00000	0.000000	0.0	0.0	0.0	0.000000	0.0
AACS	0.0	0.000000	0.000000	0.000000	0.000000	0.000000	0.0	0.0	0.0	0.0	...	0.00000	0.000000	0.0	0.00000	0.000000	0.0	0.0	0.0	0.000000	0.0
AAED1	0.0	0.000000	0.000000	0.000000	0.000000	0.000000	0.0	0.0	0.0	0.0	...	0.57404	0.000000	0.0	0.57404	0.000000	0.0	0.0	0.0	0.000000	0.0
AAGAB	0.0	0.000000	0.700669	0.826938	0.000000	1.424036	0.0	0.0	0.0	0.0	...	0.00000	1.042694	0.0	0.00000	0.000000	0.0	0.0	0.0	1.374609	0.0
...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...
ZXDC	0.0	0.000000	0.000000	0.000000	0.000000	0.000000	0.0	0.0	0.0	0.0	...	0.00000	0.000000	0.0	0.00000	0.000000	0.0	0.0	0.0	0.000000	0.0
ZYG11B	0.0	0.000000	1.401339	0.826938	1.417607	0.000000	0.0	0.0	0.0	0.0	...	0.00000	0.000000	0.0	0.00000	0.000000	0.0	0.0	0.0	0.000000	0.0
ZYX	0.0	0.000000	0.000000	0.000000	0.708804	0.000000	0.0	0.0	0.0	0.0	...	0.00000	1.042694	0.0	0.00000	1.837025	0.0	0.0	0.0	0.000000	0.0
ZZEF1	0.0	0.696231	0.000000	0.000000	0.000000	0.000000	0.0	0.0	0.0	0.0	...	0.57404	0.000000	0.0	0.57404	0.000000	0.0	0.0	0.0	0.000000	0.0
ZZZ3	0.0	0.000000	0.000000	0.000000	0.000000	0.000000	0.0	0.0	0.0	0.0	...	0.00000	0.000000	0.0	0.00000	0.000000	0.0	0.0	0.0	0.000000	0.0

12516 rows × 13371 columns

[16]:

scref2= anndata.AnnData(dfpdt.T.astype(float))
scref2

/home/huyiz/anaconda3/envs/py310/lib/python3.10/site-packages/anndata/_core/anndata.py:1756: UserWarning: Observation names are not unique. To make them unique, call `.obs_names_make_unique`.
  utils.warn_names_duplicates("obs")

[16]:

AnnData object with n_obs × n_vars = 13371 × 12516

[17]:

scref2.obs=scref.obs

[18]:

scref=scref2.copy()

/home/huyiz/anaconda3/envs/py310/lib/python3.10/site-packages/anndata/_core/anndata.py:1756: UserWarning: Observation names are not unique. To make them unique, call `.obs_names_make_unique`.
  utils.warn_names_duplicates("obs")

[19]:

path=str(PUBLIC_DATASET) + '/'
filename='PANTHER_cell_cycle_genes.txt'
scref= scm.CamelPrefiltering.prefilter(datax=scref,filename=filename, path=path)

CamelRunning_Prefilter......
CamelRunning_Prefilter......Finished

[20]:

scref=scm.CamelPrefiltering.DataScaling(scref)

[21]:

scref.obs

[21]:

	Cluster	Color	upsampled
10X46_1_ACTCTATGGTACGT-1	Granule	#b48c82	False
10X43_1_AGGGACGATCTCCG-1	Granule	#b48c82	False
10X46_1_GACAACTGTTGACG-1	Granule	#b48c82	False
10X46_1_CTGTAACTGGTCTA-1	Granule	#b48c82	False
10X43_1_TGCAAGTGTCTCCG-1	Granule	#b48c82	False
...	...	...	...
10X46_1_TGAACCGAATGCCA-1	PVM	#8b6564	True
10X43_1_GCAGGCACACCCAA-1	PVM	#8b6564	True
10X46_1_CTCGCATGTATCTC-1	PVM	#8b6564	True
10X43_1_GTATTCACCTAGTG-1	PVM	#8b6564	True
10X46_1_AAGACAGACAGTTG-1	PVM	#8b6564	True

13371 rows × 3 columns

[22]:

scref=scm.CamelPrefiltering.SelectFeatures(datax=scref, clustername='Cluster',methodname='Enrichment_shortcut', numbergenes=50, folderchange=1.5)

Camel...Running: clusteringValue1...
Camel...Running: clusteringValue2...
[Processing]-0%--6%--13%--20%--26%--33%--40%--46%--53%--60%--66%--73%--80%--86%--93%-Camel...Running: CrossChecking...
Camel...Running: output genelist...

[23]:

len(scref.var.index[scref.var["MVgene"]])

[23]:

[24]:

scref2=scref.copy()

/home/huyiz/anaconda3/envs/py310/lib/python3.10/site-packages/anndata/_core/anndata.py:1756: UserWarning: Observation names are not unique. To make them unique, call `.obs_names_make_unique`.
  utils.warn_names_duplicates("obs")

[25]:

########################################################
########################################################
#remeber to change the file path in tftable
########################################################
scref =scm.CamelPrefiltering.LabelGene_Scaling(datax=scref2,
                                                                TPTT=100000,     mprotogruop=scref2.obs["Cluster"].values, commongene=None,
                                                                                              sharedMVgenes=None, std_scaling=True,
    tftable=str(PUBLIC_DATASET / "FantomTF2CLUSTER_human_official.txt"), learninggroup="train")

CamelRunning---GenesScaling......
CamelRunning---TrainingGenesScaling......Finished

[26]:

scref

[26]:

AnnData object with n_obs × n_vars = 13371 × 12516
    obs: 'Cluster', 'Color', 'upsampled', 'mtrain_index'
    var: 'Filter1', 'MVgene', 'RefGeneList'
    uns: 'train_set_gene', 'mclasses_names'
    obsm: 'train_set_values'

Neural-Network learning

[27]:

net=scm.CamelPrefiltering.NNclassifer(
   datax=scref,
    epochNum=150,
    learningRate=0.005,
    verbose=0,
    optimizerMmentum=0.8,
    dropout=0.3,
    #imizer__nesterov=True,
    )

CamelRunning---NNclasffier_in_cpu.......
CamelRunning---NNclasffier_in_cpu.......Finished

Accuracy plot, the overall clustering accuracy is ~95%

[28]:

ax=scm.CamelPrefiltering.AccuracyPlot( nnModel=net, accCutoff=0.95,
                 Xlow=-1, Ylow=0.0, Yhigh=1,
               )

../_images/Tutorials_scCAMEL_SWAPLINEv2_Tutorial_scCAMEL-SWAPLINE_mouseDentateGyrus_humanGlioblastoma_37_0.png

Save and Reload the Trained Model

Note: The published scCAMEL 0.47b0 package includes CamelSwapline.save_camel_model and CamelSwapline.load_camel_model, so the tutorial uses those package helpers instead of redefining model serialization inside the notebook.

[29]:

Path.cwd()

[29]:

PosixPath('/mnt/e/YZstudio/OneDrive/Research/Dataset/Brain_Adult_mouse')

Tutorial note: The checkpoint files are written to OUTPUT_DIR / "camel_checkpoints2" so they stay with the rest of the tutorial outputs.

[30]:

Path.cwd()

[30]:

PosixPath('/mnt/e/YZstudio/OneDrive/Research/Dataset/Brain_Adult_mouse')

[31]:

# after net.fit(...)
paths = scm.CamelSwapline.save_camel_model(scref, net, out_dir=str(OUTPUT_DIR / "camel_checkpoints2"), prefix="camel_nn_mouseDG")
print("Saved files:", paths)

Saved files: {'meta': '/mnt/e/Loal_Temp/Vicuna_Example/scCAMEL_VICUNA_updated_20260605/outputs/swapline_pypi047b0_tutorial/camel_checkpoints2/camel_nn_mouseDG_meta.json', 'weights': '/mnt/e/Loal_Temp/Vicuna_Example/scCAMEL_VICUNA_updated_20260605/outputs/swapline_pypi047b0_tutorial/camel_checkpoints2/camel_nn_mouseDG_weights.pt', 'history': '/mnt/e/Loal_Temp/Vicuna_Example/scCAMEL_VICUNA_updated_20260605/outputs/swapline_pypi047b0_tutorial/camel_checkpoints2/camel_nn_mouseDG_history.json'}

[32]:

net3= scm.CamelSwapline.load_camel_model(checkpoint_dir=str(OUTPUT_DIR / "camel_checkpoints2"), prefix="camel_nn_mouseDG",dropoutVal=0.3, device="cpu")

[52]:

[33]:

ax=scm.CamelPrefiltering.AccuracyPlot( nnModel=net3, accCutoff=0.95,
                 Xlow=-1, Ylow=0.0, Yhigh=1,
               )

../_images/Tutorials_scCAMEL_SWAPLINEv2_Tutorial_scCAMEL-SWAPLINE_mouseDentateGyrus_humanGlioblastoma_45_0.png

[ ]:

Make predition and visualization in Radar plot

[34]:

net=scm.CamelPrefiltering.NNclassifer(
   datax=scref,
    epochNum=40,
    learningRate=0.005,
    verbose=0,
    optimizerMmentum=0.8,
    dropout=0.3,
    #imizer__nesterov=True,
    )

CamelRunning---NNclasffier_in_cpu.......
CamelRunning---NNclasffier_in_cpu.......Finished

[35]:

scref

[35]:

AnnData object with n_obs × n_vars = 13371 × 12516
    obs: 'Cluster', 'Color', 'upsampled', 'mtrain_index'
    var: 'Filter1', 'MVgene', 'RefGeneList'
    uns: 'train_set_gene', 'mclasses_names'
    obsm: 'train_set_values'

[36]:

#if color is not defined: scref.obs[ 'color']
predefined_colors = pd.Series({
'Astrocytes':   [190,  10,  10],'Cajal-Retzius': [225, 160,  30],'Cck-Tox':    [217, 215,   7],
             'Endo':    [170, 180, 170], 'GABA':   [130, 140, 140],'Granule':    [180, 140, 130],
             'Microglia':  [100, 100, 240],'Mossy': [ 80, 235, 255],'NFOL':[190, 235, 255],
              'Neuroblast':[210, 255, 215],'OLIG':[230, 140, 120], 'OPC':  [255, 195,  28],
              'PVM':  [139, 101, 100],'Pericytes':  [252, 183,  26],'Radial Glia-like':   [214, 194,  39],
              'VLMC':  [255, 120, 155],'nIPC': [250, 145,  45],'hRgl2a':  [250, 125,  25],
              'hDA0':    [190, 200, 190],'hOPC':   [255,  35, 155],'hRN':     [199, 121,  41],
              'hNbGaba': [ 40,  55, 130],'hGaba':  [  7,  121, 61],'hOMTN':   [ 95, 186,  70],
              'hSert':   [ 50, 180, 180],'nIPC/Rgl':   [245, 205, 170], 'Peri/VLMC':   [185, 245, 30],
              'eSCc':[205,205,220]
})

[37]:

#if color is not defined
#del scref.obs["color"]
#scref=scm.CamelSwapline.addcolor(datax=scref,clustername="Cluster", colorcode="color")
scref = scm.CamelSwapline.add_color2(scref, clustername="Cluster", colorcode="color",predef=predefined_colors)

[38]:

#if want to define your own order
#scref.uns["mwanted_order"] =[ 'Mossy', 'Cajal-Retzius', 'Cck-Tox', 'GABA',  'Endo', 'Peri/VLMC', 'PVM', 'Microglia', 'Astrocytes', 'OLIG',
 #'NFOL', 'OPC', 'nIPC/Rgl','Neuroblast','Granule']

scref.uns["mwanted_order"] =list(sort(list(set(scref.obs["Cluster"]))))

[39]:

#radar  plot
scref=scm.CamelSwapline.prediction(datax=scref, mcolor_dict=scref.uns["refcolor_dict"] ,net=net,learninggroup="train", radarplot=True,fontsizeValue=10,
                       ncolnm=3, bbValue=(1.5, 1.55)  )
#plt.savefig("upload_%s_RadarPlot_cluster.pdf"%today,bbox_inches='tight')

../_images/Tutorials_scCAMEL_SWAPLINEv2_Tutorial_scCAMEL-SWAPLINE_mouseDentateGyrus_humanGlioblastoma_53_0.png

[40]:

#radar  plot
scref=scm.CamelSwapline.prediction(datax=scref, mcolor_dict=scref.uns["refcolor_dict"] ,net=net3,learninggroup="train", radarplot=True,fontsizeValue=10,
                       ncolnm=3, bbValue=(1.5, 1.55)  )
#plt.savefig("upload_%s_RadarPlot_cluster.pdf"%today,bbox_inches='tight')

../_images/Tutorials_scCAMEL_SWAPLINEv2_Tutorial_scCAMEL-SWAPLINE_mouseDentateGyrus_humanGlioblastoma_54_0.png

permutation control

[41]:

## the whole data matrix is randomized, the red X represents 95% conficence of each cell-type

[42]:

dftest0, ratiodf=scm.CamelSwapline.permutationTest(datax=scref,net=net,num=20, plotshow=True)

/home/huyiz/anaconda3/envs/py310/lib/python3.10/site-packages/scCAMEL/CamelSwapline.py:1848: UserWarning:
The palette list has fewer values (1) than needed (15) and will cycle, which may produce an uninterpretable plot.
  ax = sns.violinplot(scale="width", bw=0.4, cut=2, gridsize=100, saturation=0.9, scale_hue=False,
/home/huyiz/anaconda3/envs/py310/lib/python3.10/site-packages/scCAMEL/CamelSwapline.py:1848: FutureWarning:

The `scale` parameter has been renamed and will be removed in v0.15.0. Pass `density_norm='width'` for the same effect.
  ax = sns.violinplot(scale="width", bw=0.4, cut=2, gridsize=100, saturation=0.9, scale_hue=False,
/home/huyiz/anaconda3/envs/py310/lib/python3.10/site-packages/scCAMEL/CamelSwapline.py:1848: FutureWarning:

The `scale_hue` parameter has been replaced and will be removed in v0.15.0. Pass `common_norm=True` for the same effect.
  ax = sns.violinplot(scale="width", bw=0.4, cut=2, gridsize=100, saturation=0.9, scale_hue=False,
/home/huyiz/anaconda3/envs/py310/lib/python3.10/site-packages/scCAMEL/CamelSwapline.py:1848: FutureWarning:

The `bw` parameter is deprecated in favor of `bw_method`/`bw_adjust`.
Setting `bw_method=0.4`, but please see docs for the new parameters
and update your code. This will become an error in seaborn v0.15.0.

  ax = sns.violinplot(scale="width", bw=0.4, cut=2, gridsize=100, saturation=0.9, scale_hue=False,

<Figure size 640x480 with 0 Axes>

../_images/Tutorials_scCAMEL_SWAPLINEv2_Tutorial_scCAMEL-SWAPLINE_mouseDentateGyrus_humanGlioblastoma_57_2.png

[43]:

dftest0, ratiodf=scm.CamelSwapline.permutationTest(datax=scref,net=net3,num=20, plotshow=True)

/home/huyiz/anaconda3/envs/py310/lib/python3.10/site-packages/scCAMEL/CamelSwapline.py:1848: UserWarning:
The palette list has fewer values (1) than needed (15) and will cycle, which may produce an uninterpretable plot.
  ax = sns.violinplot(scale="width", bw=0.4, cut=2, gridsize=100, saturation=0.9, scale_hue=False,
/home/huyiz/anaconda3/envs/py310/lib/python3.10/site-packages/scCAMEL/CamelSwapline.py:1848: FutureWarning:

The `scale` parameter has been renamed and will be removed in v0.15.0. Pass `density_norm='width'` for the same effect.
  ax = sns.violinplot(scale="width", bw=0.4, cut=2, gridsize=100, saturation=0.9, scale_hue=False,
/home/huyiz/anaconda3/envs/py310/lib/python3.10/site-packages/scCAMEL/CamelSwapline.py:1848: FutureWarning:

The `scale_hue` parameter has been replaced and will be removed in v0.15.0. Pass `common_norm=True` for the same effect.
  ax = sns.violinplot(scale="width", bw=0.4, cut=2, gridsize=100, saturation=0.9, scale_hue=False,
/home/huyiz/anaconda3/envs/py310/lib/python3.10/site-packages/scCAMEL/CamelSwapline.py:1848: FutureWarning:

The `bw` parameter is deprecated in favor of `bw_method`/`bw_adjust`.
Setting `bw_method=0.4`, but please see docs for the new parameters
and update your code. This will become an error in seaborn v0.15.0.

  ax = sns.violinplot(scale="width", bw=0.4, cut=2, gridsize=100, saturation=0.9, scale_hue=False,

<Figure size 640x480 with 0 Axes>

../_images/Tutorials_scCAMEL_SWAPLINEv2_Tutorial_scCAMEL-SWAPLINE_mouseDentateGyrus_humanGlioblastoma_58_2.png

Cell_Type Purity

[44]:

#The ratio of the purity entropy for each cluster based on their learning scores, is used as a measure of purity.
#The function returns a pandas dataframe sorted by the purity score

[45]:

dfpurity1=scm.CamelSwapline.PurityEstimationLearningScore(datax=scref, clusterlist="Cluster", elbow=False, figureplot=True)

<Figure size 640x480 with 0 Axes>

../_images/Tutorials_scCAMEL_SWAPLINEv2_Tutorial_scCAMEL-SWAPLINE_mouseDentateGyrus_humanGlioblastoma_61_1.png

association between cell-types

[46]:

scref

[46]:

AnnData object with n_obs × n_vars = 13371 × 12516
    obs: 'Cluster', 'Color', 'upsampled', 'mtrain_index', 'color'
    var: 'Filter1', 'MVgene', 'RefGeneList'
    uns: 'train_set_gene', 'mclasses_names', 'refcolor_dict', 'mwanted_order', 'Celltype_Score_RefCellType', 'Celltype_OrderNumber'
    obsm: 'train_set_values', 'Celltype_Score', 'CelltypeScoreCoordinates'

[47]:

# the heatmap of hierarchical clustering represents the cell-type similarity or association
#color from dark purple to light yellow represents the association from low to high
#number inside of eahc square indicating the association value.

[48]:

scm.CamelSwapline.CellTypeSimilarity(datax=scref, labelnum=True,  metricvalue='correlation',methodvalue="complete")

/home/huyiz/anaconda3/envs/py310/lib/python3.10/site-packages/scCAMEL/CamelSwapline.py:2472: FutureWarning: The default of observed=False is deprecated and will be changed to True in a future version of pandas. Pass observed=False to retain current behavior or observed=True to adopt the future default and silence this warning.
  dfpb2 = dfprob.groupby(["Cluster"]).mean()

<Figure size 1500x1500 with 0 Axes>

../_images/Tutorials_scCAMEL_SWAPLINEv2_Tutorial_scCAMEL-SWAPLINE_mouseDentateGyrus_humanGlioblastoma_65_2.png

[49]:

scm.CamelSwapline.CellTypeSimilarity(datax=scref, labelnum=False,  metricvalue='correlation',methodvalue="complete")

/home/huyiz/anaconda3/envs/py310/lib/python3.10/site-packages/scCAMEL/CamelSwapline.py:2472: FutureWarning: The default of observed=False is deprecated and will be changed to True in a future version of pandas. Pass observed=False to retain current behavior or observed=True to adopt the future default and silence this warning.
  dfpb2 = dfprob.groupby(["Cluster"]).mean()

<Figure size 1500x1500 with 0 Axes>

../_images/Tutorials_scCAMEL_SWAPLINEv2_Tutorial_scCAMEL-SWAPLINE_mouseDentateGyrus_humanGlioblastoma_66_2.png

[ ]:

Save data

[50]:

scref

[50]:

AnnData object with n_obs × n_vars = 13371 × 12516
    obs: 'Cluster', 'Color', 'upsampled', 'mtrain_index', 'color'
    var: 'Filter1', 'MVgene', 'RefGeneList'
    uns: 'train_set_gene', 'mclasses_names', 'refcolor_dict', 'mwanted_order', 'Celltype_Score_RefCellType', 'Celltype_OrderNumber'
    obsm: 'train_set_values', 'Celltype_Score', 'CelltypeScoreCoordinates'

[51]:

Path.cwd()

[51]:

PosixPath('/mnt/e/YZstudio/OneDrive/Research/Dataset/Brain_Adult_mouse')

[52]:

scref.uns["refcolor_dict"]= predefined_colors

[53]:

work_dir=str(OUTPUT_DIR)
QueryName="ZeiselMouseDG"
TrainingName="ZeiselMouseDG"
filename="%s_%s_Ref%s_MergeCluster.h5ad"%(QueryName,TrainingName,today)

[54]:

os.path.join(work_dir,filename)

[54]:

'/mnt/e/Loal_Temp/Vicuna_Example/scCAMEL_VICUNA_updated_20260605/outputs/swapline_pypi047b0_tutorial/ZeiselMouseDG_ZeiselMouseDG_Ref2026-06-09_MergeCluster.h5ad'

[55]:

del scref.uns['refcolor_dict']

[56]:

CamelSwapline.write_data(adatax=scref,filename=filename,filepath=work_dir)

[57]:

#if color is not defined: scref.obs[ 'color']
scref.uns['refcolor_dict'] = pd.Series({
'Astrocytes':   [190,  10,  10],'Cajal-Retzius': [225, 160,  30],'Cck-Tox':    [217, 215,   7],
             'Endo':    [170, 180, 170], 'GABA':   [130, 140, 140],'Granule':    [180, 140, 130],
             'Microglia':  [100, 100, 240],'Mossy': [ 80, 235, 255],'NFOL':[190, 235, 255],
              'Neuroblast':[210, 255, 215],'OLIG':[230, 140, 120], 'OPC':  [255, 195,  28],
              'PVM':  [139, 101, 100],'Pericytes':  [252, 183,  26],'Radial Glia-like':   [214, 194,  39],
              'VLMC':  [255, 120, 155],'nIPC': [250, 145,  45],'hRgl2a':  [250, 125,  25],
              'hDA0':    [190, 200, 190],'hOPC':   [255,  35, 155],'hRN':     [199, 121,  41],
              'hNbGaba': [ 40,  55, 130],'hGaba':  [  7,  121, 61],'hOMTN':   [ 95, 186,  70],
              'hSert':   [ 50, 180, 180],'nIPC/Rgl':   [245, 205, 170], 'Peri/VLMC':   [185, 245, 30],
              'eSCc':[205,205,220]
})

[ ]:

[58]:

os.chdir(PROJECT_ROOT)
Path.cwd()

[58]:

PosixPath('/mnt/e/YZstudio/OneDrive/Research/Dataset/Brain_Adult_mouse')

[59]:

import scanpy as sc

[60]:

scref=sc.read(OUTPUT_DIR / filename)
scref

/home/huyiz/anaconda3/envs/py310/lib/python3.10/site-packages/anndata/_core/anndata.py:1756: UserWarning: Observation names are not unique. To make them unique, call `.obs_names_make_unique`.
  utils.warn_names_duplicates("obs")

[60]:

AnnData object with n_obs × n_vars = 13371 × 12516
    obs: 'Cluster', 'Color', 'upsampled', 'mtrain_index', 'color'
    var: 'Filter1', 'MVgene', 'RefGeneList'
    uns: 'Celltype_OrderNumber', 'Celltype_Score_RefCellType', 'mclasses_names', 'mwanted_order', 'train_set_gene'
    obsm: 'CelltypeScoreCoordinates', 'Celltype_Score', 'train_set_values'

[61]:

net3= scm.CamelSwapline.load_camel_model(checkpoint_dir=str(OUTPUT_DIR / "camel_checkpoints2"), prefix="camel_nn_mouseDG",dropoutVal=0.3, device="cpu")

Prediction

Couturier2020_humanGlioblastoma

[62]:

os.chdir(PROJECT_ROOT)
Path.cwd()

[62]:

PosixPath('/mnt/e/YZstudio/OneDrive/Research/Dataset/Brain_Adult_mouse')

[63]:

scpdt=anndata.read(PROJECT_ROOT / "Couturier2020_DevGBM_Ref2023-05-27.h5ad")

/home/huyiz/anaconda3/envs/py310/lib/python3.10/site-packages/anndata/__init__.py:42: FutureWarning: `anndata.read` is deprecated, use `anndata.read_h5ad` instead. `ad.read` will be removed in mid 2024.
  warnings.warn(

[64]:

set(scpdt.obs["Cluster"])

[64]:

{'Astro', 'Mesenchymal', 'Neuronal', 'Oligo', 'Progenitor', 'Unassigned'}

[65]:

scpdt

[65]:

AnnData object with n_obs × n_vars = 18475 × 33660
    obs: 'Patient', 'Cluster', 'Color'

[66]:

scpdt.var.index

[66]:

Index(['A1BG', 'A1BG-AS1', 'A1CF', 'A2M', 'A2M-AS1', 'A2ML1', 'A2ML1-AS1',
       'A2ML1-AS2', 'A3GALT2', 'A4GALT',
       ...
       'ZXDC', 'ZYG11A', 'ZYG11B', 'ZYX', 'ZZEF1', 'ZZZ3', 'bP-21264C1.2',
       'bP-2171C21.3', 'bP-2189O9.3', 'hsa-mir-1253'],
      dtype='object', length=33660)

[67]:

scpdt.X=scpdt.X.todense()

/home/huyiz/anaconda3/envs/py310/lib/python3.10/site-packages/anndata/_core/storage.py:39: ImplicitModificationWarning: X should not be a np.matrix, use np.ndarray instead.
  warnings.warn(msg, ImplicitModificationWarning)

[68]:

scpdt2=scpdt.copy()
scpdt2=scm.CamelPrefiltering.DataScaling(scpdt2)

[69]:

########################################################
########################################################
#remeber to change the file path in tftable
########################################################
########################################################
scpdt =scm.CamelPrefiltering.MVgene_Scaling(datax=scpdt2,TPTT=0,   commongene=scref.var.index.tolist(),
                                        sharedMVgenes=scref.uns[ 'train_set_gene'].tolist(),
                                                                                            std_scaling=True,score=None, thrs=None,  mprotogruop=None,
    tftable=str(PUBLIC_DATASET / "FantomTF2CLUSTER_human_official.txt"), learninggroup="test")
scpdt.uns["mwanted_order"] =list(sort(list(set(scpdt.obs["Cluster"]))))

CamelRunning---GenesScaling......
CamelRunning---TestGenesScaling......Finished

[70]:

scpdt

[70]:

AnnData object with n_obs × n_vars = 18475 × 33660
    obs: 'Patient', 'Cluster', 'Color', 'mtrain_index'
    var: 'RefGeneList'
    uns: 'train_set_gene', 'mclasses_names', 'mwanted_order'
    obsm: 'test_set_values'

[71]:

#del scpdt.obs["color"]

[72]:

# if color is not definedi
#scpdt=scm.CamelSwapline.addcolor(datax=scpdt,clustername="Cluster", colorcode="color")

[19]:

[73]:

scpdt.uns["refcolor_dict"] = pd.Series({'Astro': [100, 100, 240], 'Neuronal':   [ 0, 86,  255],
              'Mesenchymal':  [55, 120, 55], 'Oligo': [ 255,185, 5], 'Unassigned':  [192,192,192],
             'Progenitor':    [190, 0, 0]})

[74]:

test=scm.CamelSwapline.prediction(datax=scpdt, mcolor_dict=pd.Series(scpdt.uns["refcolor_dict"]),net=net3,
                                  learninggroup="test", radarplot=True, fontsizeValue=35,
                  datarefplot=scref,ncolnm=1, bbValue=(1.1, 1.05))

../_images/Tutorials_scCAMEL_SWAPLINEv2_Tutorial_scCAMEL-SWAPLINE_mouseDentateGyrus_humanGlioblastoma_97_0.png

[75]:

scpdt

[75]:

AnnData object with n_obs × n_vars = 18475 × 33660
    obs: 'Patient', 'Cluster', 'Color', 'mtrain_index'
    var: 'RefGeneList'
    uns: 'train_set_gene', 'mclasses_names', 'mwanted_order', 'refcolor_dict', 'Celltype_Score_RefCellType', 'Celltype_OrderNumber'
    obsm: 'test_set_values', 'Celltype_Score', 'CelltypeScoreCoordinates'

[76]:

genename=sort(list(set(scpdt.obs["Cluster"])))
name=sort(list(set(scref.obs["Cluster"])))

[77]:

dfprob=pd.DataFrame(scpdt.obsm['Celltype_Score'])
dfprob.columns=scpdt.uns['Celltype_Score_RefCellType']
dfprob.index=scpdt.obs.index
dfmk=dfprob.astype(float).join(scpdt.obs["Cluster"],how="inner").T
dfprob=CamelSwapline.CellTypeSimilarityViolinPlot(datax=scpdt, dataref=scref)

/home/huyiz/anaconda3/envs/py310/lib/python3.10/site-packages/scCAMEL/CamelSwapline.py:883: UserWarning: This figure includes Axes that are not compatible with tight_layout, so results might be incorrect.
  fig.tight_layout()

../_images/Tutorials_scCAMEL_SWAPLINEv2_Tutorial_scCAMEL-SWAPLINE_mouseDentateGyrus_humanGlioblastoma_100_1.png

Save data

[78]:

Path.cwd()

[78]:

PosixPath('/mnt/e/YZstudio/OneDrive/Research/Dataset/Brain_Adult_mouse')

[79]:

scpdt

[79]:

AnnData object with n_obs × n_vars = 18475 × 33660
    obs: 'Patient', 'Cluster', 'Color', 'mtrain_index'
    var: 'RefGeneList'
    uns: 'train_set_gene', 'mclasses_names', 'mwanted_order', 'refcolor_dict', 'Celltype_Score_RefCellType', 'Celltype_OrderNumber'
    obsm: 'test_set_values', 'Celltype_Score', 'CelltypeScoreCoordinates'

[80]:

work_dir=str(OUTPUT_DIR)
QueryName="Couturier2020"
TrainingName="ZeiselMouseDG"
filename="%s_%s_Ref%s_MergeCluster.h5ad"%(QueryName,TrainingName,today)

[81]:

os.path.join(work_dir,filename)

[81]:

'/mnt/e/Loal_Temp/Vicuna_Example/scCAMEL_VICUNA_updated_20260605/outputs/swapline_pypi047b0_tutorial/Couturier2020_ZeiselMouseDG_Ref2026-06-09_MergeCluster.h5ad'

[82]:

del scpdt.uns["refcolor_dict"]

[83]:

CamelSwapline.write_data(adatax=scpdt,filename=filename,filepath=work_dir)

[ ]:

[ ]:

[ ]:

[ ]:

[ ]:

[ ]: