[Original] scRNA | CytoTRACE v2 in R: predicting single-cell differentiation potential from scratch

生信補給站, published 2024-04-26 in Beijing

CytoTRACE v2 was released as a preprint in March 2024: "Mapping single-cell developmental potential in health and disease with interpretable deep learning". V2 uses an interpretable AI algorithm to predict the developmental potential of cells from single-cell RNA sequencing data. Besides a continuous measure of developmental potential ranging from 0 (differentiated) to 1 (totipotent), it also assigns each cell to one of 6 potency categories: totipotent and pluripotent stem cells with broad differentiation potential; lineage-restricted multipotent, oligopotent, and unipotent cells, which can generate varying numbers of downstream cell types; and finally differentiated cells.

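For the functional and theoretical improvements over V1, see the main text of the paper. On the implementation side, CytoTRACE v2 is split into an R version and a Python version. Installing the R version does not require configuring a Python environment, which lowers the barrier to entry considerably.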

I. Loading the R package and data

1. Installing the R package and resolving errors

Install the package following the instructions at https://github.com/digitalcytometry/cytotrace2?tab=readme-ov-file

(1) Install directly with devtools::install_github

devtools::install_github("digitalcytometry/cytotrace2", subdir = "cytotrace2_r")
library(CytoTRACE2)
# Error encountered:
# Using github PAT from envvar GITHUB_TOKEN
# Downloading GitHub repo digitalcytometry/cytotrace2@HEAD
# Error in utils::download.file(url, path, method = method, quiet = quiet,  :
#   download from 'https://api.github.com/repos/digitalcytometry/cytotrace2/tarball/HEAD' failed

(2) If the error above appears, copy the URL from the error message, "https://api.github.com/repos/digitalcytometry/cytotrace2/tarball/HEAD", into the browser address bar and press Enter. This downloads a tar.gz archive, which can then be installed locally.

# Local installation
remotes::install_local("./digitalcytometry-cytotrace2-6fe2bad.tar.gz",
                       subdir = "cytotrace2_r", # specific to this repository
                       upgrade = FALSE, dependencies = TRUE)
library(CytoTRACE2)
library(tidyverse)
library(Seurat)

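Note: opening the tar.gz archive shows that the authors keep the Python and R versions in separate folders, so the subdir argument must be set to cytotrace2_r here.

Note: similar errors with other GitHub packages can be resolved the same way; in most cases subdir does not need to be set.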
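The manual download in step (2) can also be scripted. The sketch below is only an illustration and is not taken from the CytoTRACE2 documentation: `github_tarball_url()` and `install_from_tarball()` are hypothetical helper names, and the install step assumes the `remotes` package is available and that R can reach the tarball URL shown in the error message.

```r
# Hypothetical helper: build the tarball URL that appears in the error message
# for a given "owner/repo" string and git ref (HEAD = default branch).
github_tarball_url <- function(repo, ref = "HEAD") {
  sprintf("https://api.github.com/repos/%s/tarball/%s", repo, ref)
}

# Hypothetical helper: download the tarball, then install it locally.
# subdir is only needed when the R package lives in a subfolder of the repo.
install_from_tarball <- function(repo, subdir = NULL, ref = "HEAD") {
  tarball <- file.path(tempdir(), paste0(basename(repo), ".tar.gz"))
  download.file(github_tarball_url(repo, ref), destfile = tarball, mode = "wb")
  remotes::install_local(tarball, subdir = subdir,
                         upgrade = "never", dependencies = TRUE)
}

url <- github_tarball_url("digitalcytometry/cytotrace2")
print(url)

# Not run here (needs network access and the remotes package):
# install_from_tarball("digitalcytometry/cytotrace2", subdir = "cytotrace2_r")
```

If R itself cannot reach GitHub, the download inside the helper will fail in the same way; in that case download the file in a browser as described in step (2) and run only the remotes::install_local call.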

2. Preparing the single-cell data

Next, load the previously annotated sce.anno.RData dataset. To save resources, randomly sample 30% of the cells from each cell type.

load("sce.anno.RData")
# Store the cell barcodes in a metadata column
sce2@meta.data$CB <- rownames(sce2@meta.data)
# Sample 30% of the cells within each cell type
sample_CB <- sce2@meta.data %>%
  group_by(celltype) %>%
  sample_frac(0.3)
sce3 <- subset(sce2, CB %in% sample_CB$CB)
sce3
# An object of class Seurat
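The per-cell-type subsampling above can be illustrated on toy data. The sketch below is a base-R version of the same idea, not the code used in this post: the `meta` data frame is made up, and `set.seed()` is added as an assumption so that the same cells are drawn on every run (the original sampling call does not fix a seed).

```r
set.seed(1234)  # fix the random draw so the same cells are selected on each run

# Toy metadata (made up): 20 barcodes across two cell types
meta <- data.frame(
  CB = paste0("cell", 1:20),
  celltype = rep(c("T", "Myeloid"), times = c(10, 10)),
  stringsAsFactors = FALSE
)

# Split barcodes by cell type, then draw 30% of each group without replacement
keep <- unlist(lapply(split(meta$CB, meta$celltype), function(cb) {
  cb[sample.int(length(cb), size = round(0.3 * length(cb)))]
}), use.names = FALSE)

sub_meta <- meta[meta$CB %in% keep, ]
print(table(sub_meta$celltype))
```

With a Seurat object, the selected barcodes would be used in the same way as sample_CB$CB above, for example subset(sce2, CB %in% keep).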

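II. CytoTRACE v2 analysis

1. Running CytoTRACE v2

This version accepts either a single-cell (Seurat) object or an expression matrix as input, and the species can be human or mouse (the default). This post uses a human Seurat object (sce3) as the worked example.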

####### Input: Seurat object ###########
cytotrace2_result_sce <- cytotrace2(sce3,
                                    is_seurat = TRUE,
                                    slot_type = "counts",
                                    species = 'human',
                                    seed = 1234)
cytotrace2_result_sce
An object of class Seurat
51911 features across 4202 samples within 1 assay
Active assay: RNA (51911 features, 2000 variable features)
 4 dimensional reductions calculated: pca, umap, tsne, harmony

Because the input is a Seurat object, the output is also a Seurat object, and the score results are stored in its metadata.

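Among these columns, CytoTRACE2_Score is the predicted potency score, CytoTRACE2_Relative is the relative ordering of cells within the dataset scaled from 0 to 1, and CytoTRACE2_Potency is the six-category classification described at the start of this post.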
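A quick way to get a feel for these metadata columns is to tabulate the potency categories per cell type and summarise the score. The sketch below uses a small made-up data frame as a stand-in for cytotrace2_result_sce@meta.data; the values and category labels are illustrative assumptions, not results from this dataset. It relies on the CytoTRACE2_Score column, which is also used for the boxplot later in this post.

```r
# Toy stand-in for cytotrace2_result_sce@meta.data (values are made up)
meta <- data.frame(
  celltype = c("Epi", "Epi", "T", "T", "Myeloid", "Myeloid"),
  CytoTRACE2_Score = c(0.42, 0.38, 0.10, 0.12, 0.20, 0.24),
  CytoTRACE2_Potency = c("Oligopotent", "Oligopotent", "Differentiated",
                         "Differentiated", "Unipotent", "Unipotent")
)

# Cross-tabulate annotated cell types against potency categories
potency_by_type <- table(meta$celltype, meta$CytoTRACE2_Potency)
print(potency_by_type)

# Median potency score per cell type, sorted from highest to lowest
med <- aggregate(CytoTRACE2_Score ~ celltype, data = meta, FUN = median)
med <- med[order(-med$CytoTRACE2_Score), ]
print(med)
```

On the real result, replacing the toy data frame with cytotrace2_result_sce@meta.data should give the same kind of summary.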

Note 1: cytotrace2 defaults to mouse, so species = 'human' must be set explicitly. For a Seurat object, set is_seurat = TRUE. Setting seed makes the results reproducible.

2. Visualizing CytoTRACE v2 results

(1) The plotData function

Unlike the visualization functions in CytoTRACE v1, v2 wraps several common plots in the plotData function. You can first specify the phenotype to display (celltype).

library(patchwork) # needed to combine plots with + and plot_layout()
# making an annotation dataframe that matches input requirements for plotData function
annotation <- data.frame(phenotype = sce3@meta.data$celltype) %>%
  magrittr::set_rownames(., colnames(sce3)) # set_rownames comes from magrittr, which tidyverse does not attach
# plotting
plots <- plotData(cytotrace2_result = cytotrace2_result_sce,
                  annotation = annotation,
                  is_seurat = TRUE)
# UMAP colored by CytoTRACE2_Score
p1 <- plots$CytoTRACE2_UMAP
# UMAP colored by CytoTRACE2_Potency
p2 <- plots$CytoTRACE2_Potency_UMAP
# UMAP colored by CytoTRACE2_Relative (similar to the v1 output)
p3 <- plots$CytoTRACE2_Relative_UMAP
# Boxplot of CytoTRACE2_Score for each cell type
p4 <- plots$CytoTRACE2_Boxplot_byPheno
(p1 + p2 + p3 + p4) + plot_layout(ncol = 2)

(2) Adjusting the plot style to resemble V1 (code adapted from the plotData function)

FeaturePlot(cytotrace2_result_sce, "CytoTRACE2_Relative", pt.size = 1.5) +
  scale_colour_gradientn(colours = c("#9E0142", "#F46D43", "#FEE08B", "#E6F598",
                                     "#66C2A5", "#5E4FA2"),
                         na.value = "transparent",
                         limits = c(0, 1),
                         breaks = seq(0, 1, by = 0.2),
                         labels = c("0.0 (More diff.)",
                                    "0.2", "0.4", "0.6", "0.8", "1.0 (Less diff.)"),
                         name = "Relative\norder \n",
                         guide = guide_colorbar(frame.colour = "black",
                                                ticks.colour = "black")) +
  ggtitle("CytoTRACE 2") +
  xlab("UMAP1") + ylab("UMAP2") +
  theme(legend.text = element_text(size = 10),
        legend.title = element_text(size = 12),
        axis.text = element_text(size = 12),
        axis.title = element_text(size = 12),
        plot.title = element_text(size = 12,
                                  face = "bold", hjust = 0.5,
                                  margin = margin(b = 20))) +
  theme(aspect.ratio = 1)

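Many single-cell visualizations can be customized with ggplot2. For more on ggplot2 adjustments, see the earlier posts "ggplot2 | Details of modifying titles, axes and legends that you may want to know", "ggplot2 | The eight basic plotting elements explained", and "ggplot2 | Theme settings: fine-tuning your plots in detail".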

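(3) Boxplot by cell type

Besides the built-in boxplot in p4, you can also draw your own as needed; see the earlier post "scRNA analysis | Gene set scoring and visualization with AddModuleScore and AUCell".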

library(ggpubr)
p1 <- ggboxplot(cytotrace2_result_sce@meta.data, x = "celltype", y = "CytoTRACE2_Score",
                width = 0.6,
                color = "black",          # outline color
                fill = "celltype",        # fill by cell type
                palette = "npg",
                xlab = F,                 # hide the x-axis label
                bxp.errorbar = T,         # show error bars
                bxp.errorbar.width = 0.5, # error bar width
                size = 1,                 # thickness of the box outline
                outlier.shape = NA,       # hide outliers
                legend = "right")         # place the legend on the right
### Specify the groups to compare
my_comparisons <- list(c("Epi", "un"), c("T", "un"), c("Myeloid", "un"))
p1 + stat_compare_means(comparisons = my_comparisons,
                        method = "wilcox.test")
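To see the numbers behind the significance brackets, the same kind of pairwise comparison can be computed directly with base R. The sketch below is an illustration under assumptions, not the internals of ggpubr: the scores are made up, `compare_pair()` is a hypothetical helper, and it simply runs a two-sided Wilcoxon rank-sum test for each pair listed in my_comparisons.

```r
# Toy scores (made up) for two groups, named after labels used above
scores <- data.frame(
  celltype = rep(c("Epi", "un"), each = 4),
  CytoTRACE2_Score = c(0.60, 0.70, 0.80, 0.65, 0.10, 0.20, 0.30, 0.15)
)

# Hypothetical helper: two-sided Wilcoxon rank-sum test for one pair of groups
compare_pair <- function(df, pair) {
  x <- df$CytoTRACE2_Score[df$celltype == pair[1]]
  y <- df$CytoTRACE2_Score[df$celltype == pair[2]]
  wilcox.test(x, y)$p.value
}

my_comparisons <- list(c("Epi", "un"))

# One p-value per comparison, labeled "group1 vs group2"
p_values <- vapply(my_comparisons,
                   function(pair) compare_pair(scores, pair),
                   numeric(1))
names(p_values) <- vapply(my_comparisons, paste, character(1), collapse = " vs ")
print(p_values)
```

On real data, replace the toy data frame with cytotrace2_result_sce@meta.data and extend my_comparisons to the pairs of interest. With several comparisons, a multiple-testing correction such as p.adjust() may be worth considering.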