scPRINT use case on BPH (part 2, GN analysis)¶
In this use-case, which some of the results are presented in Figure 5 of our manuscript, we perform an extensive analysis of gene networks generated by scPRINT in our previous notebook for fibroblasts of the prostate in both normal and benign prostatic hyperplasia states.
We can look at many patterns from the gene networks such as hub genes, gene modules, and gene-gene interactions. Some of them are related to highly variable and differentially expressed genes while others can only be infered by using the gene networks.
Table of content:
- Hub genes
- Network similarity
- Network communities
- Cancer version
- Normal version
- Differential Gene - Gene connections
- Using classification
from bengrn import BenGRN
import scanpy as sc
from bengrn.base import train_classifier
from anndata.utils import make_index_unique
from grnndata import utils as grnutils
from grnndata import read_h5ad
from matplotlib import pyplot as plt
from scdataloader import utils as data_utils
import numpy as np
from pyvis import network as pnx
import networkx as nx
import scipy.sparse
import pandas as pd
import gseapy as gp
from gseapy import dotplot
%load_ext autoreload
%autoreload 2
💡 connected lamindb: jkobject/scprint
prostate_combined = sc.read_h5ad("../data/prostate_combined_o2uniqsx.h5ad")
# load the generated gene networks from the previous notebook
grn = read_h5ad("../../data/prostate_fibro_grn.h5ad")
grn_c = read_h5ad("../../data/prostate_cancer_fibro_grn.h5ad")
#remove duplicates
grn.var.symbol = make_index_unique(grn.var.symbol.astype(str))
grn_c.var.symbol = make_index_unique(grn_c.var.symbol.astype(str))
# convert gene ids to symbols
grn.var['ensembl_id'] = grn.var.index
grn.var.index = grn.var.symbol
grn_c.var['ensembl_id'] = grn_c.var.index
grn_c.var.index = grn_c.var.symbol
grn.obs.cleaned_pred_disease_ontology_term_id.value_counts(),grn_c.obs.cleaned_pred_disease_ontology_term_id.value_counts()
(cleaned_pred_disease_ontology_term_id normal 300 Name: count, dtype: int64, cleaned_pred_disease_ontology_term_id benign prostatic hyperplasia 300 Name: count, dtype: int64)
1. Hub genes¶
We will use 2 different definition of centrality to compare the hubs of the networks in both states. The first one is the degree centrality, which is the number of connections of a node. The second one is the eigenvector centrality, which is the sum of the centrality of the neighbors of a node.
For each gene we mostly interogate their meaning with genecards, e.g. https://www.genecards.org/cgi-bin/carddisp.pl?gene=CD99
# between the two networks we have ~3000 genes in common over their 4000 genes
common = set(grn_c.var.symbol) & set(grn.var.symbol)
len(common)
2881
# hub genes based on edge centrality (number of connections / strength of connections)
grn.grn.sum(0).sort_values(ascending=False).head(20)
symbol S100A6 58.131020 TGIF2-RAB5IF 51.177635 MIF 44.534096 DNAJB9 40.953262 IGFBP7 38.629910 APOD 38.003906 BRME1 37.598698 SPARCL1 37.160854 TIMP1 36.671650 DCN 34.621376 C1S 32.746254 MGP 30.496748 nan-270 29.884596 SLC25A6 29.308729 BLOC1S5-TXNDC5 28.986681 CETN1 28.746281 FTL 28.266533 AKR1C1 27.641245 FBLN1 26.828733 MT2A 25.590197 dtype: float32
# without taking in account genes that are not present in the cancer network
grn.grn.sum(0).loc[list(common)].sort_values(ascending=False).head(20)
symbol TGIF2-RAB5IF 51.177635 IGFBP7 38.629910 APOD 38.003906 BRME1 37.598698 SPARCL1 37.160854 TIMP1 36.671650 DCN 34.621376 C1S 32.746254 MGP 30.496748 BLOC1S5-TXNDC5 28.986681 AKR1C1 27.641245 FBLN1 26.828733 MT2A 25.590197 MYOC 25.428144 FAM205A 25.296257 YBX3 24.937267 CST3 24.676405 C11orf96 18.731924 TFPI 18.429214 ATP6V0C 17.695705 dtype: float32
# cancer hub genes
grn_c.grn.sum(0).sort_values(ascending=False).head(20)
symbol HSPA1A 75.300415 MT2A 57.938267 CREM 42.478115 TGIF2-RAB5IF 39.980556 HSPE1 38.607624 CALD1 36.512047 SPOCK3 35.564945 HLA-A 33.446110 SPARCL1 33.403152 RBP1 32.482758 C1S 32.037743 BRME1-1 31.879770 FABP4 31.604069 nan-99 31.555256 LUM 30.523245 EIF4A1 29.690474 ATP6V0C 29.475002 PDZD2 27.333025 DEFA1 26.979454 CTSG 25.070097 dtype: float32
# without taking in account genes not present in the normal network
grn_c.grn.sum(0).loc[list(common)].sort_values(ascending=False).head(20)
symbol HSPA1A 75.300415 MT2A 57.938267 TGIF2-RAB5IF 39.980556 SPOCK3 35.564945 HLA-A 33.446110 SPARCL1 33.403152 C1S 32.037743 nan-99 31.555256 LUM 30.523245 EIF4A1 29.690474 ATP6V0C 29.475002 DEFA1 26.979454 IGFBP7 23.982243 DCN 23.304916 MGP 22.596451 CD99 21.624668 BLOC1S5-TXNDC5 20.914230 SERPING1 20.824636 COL6A2 18.076559 THBS1 17.826540 dtype: float32
# top differential hubs
TOP = 10
(set(grn_c.grn.sum(0).sort_values(ascending=False).head(TOP).index) - set(grn.grn.sum(0).sort_values(ascending=False).head(TOP).index)) & (set(grn_c.var.symbol) & set(grn.var.symbol))
{'HLA-A', 'HSPA1A', 'MT2A', 'SPOCK3'}
TOP = 20
(set(grn_c.grn.sum(0).sort_values(ascending=False).head(TOP).index) - set(grn.grn.sum(0).sort_values(ascending=False).head(TOP).index)) & (set(grn_c.var.symbol) & set(grn.var.symbol))
{'ATP6V0C', 'DEFA1', 'EIF4A1', 'HLA-A', 'HSPA1A', 'LUM', 'SPOCK3', 'nan-99'}
TOP = 50
(set(grn_c.grn.sum(0).sort_values(ascending=False).head(TOP).index) - set(grn.grn.sum(0).sort_values(ascending=False).head(TOP).index)) & (set(grn_c.var.symbol) & set(grn.var.symbol))
{'CD99',
'CPE',
'DEFA1',
'EIF4A1',
'HLA-A',
'HSPA1A',
'LGALS1',
'LUM',
'PYDC2',
'SERPING1',
'SPOCK3',
'THBS1',
'nan-191',
'nan-99'}
overall we see that both are somewhat similar in the way the networks classifies main nodes but at least around half of the top 200 nodes are different between the two datasets and half of those are not due to diffent gene being used by the networks (50)
genes preferentially given as top elements in the tumor version only:
- CD99
- CPE
- DEFA1
- EIF4A1
- HLA-A
- HSPA1A
- LGALS1
- LUM
- PYDC2
- SERPING1
- SPOCK3
- THBS1
# we now compute eigen centrality creating a sparse network by only keeping the top 20 neighbors for each gene in the network
TOP = 20
grnutils.get_centrality(grn, TOP, top_k_to_disp=0)
grnutils.get_centrality(grn_c, TOP, top_k_to_disp=0)
Top central genes: [] Top central genes: []
[]
grn_c.var.centrality.sort_values(ascending=False).head(10)
symbol HSPA1A 0.271584 MT2A 0.271584 CREM 0.271574 CD99 0.271517 NR4A3 0.271385 RBP1 0.271130 SOX4 0.264214 ATP6V0C 0.263235 HLA-A 0.232913 LUM 0.202007 Name: centrality, dtype: float64
grn_c.var.centrality.sort_values(ascending=False).head(10)
symbol HSPA1A 0.271584 MT2A 0.271584 CREM 0.271574 CD99 0.271517 NR4A3 0.271385 RBP1 0.271130 SOX4 0.264214 ATP6V0C 0.263235 HLA-A 0.232913 LUM 0.202007 Name: centrality, dtype: float64
grn.var.centrality.sort_values(ascending=False).head(10)
symbol MT2A 0.276470 SLC25A6 0.276183 CXCL8 0.275859 FTH1 0.275757 MIF 0.267968 DNAJB9 0.265635 SOX4 0.241950 RELB 0.234482 YBX3 0.216567 CST3 0.213585 Name: centrality, dtype: float64
TOP = 10
(set(grn_c.var.centrality.sort_values(ascending=False).head(TOP).index) - set(grn.var.centrality.sort_values(ascending=False).head(TOP).index)) & (set(grn_c.var.symbol) & set(grn.var.symbol))
{'ATP6V0C', 'CD99', 'HLA-A', 'HSPA1A', 'LUM'}
TOP = 20
(set(grn_c.var.centrality.sort_values(ascending=False).head(TOP).index) - set(grn.var.centrality.sort_values(ascending=False).head(TOP).index)) & (set(grn_c.var.symbol) & set(grn.var.symbol))
{'ATP6V0C',
'CD99',
'EIF4A1',
'HLA-A',
'HSPA1A',
'LUM',
'PAGE4',
'RYR2',
'SERPINF1'}
TOP = 50
(set(grn_c.var.centrality.sort_values(ascending=False).head(TOP).index) - set(grn.var.centrality.sort_values(ascending=False).head(TOP).index)) & (set(grn_c.var.symbol) & set(grn.var.symbol))
{'C1R',
'CD99',
'COL6A2',
'HLA-A',
'HNRNPA0',
'HSPA1A',
'LUM',
'PAGE4',
'SERPINA3',
'SERPING1',
'SPOCK3',
'THBS1'}
it seems the networks are even more similar when comparing them at the level of the centrality of their nodes. this means that some are highly central while maybe not
2. Network similarity¶
We now look at the similarity of the two networks based on general overlap of their top K edges across their common nodes
grn.var.index = grn.var.ensembl_id
grn_c.var.index = grn_c.var.ensembl_id
overlap = list(set(grn.var.index) & set(grn_c.var.index))
K = 20
subgrn_c = grn_c.get(overlap).grn
subgrn_c = subgrn_c.apply(lambda row: row >= row.nlargest(K).min(), axis=1)
subgrn = grn.get(overlap).grn
subgrn = subgrn.apply(lambda row: row >= row.nlargest(K).min(), axis=1)
(subgrn & subgrn_c).sum(1).mean() / K
0.5055735930735931
when looking at the top 20 connection for each genes in the network we see that on average only 50% of them agree between the two networks
3. Network communities¶
Again looking at the top 20 connections for each gene in the network we use the leiden (or louvain) algorithm to find communities in both networks.
We then study the hub of these communities as well as their enrichment using enrichr and ontology databases. We will only look at communities between 20 and 200 genes
TOP = 20
grnutils.get_centrality(grn, TOP, top_k_to_disp=0)
grnutils.get_centrality(grn_c, TOP, top_k_to_disp=0)
Top central genes: [] Top central genes: []
[]
grn_c = grnutils.compute_cluster(grn_c, 1.5, use="leiden", n_iterations=10, max_comm_size=100)
grn = grnutils.compute_cluster(grn, 1.5)
grn_c = grnutils.compute_cluster(grn_c, 1.5)
3.1 Cancer version¶
# our communities
grn.var['cluster_1.5'].value_counts().head(10)
cluster_1.5 0 1620 1 1483 2 732 3 111 4 23 5 9 6 4 16 1 23 1 22 1 Name: count, dtype: int64
G = grn_c.plot_subgraph(grn_c.var[grn_c.var['cluster_1.5']=='4'].index.tolist(), only=200, interactive=False)#
['#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4'] Index(['PRDM6', 'MSANTD3', 'STAG3', 'TRHDE', 'TBX5', 'HEPH', 'GALNT16',
'NFATC4', 'SYNDIG1', 'HSF4', 'ZNF423', 'PLA2G4C', 'PLPPR2', 'SCN1B',
'ASPA', 'SNX15', 'BAG2', 'PTK7', 'HRH2', 'PLCL1', 'PTCH2', 'TGFB3',
'RASL11A', 'TOX2', 'CPNE5', 'GLIS2', 'HIVEP3', 'FGF13', 'ACAP3',
'DUSP26', 'CTIF', 'CAMK1', 'SLC9A5', 'FBN2', 'LRIG1', 'LIMD1', 'TRPC1',
'SLC2A13', 'ADAMTS12', 'FZD7', 'AFF2', 'CACNA1D', 'MRAS', 'ST6GALNAC6',
'FGFR4', 'PTGER1', 'PKDCC', 'SLC9B2', 'NPY1R', 'ANKRD33B', 'DCLK2',
'PURG', 'GXYLT2', 'LRRN4CL', 'CA8', 'EXOC3L1', 'BHLHE22', 'AATK',
'CMC4', 'BOLA2', 'CCR10', 'TCEAL2', 'SLC24A3', 'ROR1', 'SLC51B',
'EMID1', 'AGMO', 'CENPP', 'OPTC', 'SPOCK3', 'COL27A1', 'DACT3', 'STMN3',
'DLL1', 'SMOC1', 'COLGALT2', 'ZDBF2', 'DOK6', 'HNRNPUL2-BSCL2',
'C3orf84', 'C2orf74', 'ARHGEF25', 'NPIPB5', 'TNFSF12-TNFSF13',
'ZFP91-CNTF', 'nan-41', 'nan-81', 'nan-96', 'nan-98', 'HERC3',
'nan-116', 'nan-117'],
dtype='object', name='symbol_2')
# if you want to draw better looking networks
pnet = pnx.Network(notebook=True, cdn_resources="remote", directed=True)
pnet.from_nx(G)
#first_node = list(G.nodes)[-1]
#pnet.get_node(first_node)['color'] = "red"
pnet.save_graph("../figures/pyvis/grn_c_4.html")
# Assuming 'node_names' contains the list of gene names.
# Note: it seems with enrichr, one can only use one gene set at a time, choose accordingly
enr = gp.enrichr(gene_list=list(G.nodes),
gene_sets=['KEGG_2021_Human', 'MSigDB_Hallmark_2020', 'Reactome_2022', 'Tabula_Sapiens', 'WikiPathway_2023_Human', 'TF_Perturbations_Followed_by_Expression', 'Reactome', 'PPI_Hub_Proteins', 'OMIM_Disease', 'GO_Molecular_Function_2023'],
organism='Human', # change accordingly
#description='pathway',
#cutoff=0.08, # test dataset, use lower value for real case
background=grn_c.var.symbol.tolist(),
)
enr.res2d.head(20)
2024-07-25 18:31:53,534 [WARNING] Input library not found: Reactome. Skip
| Gene_set | Term | P-value | Adjusted P-value | Old P-value | Old adjusted P-value | Odds Ratio | Combined Score | Genes | |
|---|---|---|---|---|---|---|---|---|---|
| 0 | GO_Molecular_Function_2023 | Solute:Potassium Antiporter Activity (GO:0022821) | 0.000523 | 0.050652 | 0 | 0 | inf | inf | SLC24A3;SLC9A5 |
| 1 | GO_Molecular_Function_2023 | Coreceptor Activity Involved In Wnt Signaling ... | 0.001547 | 0.050652 | 0 | 0 | 86.822222 | 561.890021 | PTK7;ROR1 |
| 2 | GO_Molecular_Function_2023 | Sodium Ion Transmembrane Transporter Activity ... | 0.001547 | 0.050652 | 0 | 0 | 86.822222 | 561.890021 | SLC9A5;SLC9B2 |
| 3 | GO_Molecular_Function_2023 | Calcium Ion Transmembrane Transporter Activity... | 0.001700 | 0.050652 | 0 | 0 | 16.432584 | 104.795452 | SLC24A3;TRPC1;CACNA1D |
| 4 | GO_Molecular_Function_2023 | Calcium-Dependent Phospholipid Binding (GO:000... | 0.003047 | 0.050652 | 0 | 0 | 43.400000 | 251.445478 | CPNE5;PLA2G4C |
| 5 | GO_Molecular_Function_2023 | Coreceptor Activity Involved In Wnt Signaling ... | 0.003047 | 0.050652 | 0 | 0 | 43.400000 | 251.445478 | PTK7;ROR1 |
| 6 | GO_Molecular_Function_2023 | Metal Cation:Proton Antiporter Activity (GO:00... | 0.003047 | 0.050652 | 0 | 0 | 43.400000 | 251.445478 | SLC9A5;SLC9B2 |
| 7 | GO_Molecular_Function_2023 | Sodium:Proton Antiporter Activity (GO:0015385) | 0.003047 | 0.050652 | 0 | 0 | 43.400000 | 251.445478 | SLC9A5;SLC9B2 |
| 8 | GO_Molecular_Function_2023 | Iron Ion Binding (GO:0005506) | 0.005002 | 0.070586 | 0 | 0 | 28.925926 | 153.247105 | HEPH;AGMO |
| 9 | GO_Molecular_Function_2023 | Calcium Channel Activity (GO:0005262) | 0.005307 | 0.070586 | 0 | 0 | 10.099395 | 52.907606 | SLC24A3;TRPC1;CACNA1D |
| 10 | GO_Molecular_Function_2023 | Wnt Receptor Activity (GO:0042813) | 0.007391 | 0.089364 | 0 | 0 | 21.688889 | 106.438022 | FZD7;ROR1 |
| 11 | GO_Molecular_Function_2023 | RNA Polymerase II-specific DNA-binding Transcr... | 0.010110 | 0.104284 | 0 | 0 | 7.715135 | 35.445265 | TBX5;DUSP26;DLL1 |
| 12 | GO_Molecular_Function_2023 | Phospholipase Activity (GO:0004620) | 0.010193 | 0.104284 | 0 | 0 | 17.346667 | 79.552387 | PLCL1;PLA2G4C |
| 13 | GO_Molecular_Function_2023 | Monoatomic Cation Channel Activity (GO:0005261) | 0.011601 | 0.110210 | 0 | 0 | 7.284644 | 32.465172 | SLC24A3;TRPC1;CACNA1D |
| 14 | GO_Molecular_Function_2023 | Sodium Channel Regulator Activity (GO:0017080) | 0.013389 | 0.118714 | 0 | 0 | 14.451852 | 62.335739 | FGF13;SCN1B |
| 15 | GO_Molecular_Function_2023 | RNA Polymerase II Cis-Regulatory Region Sequen... | 0.022992 | 0.139045 | 0 | 0 | 2.668376 | 10.066740 | HSF4;HIVEP3;BHLHE22;TBX5;ZNF423;NFATC4;GLIS2 |
| 16 | GO_Molecular_Function_2023 | Calcium:Sodium Antiporter Activity (GO:0005432) | 0.023000 | 0.139045 | 0 | 0 | inf | inf | SLC24A3 |
| 17 | GO_Molecular_Function_2023 | Potassium:Proton Antiporter Activity (GO:0015386) | 0.023000 | 0.139045 | 0 | 0 | inf | inf | SLC9A5 |
| 18 | GO_Molecular_Function_2023 | Xylosyltransferase Activity (GO:0042285) | 0.023000 | 0.139045 | 0 | 0 | inf | inf | GXYLT2 |
| 19 | GO_Molecular_Function_2023 | Tat Protein Binding (GO:0030957) | 0.023000 | 0.139045 | 0 | 0 | inf | inf | DLL1 |
# we make a plot of the enrichment
enr.res2d.Term = enr.res2d.Term.str.split("(").str[0].str.split(',').str[0]
ax = dotplot(enr.res2d.replace([np.inf, -np.inf], 0).dropna(),
column="Adjusted P-value",
#title='normal fibro PAGE4',
cmap=plt.cm.viridis,
size=4, # adjust dot size
top_term=20,
figsize=(4,10), cutoff=0.25)
/home/ml4ig1/miniconda3/envs/scprint/lib/python3.10/site-packages/gseapy/plot.py:689: FutureWarning: The 'method' keyword in Series.replace is deprecated and will be removed in a future version.
df[self.colname].replace(
/home/ml4ig1/miniconda3/envs/scprint/lib/python3.10/site-packages/gseapy/plot.py:689: FutureWarning: A value is trying to be set on a copy of a DataFrame or Series through chained assignment using an inplace method.
The behavior will change in pandas 3.0. This inplace method will never work because the intermediate object on which we are setting values always behaves as a copy.
For example, when doing 'df[col].method(value, inplace=True)', try using 'df.method({col: value}, inplace=True)' or df[col] = df[col].method(value) instead, to perform the operation inplace on the original object.
df[self.colname].replace(
G = grn_c.plot_subgraph(grn_c.var[grn_c.var['cluster_1.5']=='5'].index.tolist(), only=80, interactive=False)#
['#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4'] Index(['ABCF2', 'ITIH4', 'OLFM2', 'ICAM4', 'RIGI', 'KAZALD1', 'BST1', 'PANX1',
'C9', 'FBXO2', 'ZNF410', 'ACVR1C', 'VAMP7', 'TOP2A', 'TAF4B',
'KIAA1755', 'EDA', 'YPEL4', 'ENTPD3', 'GP9', 'NLGN2', 'CLCF1', 'RPRM',
'ODF3B', 'MSC', 'NTF3', 'COL13A1', 'GP1BB', 'HSPA1A', 'TMEM158',
'nan-82', 'NBPF19', 'SCO2', 'SMIM41'],
dtype='object', name='symbol_2')
pnet = pnx.Network(notebook=True, cdn_resources="remote", directed=True)
pnet.from_nx(G)
#first_node = list(G.nodes)[-1]
#pnet.get_node(first_node)['color'] = "red"
pnet.save_graph("../figures/pyvis/grn_c_5.html")
grn_c.var.loc['nan-82'] # par of the Glycoside Hydrolase Family
uid 4giyQrh7tkXI symbol nan-82 ncbi_gene_ids biotype protein_coding description novel protein synonyms organism_id 2 public_source_id 9.0 created_by_id 1 mt False ribo False hb False organism NCBITaxon:9606 TFs False ensembl_id ENSG00000269590 centrality 0.0 cluster_1.5 5 Name: nan-82, dtype: object
enr = gp.enrichr(gene_list=list(G.nodes),
gene_sets=['KEGG_2021_Human', 'MSigDB_Hallmark_2020', 'Reactome_2022', 'Tabula_Sapiens', 'WikiPathway_2023_Human', 'TF_Perturbations_Followed_by_Expression', 'Reactome', 'PPI_Hub_Proteins', 'OMIM_Disease', 'GO_Molecular_Function_2023'],
organism='Human', # change accordingly
#description='pathway',
#cutoff=0.08, # test dataset, use lower value for real case
background=grn_c.var.symbol.tolist(),
)
enr.res2d.head(20)
2024-07-25 18:33:16,565 [WARNING] Input library not found: Reactome. Skip
| Gene_set | Term | P-value | Adjusted P-value | Old P-value | Old adjusted P-value | Odds Ratio | Combined Score | Genes | |
|---|---|---|---|---|---|---|---|---|---|
| 0 | GO_Molecular_Function_2023 | Death Receptor Binding (GO:0005123) | 0.000416 | 0.028729 | 0 | 0 | 123.875000 | 964.236668 | EDA;NTF3 |
| 1 | GO_Molecular_Function_2023 | Purine Ribonucleoside Triphosphate Binding (GO... | 0.007050 | 0.073312 | 0 | 0 | 8.626100 | 42.739637 | ABCF2;RIGI;HSPA1A |
| 2 | GO_Molecular_Function_2023 | Receptor Ligand Activity (GO:0048018) | 0.007732 | 0.073312 | 0 | 0 | 5.742222 | 27.920829 | EDA;CLCF1;NTF3;HSPA1A |
| 3 | GO_Molecular_Function_2023 | Activin Receptor Activity (GO:0017002) | 0.008500 | 0.073312 | 0 | 0 | inf | inf | ACVR1C |
| 4 | GO_Molecular_Function_2023 | Gap Junction Hemi-Channel Activity (GO:0055077) | 0.008500 | 0.073312 | 0 | 0 | inf | inf | PANX1 |
| 5 | GO_Molecular_Function_2023 | NF-kappaB Binding (GO:0051059) | 0.008500 | 0.073312 | 0 | 0 | inf | inf | TAF4B |
| 6 | GO_Molecular_Function_2023 | Hydrolase Activity, Hydrolyzing N-glycosyl Com... | 0.008500 | 0.073312 | 0 | 0 | inf | inf | BST1 |
| 7 | GO_Molecular_Function_2023 | Ubiquitin-Protein Transferase Activator Activi... | 0.008500 | 0.073312 | 0 | 0 | inf | inf | RIGI |
| 8 | GO_Molecular_Function_2023 | DNA Binding, Bending (GO:0008301) | 0.016930 | 0.083096 | 0 | 0 | 120.151515 | 490.060134 | TOP2A |
| 9 | GO_Molecular_Function_2023 | Disordered Domain Specific Binding (GO:0097718) | 0.016930 | 0.083096 | 0 | 0 | 120.151515 | 490.060134 | HSPA1A |
| 10 | GO_Molecular_Function_2023 | UDP Phosphatase Activity (GO:0045134) | 0.016930 | 0.083096 | 0 | 0 | 120.151515 | 490.060134 | ENTPD3 |
| 11 | GO_Molecular_Function_2023 | GDP Phosphatase Activity (GO:0004382) | 0.016930 | 0.083096 | 0 | 0 | 120.151515 | 490.060134 | ENTPD3 |
| 12 | GO_Molecular_Function_2023 | Nucleoside Diphosphate Phosphatase Activity (G... | 0.016930 | 0.083096 | 0 | 0 | 120.151515 | 490.060134 | ENTPD3 |
| 13 | GO_Molecular_Function_2023 | ATP Binding (GO:0005524) | 0.017220 | 0.083096 | 0 | 0 | 11.204545 | 45.509349 | ABCF2;HSPA1A |
| 14 | GO_Molecular_Function_2023 | Ubiquitin Protein Ligase Binding (GO:0031625) | 0.020064 | 0.083096 | 0 | 0 | 10.265625 | 40.126707 | RIGI;HSPA1A |
| 15 | GO_Molecular_Function_2023 | Ubiquitin-Like Protein Ligase Binding (GO:0044... | 0.021555 | 0.083096 | 0 | 0 | 9.852500 | 37.805688 | RIGI;HSPA1A |
| 16 | GO_Molecular_Function_2023 | Growth Factor Activity (GO:0008083) | 0.023090 | 0.083096 | 0 | 0 | 9.471154 | 35.690549 | CLCF1;NTF3 |
| 17 | GO_Molecular_Function_2023 | Adenyl Ribonucleotide Binding (GO:0032559) | 0.024670 | 0.083096 | 0 | 0 | 9.118056 | 33.756535 | ABCF2;HSPA1A |
| 18 | GO_Molecular_Function_2023 | C3HC4-type RING Finger Domain Binding (GO:0055... | 0.025290 | 0.083096 | 0 | 0 | 60.060606 | 220.863728 | HSPA1A |
| 19 | GO_Molecular_Function_2023 | Ubiquitin Binding (GO:0043130) | 0.025290 | 0.083096 | 0 | 0 | 60.060606 | 220.863728 | TOP2A |
enr.res2d.Term = enr.res2d.Term.str.split("(").str[0].str.split(',').str[0]
ax = dotplot(enr.res2d.replace([np.inf, -np.inf], 0).dropna(),
column="Adjusted P-value",
#title='normal fibro PAGE4',
cmap=plt.cm.viridis,
size=4, # adjust dot size
top_term=20,
figsize=(4,10), cutoff=0.25)
/home/ml4ig1/miniconda3/envs/scprint/lib/python3.10/site-packages/gseapy/plot.py:689: FutureWarning: The 'method' keyword in Series.replace is deprecated and will be removed in a future version.
df[self.colname].replace(
/home/ml4ig1/miniconda3/envs/scprint/lib/python3.10/site-packages/gseapy/plot.py:689: FutureWarning: A value is trying to be set on a copy of a DataFrame or Series through chained assignment using an inplace method.
The behavior will change in pandas 3.0. This inplace method will never work because the intermediate object on which we are setting values always behaves as a copy.
For example, when doing 'df[col].method(value, inplace=True)', try using 'df.method({col: value}, inplace=True)' or df[col] = df[col].method(value) instead, to perform the operation inplace on the original object.
df[self.colname].replace(
similar story to cluster above. this time with PTN often found in the litterature with 6 other m Studies on the role of STEAP1, OR51E2, PTN, ABCC4, NDRG2 have been relatively in-depth. For instance, STEAP1 is all-called six-transmembrane epithelial antigen of the prostate 1, which belongs to a family of metalloproteinases involved in iron and copper homeostasis and other cellular processes
has been listed with 8 other members as predictors of CAF-related gene prognostic index (CRGPI) in prostate cancer. https://www.nature.com/articles/s41598-023-36125-0#Sec11
(NDRG2, TSPAN1, PTN, APOE, OR51E2, P4HB, STEAP1 and ABCC4 were used to construct molecular subtypes and)
3.2 Normal version¶
grn.var['cluster_1.5'].value_counts().head(10)
cluster_1.5 0 1620 1 1483 2 732 3 111 4 23 5 9 6 4 16 1 23 1 22 1 Name: count, dtype: int64
G = grn.plot_subgraph(grn.var[grn.var['cluster_1.5']=='3'].index.tolist(), only=200, interactive=False)
['#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4'] Index(['TMEM132A', 'ADGRA2', 'AGPAT4', 'THAP3', 'DNAJC25', 'TRAM2', 'NTN1',
'STAG3', 'COL4A4', 'HSD17B14',
...
'nan-75', 'TRABD2B', 'nan-77', 'PCGF2', 'nan-107', 'nan-182', 'nan-192',
'nan-195', 'nan-239', 'nan-259'],
dtype='object', name='symbol_2', length=111)
pnet = pnx.Network(notebook=True, cdn_resources="remote", directed=True)
pnet.from_nx(G)
#first_node = list(G.nodes)[-1]
#pnet.get_node(first_node)['color'] = "red"
pnet.save_graph("../figures/pyvis/grn_3.html")
grn.var.loc['nan-259'] # par of the Glycoside Hydrolase Family
uid 62JAXxQNqtOd symbol nan-259 ncbi_gene_ids 83737 biotype protein_coding description novel protein synonyms organism_id 2 public_source_id 9.0 created_by_id 1 mt False ribo False hb False organism NCBITaxon:9606 TFs False ensembl_id ENSG00000289720 centrality 0.0 cluster_1.5 3 Name: nan-259, dtype: object
grn.get(grn.var[grn.var['cluster_1.5']=='3'].index.tolist()).grn.sum(0).sort_values(ascending=False).head(3)##.grn # selenom
symbol RNASEK 0.942102 SELENOM 0.292933 nan-259 0.256140 dtype: float32
RNASEK acidifying the internal vesicle important in endocytosis# https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10547866/ RNA phosphodiester bond hydrolysis --> hydrolisis pathway in the model and RNA transcription regulation
# Assuming 'node_names' contains the list of gene names
enr = gp.enrichr(gene_list=list(G.nodes),
gene_sets=['KEGG_2021_Human', 'MSigDB_Hallmark_2020', 'Reactome_2022', 'Tabula_Sapiens', 'WikiPathway_2023_Human', 'TF_Perturbations_Followed_by_Expression', 'Reactome', 'PPI_Hub_Proteins', 'OMIM_Disease', 'GO_Molecular_Function_2023'],
organism='Human', # change accordingly
#description='pathway',
#cutoff=0.08, # test dataset, use lower value for real case
background=grn.var.symbol.tolist(),
)
enr.res2d.head(20)
2024-07-25 18:33:49,291 [WARNING] Input library not found: Reactome. Skip /home/ml4ig1/miniconda3/envs/scprint/lib/python3.10/site-packages/gseapy/enrichr.py:643: FutureWarning: The behavior of DataFrame concatenation with empty or all-NA entries is deprecated. In a future version, this will no longer exclude empty or all-NA columns when determining the result dtypes. To retain the old behavior, exclude the relevant entries before the concat operation. self.results = pd.concat(self.results, ignore_index=True)
| Gene_set | Term | P-value | Adjusted P-value | Old P-value | Old adjusted P-value | Odds Ratio | Combined Score | Genes | |
|---|---|---|---|---|---|---|---|---|---|
| 0 | GO_Molecular_Function_2023 | Sulfotransferase Activity (GO:0008146) | 0.007252 | 0.123369 | 0 | 0 | 20.138462 | 99.212171 | CHST7;SULT1C4 |
| 1 | GO_Molecular_Function_2023 | RNA Polymerase II Transcription Regulatory Reg... | 0.007713 | 0.123369 | 0 | 0 | 3.386005 | 16.472449 | ZNF483;HAND2;NTN1;NFATC4;GLIS2;PURG;ZNF286A |
| 2 | GO_Molecular_Function_2023 | ATPase Binding (GO:0051117) | 0.013790 | 0.123369 | 0 | 0 | 13.415385 | 57.468682 | SLC2A13;NKD2 |
| 3 | GO_Molecular_Function_2023 | Chondroitin Sulfotransferase Activity (GO:0034... | 0.016750 | 0.123369 | 0 | 0 | inf | inf | CHST7 |
| 4 | GO_Molecular_Function_2023 | Peptidyl-Proline 4-Dioxygenase Activity (GO:00... | 0.016750 | 0.123369 | 0 | 0 | inf | inf | P4HA3 |
| 5 | GO_Molecular_Function_2023 | Aryl Sulfotransferase Activity (GO:0004062) | 0.016750 | 0.123369 | 0 | 0 | inf | inf | SULT1C4 |
| 6 | GO_Molecular_Function_2023 | Histone Reader Activity (GO:0140566) | 0.016750 | 0.123369 | 0 | 0 | inf | inf | TONSL |
| 7 | GO_Molecular_Function_2023 | Hydrolase Activity, Hydrolyzing N-glycosyl Com... | 0.016750 | 0.123369 | 0 | 0 | inf | inf | BST1 |
| 8 | GO_Molecular_Function_2023 | Tat Protein Binding (GO:0030957) | 0.016750 | 0.123369 | 0 | 0 | inf | inf | DLL1 |
| 9 | GO_Molecular_Function_2023 | WW Domain Binding (GO:0050699) | 0.016750 | 0.123369 | 0 | 0 | inf | inf | ENTREP3 |
| 10 | GO_Molecular_Function_2023 | RNA Polymerase II Cis-Regulatory Region Sequen... | 0.016754 | 0.123369 | 0 | 0 | 3.180050 | 13.003634 | ZNF483;HAND2;NTN1;NFATC4;GLIS2;ZNF286A |
| 11 | GO_Molecular_Function_2023 | DNA-binding Transcription Activator Activity, ... | 0.026719 | 0.134555 | 0 | 0 | 5.074219 | 18.380719 | HAND2;INSL3;GLIS2 |
| 12 | GO_Molecular_Function_2023 | Procollagen-Proline Dioxygenase Activity (GO:0... | 0.033223 | 0.134555 | 0 | 0 | 59.575758 | 202.825781 | P4HA3 |
| 13 | GO_Molecular_Function_2023 | N-acetylglucosamine 6-O-sulfotransferase Activ... | 0.033223 | 0.134555 | 0 | 0 | 59.575758 | 202.825781 | CHST7 |
| 14 | GO_Molecular_Function_2023 | ABC-type Xenobiotic Transporter Activity (GO:0... | 0.033223 | 0.134555 | 0 | 0 | 59.575758 | 202.825781 | ABCC1 |
| 15 | GO_Molecular_Function_2023 | alpha-N-acetylgalactosaminide Alpha-2,6-Sialyl... | 0.033223 | 0.134555 | 0 | 0 | 59.575758 | 202.825781 | ST6GALNAC6 |
| 16 | GO_Molecular_Function_2023 | miRNA Binding (GO:0035198) | 0.033223 | 0.134555 | 0 | 0 | 59.575758 | 202.825781 | DND1 |
| 17 | GO_Molecular_Function_2023 | Myosin Heavy Chain Binding (GO:0032036) | 0.033223 | 0.134555 | 0 | 0 | 59.575758 | 202.825781 | NKD2 |
| 18 | GO_Molecular_Function_2023 | Estradiol 17-Beta-Dehydrogenase [NAD(P)] Activ... | 0.033223 | 0.134555 | 0 | 0 | 59.575758 | 202.825781 | HSD17B14 |
| 19 | GO_Molecular_Function_2023 | Solute:Proton Symporter Activity (GO:0015295) | 0.033223 | 0.134555 | 0 | 0 | 59.575758 | 202.825781 | SLC2A13 |
enr.res2d.Term = enr.res2d.Term.str.split("(").str[0].str.split(',').str[0].str[:50]
ax = dotplot(enr.res2d.replace([np.inf, -np.inf], 0).dropna(),
column="Adjusted P-value",
#title='normal fibro PAGE4',
cmap=plt.cm.viridis,
size=4, # adjust dot size
top_term=20,
figsize=(4,10), cutoff=0.25)
/home/ml4ig1/miniconda3/envs/scprint/lib/python3.10/site-packages/gseapy/plot.py:689: FutureWarning: The 'method' keyword in Series.replace is deprecated and will be removed in a future version.
df[self.colname].replace(
/home/ml4ig1/miniconda3/envs/scprint/lib/python3.10/site-packages/gseapy/plot.py:689: FutureWarning: A value is trying to be set on a copy of a DataFrame or Series through chained assignment using an inplace method.
The behavior will change in pandas 3.0. This inplace method will never work because the intermediate object on which we are setting values always behaves as a copy.
For example, when doing 'df[col].method(value, inplace=True)', try using 'df.method({col: value}, inplace=True)' or df[col] = df[col].method(value) instead, to perform the operation inplace on the original object.
df[self.colname].replace(
G = grn.plot_subgraph(grn.var[grn.var['cluster_1.5']=='4'].index.tolist(), only=80, interactive=False)#
['#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4'] Index(['TRHDE', 'PGR', 'ESR1', 'RASD2', 'SYNDIG1', 'OLFM2', 'FZD10', 'BHMT2',
'DUSP26', 'FOXF2', 'ADAM33', 'ADAMTS12', 'HS3ST3A1', 'CHODL', 'FTCD',
'FNDC1', 'HSPB3', 'FOXL1', 'AGMO', 'ADAMTSL2', 'S100A6', 'AKR1B10',
'SCGB1C2'],
dtype='object', name='symbol_2')
pnet = pnx.Network(notebook=True, cdn_resources="remote", directed=True)
pnet.from_nx(G)
#first_node = list(G.nodes)[-1]
#pnet.get_node(first_node)['color'] = "red"
pnet.save_graph("../figures/pyvis/grn_4.html")
enr = gp.enrichr(gene_list=list(G.nodes),
gene_sets=['GO_Molecular_Function_2023', 'MSigDB_Hallmark_2020', 'Reactome_2022', 'Tabula_Sapiens', 'WikiPathway_2023_Human', 'TF_Perturbations_Followed_by_Expression', 'Reactome', 'PPI_Hub_Proteins', 'OMIM_Disease', 'GO_Molecular_Function_2023'],
organism='Human', # change accordingly
#description='pathway',
#cutoff=0.08, # test dataset, use lower value for real case
background=grn.var.symbol.tolist(),
)
enr.res2d.head(20)
2024-07-25 18:34:27,578 [WARNING] Input library not found: Reactome. Skip
| Gene_set | Term | P-value | Adjusted P-value | Old P-value | Old adjusted P-value | Odds Ratio | Combined Score | Genes | |
|---|---|---|---|---|---|---|---|---|---|
| 0 | GO_Molecular_Function_2023 | RNA Polymerase II General Transcription Initia... | 0.000032 | 0.000933 | 0 | 0 | inf | inf | FOXF2;ESR1 |
| 1 | GO_Molecular_Function_2023 | Estrogen Response Element Binding (GO:0034056) | 0.000032 | 0.000933 | 0 | 0 | inf | inf | PGR;ESR1 |
| 2 | GO_Molecular_Function_2023 | General Transcription Initiation Factor Bindin... | 0.000095 | 0.001860 | 0 | 0 | 378.666667 | 3508.819727 | FOXF2;ESR1 |
| 3 | GO_Molecular_Function_2023 | Transcription Coactivator Binding (GO:0001223) | 0.000188 | 0.002780 | 0 | 0 | 189.285714 | 1623.428489 | PGR;ESR1 |
| 4 | GO_Molecular_Function_2023 | Transition Metal Ion Binding (GO:0046914) | 0.000446 | 0.005263 | 0 | 0 | 13.515099 | 104.272240 | BHMT2;AGMO;ADAM33;TRHDE |
| 5 | GO_Molecular_Function_2023 | Metalloendopeptidase Activity (GO:0004222) | 0.000551 | 0.005414 | 0 | 0 | 22.794231 | 171.058857 | ADAMTSL2;ADAM33;ADAMTS12 |
| 6 | GO_Molecular_Function_2023 | Metallopeptidase Activity (GO:0008237) | 0.000672 | 0.005666 | 0 | 0 | 21.155357 | 154.536591 | ADAMTSL2;ADAM33;ADAMTS12 |
| 7 | GO_Molecular_Function_2023 | Transcription Coregulator Binding (GO:0001221) | 0.001111 | 0.008195 | 0 | 0 | 54.013605 | 367.419603 | PGR;ESR1 |
| 8 | GO_Molecular_Function_2023 | DNA-binding Transcription Activator Activity, ... | 0.001429 | 0.009365 | 0 | 0 | 15.972973 | 104.640056 | FOXF2;PGR;ESR1 |
| 9 | GO_Molecular_Function_2023 | ATPase Binding (GO:0051117) | 0.002016 | 0.011893 | 0 | 0 | 37.780952 | 234.495632 | PGR;ESR1 |
| 10 | GO_Molecular_Function_2023 | Zinc Ion Binding (GO:0008270) | 0.002284 | 0.012250 | 0 | 0 | 13.407955 | 81.545835 | BHMT2;ADAM33;TRHDE |
| 11 | GO_Molecular_Function_2023 | Cis-Regulatory Region Sequence-Specific DNA Bi... | 0.004517 | 0.021203 | 0 | 0 | 6.945569 | 37.504924 | FOXF2;PGR;FOXL1;ESR1 |
| 12 | GO_Molecular_Function_2023 | RNA Polymerase II Cis-Regulatory Region Sequen... | 0.005523 | 0.021203 | 0 | 0 | 6.541596 | 34.009122 | FOXF2;PGR;FOXL1;ESR1 |
| 13 | GO_Molecular_Function_2023 | DNA Binding, Bending (GO:0008301) | 0.005750 | 0.021203 | 0 | 0 | inf | inf | FOXL1 |
| 14 | GO_Molecular_Function_2023 | TBP-class Protein Binding (GO:0017025) | 0.005750 | 0.021203 | 0 | 0 | inf | inf | ESR1 |
| 15 | GO_Molecular_Function_2023 | [Heparan Sulfate]-Glucosamine 3-Sulfotransfera... | 0.005750 | 0.021203 | 0 | 0 | inf | inf | HS3ST3A1 |
| 16 | GO_Molecular_Function_2023 | RNA Polymerase II Transcription Regulatory Reg... | 0.008164 | 0.028333 | 0 | 0 | 5.812950 | 27.949028 | FOXF2;PGR;FOXL1;ESR1 |
| 17 | GO_Molecular_Function_2023 | Endopeptidase Activity (GO:0004175) | 0.008882 | 0.028593 | 0 | 0 | 8.021918 | 37.893119 | ADAMTSL2;ADAM33;ADAMTS12 |
| 18 | GO_Molecular_Function_2023 | DNA Binding (GO:0003677) | 0.009208 | 0.028593 | 0 | 0 | 7.911486 | 37.086673 | FOXF2;PGR;FOXL1 |
| 19 | GO_Molecular_Function_2023 | Heparan Sulfate Sulfotransferase Activity (GO:... | 0.011468 | 0.030756 | 0 | 0 | 180.727273 | 807.520606 | HS3ST3A1 |
enr.res2d.Term = enr.res2d.Term.str.split("(").str[0].str.split(',').str[0].str[:50]
ax = dotplot(enr.res2d.replace([np.inf, -np.inf], 0).dropna(),
column="Adjusted P-value",
#title='normal fibro PAGE4',
cmap=plt.cm.viridis,
size=4, # adjust dot size
top_term=20,
figsize=(4,10), cutoff=0.25)
/home/ml4ig1/miniconda3/envs/scprint/lib/python3.10/site-packages/gseapy/plot.py:689: FutureWarning: The 'method' keyword in Series.replace is deprecated and will be removed in a future version.
df[self.colname].replace(
/home/ml4ig1/miniconda3/envs/scprint/lib/python3.10/site-packages/gseapy/plot.py:689: FutureWarning: A value is trying to be set on a copy of a DataFrame or Series through chained assignment using an inplace method.
The behavior will change in pandas 3.0. This inplace method will never work because the intermediate object on which we are setting values always behaves as a copy.
For example, when doing 'df[col].method(value, inplace=True)', try using 'df.method({col: value}, inplace=True)' or df[col] = df[col].method(value) instead, to perform the operation inplace on the original object.
df[self.colname].replace(
4. Differential Gene - Gene connections¶
- look at the connections of some key TFs in here
- do GSEA over these connections
- compare the connections of these same TFs for both conditions
G = grn_c.plot_subgraph("PAGE4", only=100, max_genes=40, interactive=False)#
['#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#ff7f0e'] Index(['MT2A', 'HSPA1A', 'CREM', 'PTN', 'HLA-A', 'SPARCL1', 'SPOCK3', 'NR4A3',
'APOD', 'CALD1', 'MGP', 'HSPE1', 'EIF4A1', 'CD99', 'EMP1', 'FABP4',
'NAMPT', 'NNMT', 'RBP1', 'LUM', 'IGFBP7', 'THBS1', 'YBX3', 'NAF1',
'DCN', 'A2M', 'CPE', 'C1R', 'ATP6V0C', 'MATN2', 'COL6A2', 'UBE2S',
'CST3', 'RHOQ', 'UBE2C', 'C1S', 'H2AZ1', 'GPRC5A', 'HLA-C', 'LGALS1',
'PAGE4'],
dtype='object', name='symbol_2')