GEO dataset integration and immune landscape of CD    
    We constructed a combined dataset covering 279 CD samples and    224 control samples from mucosa after the removal of batch    effects (Fig.2A,B). A broadly    uncoordinated immune response is an indispensable hallmark of    CD. With the aim of revealing the immune landscape, we scored    the immune cell infiltration of CD patients and controls via    the ssGSEA method. As illustrated in Fig.2C, the infiltration of    20 immune cells in the CD group and control group was    significantly different, among which only the scores of T    helper 17 (Th17) cells were lower in CD tissues than in control    tissues. We then performed a correlation analysis of distinct    immune cells, as shown in Fig.2D. Interestingly, Th17    cells, CD56bright natural killer (NK) cells, CD56dim NK cells    and monocytes showed inverse correlations with almost all other    immune cells, whereas the other immune cells were generally    positively correlated with one another, which deserves special    attention.  
            GEO dataset combination and immune landscape of CD.            (A) PCA between datasets before removal of batch            effects. (B) PCA between integrated datasets            after removal of batch effects. (C) Infiltration            levels of 28 immune cell subtypes in CD samples and            controls. The blue bars represent controls, and the red            bars represent CD samples. *p<0.05;            **p<0.01; ***p<0.001;            ****p<0.0001. (D) Pearson            correlation analysis of distinct immune cells. The            purple squares represent positive correlations, and the            orange squares represent inverse correlations.            GEO Gene Expression Omnibus, CD Crohns            disease, PCA principal component analysis.          
    A total of 1265 DEGs, consisting of 592 upregulated and 673    downregulated genes, were identified through differential    expression analysis (Fig.3A). A list of possible    PRGs was produced from previous research (Supplementary file    1: Table    S1). Subsequently, we    intersected the 1265 DEGs with 930 PRGs via a Venn diagram;    thus, 130 DE-PRGs were identified (Fig.3B), which were further    grouped in a heatmap (Fig.3C). The overall    expression of these DE-PRGs in the CD group and control group    is shown in Supplementary file 3: Fig.    S1. We could conclude    that the vast majority of DE-PRGs were expressed at higher    levels in CD tissues than in control tissues.  
            Identification of DE-PRGs. (A) Volcano map of            the DEGs with the cutoff threshold set at |log2 (fold            change)|>1 and adj. p<0.05. The blue            dots represent downregulated DEGs, the red dots            represent upregulated DEGs, and the gray dots represent            genes with no significant difference. (B) Venn            diagram of DEGs and PRGs. Pink circle represents DEGs,            blue circle represents PRGs, and their overlapping area            represents DE-PRGs. (C) Clustered heatmap of the            top 40 DE-PRGs. Each row represents one of the top 40            DE-PRGs, and each column represents one sample, either            CD or normal. DE-PRGs differentially expressed            PANoptosis-related genes, DEGs differentially            expressed genes, PRGs PANoptosis-related genes,            CD Crohns disease.          
    We then examined the latent functions and signaling pathways of    the DE-PRGs. GO analysis revealed that these DE-PRGs were    predominantly involved in regulation of apoptotic signaling    pathway, leukocyte cellcell adhesion, regulation of    inflammatory response (biological process); membrane raft,    membrane microdomain, focal adhesion (cellular component);    ubiquitin-like protein ligase binding, ubiquitin protein ligase    binding, and phosphatase binding (molecular function)    (Supplementary file 4: Fig.    S2A). Additionally,    DE-PRGs were notably enriched in apoptosis, proteoglycans in    cancer, NOD-like receptor signaling pathway, among others,    according to the KEGG results (Supplementary file    4: Fig.    S2B). Moreover, a PPI    network analysis of the DE-PRGs was performed and a complex    network of the DE-PRGs was constructed (Supplementary file    5: Fig.    S3).  
    To screen the hub DE-PRGs, we first capitalized on three    algorithms, LASSO, SVM and RF, and discovered 20, 34 and 33    potential hub DE-PRGs, respectively (Fig.4AE). Afterward, 10    hub DE-PRGs were identified through the intersection of the    machine learning results, namely CD44, cell death inducing DFFA    like effector c (CIDEC), N-myc downstream regulated 1 (NDRG1),    nuclear mitotic apparatus protein 1 (NUMA1), proliferation and    apoptosis adaptor protein 15 (PEA15), recombination activating    1 (RAG1), S100 calcium binding protein A8 (S100A8), S100    calcium binding protein A9 (S100A9), TIMP metallopeptidase    inhibitor 1 (TIMP1) and X-box binding protein 1 (XBP1)    (Fig.4F). Next, we probed    their interactions, as shown in Fig.4G. Most hub DE-PRGs,    such as CD44, PEA15, S100A8, S100A9, TIMP1 and XBP1, were    closely interrelated. Moreover, NDRG1, NUMA1 and RAG1 generally    presented antagonistic effects on the other hub DE-PRGs.    Finally, the diagnostic value of each hub DE-PRG in predicting    CD was calculated based on our combined dataset    (Fig.4H). All 10 hub DE-PRGs    exhibited outstanding predictive performance with area under    the curve (AUC) values greater than 0.740. Notably, the AUC    reached as high as 0.871 when the 10 hub DE-PRGs were combined    (Fig.4H). In addition, we    conducted external validation on the GSE102133 and GSE207022    datasets, respectively. The results were satisfactory, with    high AUC values (Supplementary file 6: Fig.    S4).  
            Identification of the hub DE-PRGs. (A)            Cross-validations of adjusted parameter selection in            the LASSO model. Each curve corresponds to one gene.            (B) LASSO coefficient analysis. Vertical dashed            lines are plotted at the best lambda. (C) SVM            algorithm for hub gene selection. (D)            Relationship between the number of random forest trees            and error rates. (E) Ranking of the relative            importance of genes. (F) Venn diagram showing            the 10 hub DE-PRGs identified by LASSO, SVM and RF.            Pink circle represents potential hub DE-PRGs identified            by RF, blue circle represents potential hub DE-PRGs            identified by SVM, green circle represents potential            hub DE-PRGs identified by LASSO, and their overlapping            area represents the final hub DE-PRGs. (G) Chord            diagram showing the correlations between the hub            DE-PRGs. Red represents positive correlations between            different genes and green represents negative            correlations between different genes. (H) ROC            curves of the hub DE-PRGs in CD diagnosis.            DE-PRGs differentially expressed            PANoptosis-related genes, LASSO least absolute            shrinkage and selection operator, RF random            forest, SVM support vector machine, ROC            receiver operating characteristic, AUC area            under the curve, CD Crohns disease.          
    Spearman correlation analysis was carried out to determine the    interactions between the hub DE-PRGs and immune cells    (Fig.5). CD44, PEA15,    S100A8, S100A9, TIMP1 and XBP1 demonstrated noteworthy positive    correlations with the infiltration of an abundance of immune    cells, except for certain immune cells, such as monocytes and    CD56bright NK cells. In contrast, NDRG1, NUMA1, and RAG1 were    negatively associated with most types of immune cells,    excluding a few immune cells such as monocytes. In addition,    the CIDEC fell somewhere between these two extremes.  
            Spearman correlation analysis of hub DE-PRGs with            immune cells. The correlations between CD44 (A),            CIDEC (B), NDRG1 (C), NUMA1 (D),            PEA15 (E), RAG1 (F), S100A8 (G),            S100A9 (H), TIMP1 (I) and XBP1 (J)            gene expressions with immune cells, respectively. The            size of the dots represents the strength of gene            correlation with immune cells; the larger the dot, the            stronger the correlation. The color of the dots            represents the p-value; the greener the color, the            lower the p-value. p<0.05 was considered            statistically significant. DE-PRGs            differentially expressed PANoptosis-related genes.          
    The top 30 crucial genes related to CD were extracted from the    GeneCards database, and their expression levels were compared    between CD samples and normal samples (Fig.6A). We could easily    conclude that a majority of the CD-related genes (26 out of 30)    were differentially expressed, especially COL1A1, CTLA4, IL10    and NOD2. Pearson correlation analysis was subsequently    conducted to scrutinize the relationships between these    CD-related genes and the hub DE-PRGs (Fig.6B). Notably, CTLA4,    one of the most differentially expressed CD-related genes, was    significantly associated with each hub DE-PRG. COL1A1, IL10 and    NOD2 also presented varying levels of correlation with the hub    DE-PRGs. Nevertheless, there were no significant correlations    between the hub DE-PRGs and some CD-related genes, including    CYBB, IL10RA, RET and VCP.  
            Expression levels of the top 30 CD-related genes and            relationships between them and hub DE-PRGs. (A)            Boxplot of the top 30 crucial genes in relation to CD.            The blue bars represent controls, and the red bars            represent CD samples. (B) Pearson correlation            analysis between the top 30 CD-related genes and the 10            hub DE-PRGs. *p<0.05; **p<0.01;            ***p<0.001. CD Crohns disease,            DE-PRGs differentially expressed            PANoptosis-related genes.          
    Subsequently, a genemiRNA interaction network of the 10 hub    DE-PRGs consisting of 226 nodes and 338 edges was constructed    (Supplementary file 7: Fig.    S5 and Supplementary    file 8: Table    S3). Apparently,    miR-124-3p, miR-34a-5p and miR-27a-3p were most strongly    associated with the hub DE-PRGs in CD. After that, we generated    a geneTF regulatory network of the 10 hub DE-PRGs    (Supplementary file 9: Fig.    S6). The 10 hub    DE-PRGs were regulated by 35 total TFs. Among them, FOXC1 was    found to regulate as many as 7 hub DE-PRGs and S100A8 was    regulated by 13 miRNAs (Supplementary file 10: Table    S4). In addition, we    looked for available drugs that act on the hub DE-PRGs, and a    host of drugs were involved (Supplementary file 11: Fig.    S7 and Supplementary    file 12: Table    S5). Specifically, a    total of 19 drugs interacted with XBP1, 8 of which inhibited    it.  
    To distinguish different PANoptosis patterns in CD patients, we    adopted the NMF method for unsupervised clustering on the basis    of the 10 hub DE-PRGs. At k=2, the most stable and optimal    PANclusters were identified (Fig.7A). There were 101 and    178 CD samples in PANcluster A and PANcluster B, respectively.    The geometrical distance between the two clusters is shown in    Fig.7B, validating their    gene expression heterogeneity. Thereafter, a boxplot and a    heatmap were generated to compare the expression levels of the    hub DE-PRGs between PANcluster A and PANcluster B    (Fig.7C,D). Specifically,    PANcluster A was distinguished by the considerably high    expression levels of CIDEC, NDRG1, NUMA1 and RAG1, while the    other hub DE-PRGs, that is, CD44, PEA15, S100A8, S100A9, TIMP1    and XBP1, were expressed at higher levels in PANcluster B.  
            Recognition of PANclusters in CD. (A)            Unsupervised clustering matrix generated using NMF            method when k=2. (B) PCA plot showing the            distribution of PANcluster A and PANcluster B. The red            dots represent PANcluster A and the blue dots represent            PANcluster B. (C) Boxplot of the expression            levels of the hub DE-PRGs in PANcluster A and            PANcluster B. The red bars represent PANcluster A, and            the blue bars represent PANcluster B. (D)            Heatmap of the expression levels of the hub DE-PRGs in            PANcluster A and PANcluster B. Each row represents one            hub DE-PRG, and each column represents one CD sample.            PANclusters PANoptosis patterns, CD            Crohns disease, NMF nonnegative matrix            factorization, PCA principal component analysis,            DE-PRGs differentially expressed            PANoptosis-related genes.          
    GSVA was performed with the aim of shedding light on the    functional diversity patterns of the recognized PANclusters.    With regard to Hallmark pathways, increased activity of p53    pathway, androgen response and hypoxia were detected in    PANcluster A, whereas mTORC1 signaling, inflammatory response,    TNF- signaling via NF-B, IL-6/JAK/STAT3 signaling and    epithelial mesenchymal transition were increased in PANcluster    B (Supplementary file 13: Fig.    S8A). In addition,    results from the KEGG analysis suggested that PANcluster A had    hypoactive ECMreceptor interaction and endocytosis but    expressed high levels of genes associated with    cytokinecytokine receptor interaction and numerous signaling    pathways, including toll-like receptor signaling pathway and    NOD-like receptor signaling pathway (Supplementary file    13: Fig.    S8B). Concerning the    Reactome-based pathways, PANcluster A showed an increase in the    cell cycle pathway, while most pathways, such as cytokine    signaling in immune system and extracellular matrix-related    pathways, were significantly enriched in PANcluster B    (Supplementary file 13: Fig.    S8C).  
    To clarify the disparities in the immune system among the    PANclusters, we compared their immune microenvironments, as    shown in Fig.8A. Remarkably, the    enrichment scores of 26 immune cells were much greater in    PANcluster B than in PANcluster A. Consequently, CD56bright NK    cells and monocytes were the only two exceptions with higher    infiltration degrees in PANcluster A, the explanations behind    which demand further investigation. In addition, differential    gene analysis revealed 533 DEGs, including 171 upregulated and    362 downregulated genes (Fig.8B). To learn more    about the biological functions and processes linked to these    DEGs, GO and KEGG analyses were performed. The 533 DEGs were    markedly enriched in the following terms: positive regulation    of cell adhesion, leukocyte cellcell adhesion, and    extracellular matrix organization (biological process);    collagen-containing extracellular matrix, secretory granule    membrane, and basement membrane (cellular component); and    extracellular matrix structural constituent, glycosaminoglycan    binding, and integrin binding (molecular function)    (Fig.8C,D). Moreover, the    533 DEGs were principally involved in many pathways, such as    cell adhesion molecules, ECMreceptor interaction and PI3K-Akt    signaling pathway (Fig.8E).  
            Characterization of different PANclusters. (A)            Infiltration levels of 28 immune cell subtypes in            PANclusters A and B. The red bars represent PANcluster            A, and the blue bars represent PANcluster B. (B)            Volcano map of DEGs between PANclusters A and B. The            blue dots represent downregulated DEGs, the red dots            represent upregulated DEGs, and the gray dots represent            genes with no significant difference.            (C,D) Enriched items in GO analysis based            on the DEGs between PANclusters A and B. (E)            Enriched items in KEGG analysis based on the DEGs            between PANclusters A and B. Node color indicates gene            expression level; quadrilateral color indicates            z-score. PANclusters PANoptosis patterns,            DEGs differentially expressed genes, BP            biological process, CC cellular component,            MF molecular function, GO Gene Ontology,            KEGG Kyoto Encyclopedia of Genes and Genomes.          
    CD and control samples were acquired from 10 patients who were    diagnosed with CD, and their demographic and clinical    information is presented in Table 1. qRT-PCR was    subsequently conducted to determine the relative expression    levels of the 10 hub DE-PRGs (Fig.9A). As expected, the    levels of CD44, PEA15, S100A8, S100A9, TIMP1 and XBP1 increased    in CD samples compared with those in control samples; while the    opposite trend was observed for NDRG1. Moreover, there was no    significant difference in the mRNA expression levels of CIDEC,    NUMA1 or RAG1. Furthermore, we established classic TNBS and DSS    mouse models of CD and collected colon tissues to analyze the    expression levels of the hub DE-PRGs in murine colon tissues    from the TNBS, DSS and control groups (Fig.9B,C). Generally, the    results of the TNBS model were in line with expectations.    Specifically, in TNBS-induced colitis, Cd44, Numa1, S100a8,    S100a9, Timp1 and Xbp1 were more highly expressed, while Cidec    and Rag1 were less expressed. In addition, the levels of Ndrg1    and Pea15a did not significantly differ between the TNBS group    and the control group. Consistent with previous work, in the    DSS mouse model, the expression levels of Cd44, S100a8, S100a9    and Timp1 were greater in the mice with colitis; while the    expression level of Ndrg1 was lower in the mice with colitis.    In addition, no significant difference in the expression levels    of Cidec, Pea15a or Xbp1 was detected. Unexpectedly, the    expression levels of Numa1 and Rag1 in the DSS group were    different from those in the CD and TNBS colitis groups.  
            qRT-PCR validation of the hub DE-PRGs in CD patients            (A), TNBS-induced colitis model (B) and            DSS-induced colitis model (C). The blue dots            represent the normal/control tissues, and the red dots            represent the diseased tissues. qRT-PCR            quantitative real-time PCR, DE-PRGs            differentially expressed PANoptosis-related genes,            CD Crohns disease, TNBS            2,4,6-trinitrobenzene sulfonic acid, DSS dextran            sodium sulfate, GAPDH glyceraldehyde-3-phosphate            dehydrogenase.          
Continued here:
Characterization of PANoptosis-related genes in Crohn's disease by integrated bioinformatics, machine learning and ... - Nature.com