GTEx database download

2024-03-14 03:28:21

Introduction to the GTEx database (3): obtaining the data - Zhihu

Introduction to the GTEx database (3): obtaining the data

Author: HuaMD (sharing knowledge on medical big data)

Medical big data and its integrated analysis (4). Produced by Hua+ Medical Big Data (please cite the source link when reposting). HuaPlusMD integrates multiple human and animal databases into a large curated resource and offers combined analysis of disease animal models and clinical big data. Link: https://www.huaplusmd.com

Preface: The concept of "big data" has been around for a long time, but how much do we actually know about (medical) big data? This series introduces medical big data systematically and shares integrated analyses (updated weekly). Topics include large databases (gene and protein databases) and biobanks (UK Biobank, Finnish Biobanks, China Kadoorie Biobank, BioBank Japan, TCGA, GWAS Catalog and others), disease animal model databases (such as GeneNetwork and BXD), combined use of large databases (such as Mendelian randomization), and omics data analysis, together with periodic worked examples. Other posts in the series: https://www.huaplusmd.com/knowledge

This post gives a brief introduction to downloading and using GTEx data. The main advantage of GTEx is that it provides gene expression across a wide range of human tissues and organs. In research or drug development we usually want a drug or intervention to act in a specific tissue or organ so that side effects are reduced. In obesity research, for example, the focus is often on adipose tissue. Most databases do not provide tissue-specific gene expression, and for human data in particular such a resource is rare.

How to obtain data from GTEx:

1. Open the GTEx Portal, https://gtexportal.org/home/, and click Download >> Open Access Data.

2. On the download page, the left-hand panel (highlighted with a red box in the original screenshot) lists the analysis releases. Any of them can be used, but V8 and V9 are recommended. V9 currently provides only snRNA-Seq data (single-nucleus RNA sequencing) and Long Read RNASeq data (long-read transcriptomes, mainly used to study the role of genetic variation in transcript structure).

3. The V8 release is the main focus here. The V8 data mainly include:
1) RNA-seq BAM files, whole-exome sequencing and whole-genome sequencing
2) Genotype calls
3) OMNI SNP array files
4) Affymetrix expression arrays, and others

4. Annotation files (Annotations): the files highlighted in the original screenshot are sufficient. They describe the basic sample information, including sample ID, tissue or organ type, RIN, and the technology used.

5. RNA-seq data: the data used most often. They include read counts, TPM, exon-exon junction read counts, transcript read counts/TPM, and exon read counts. The data can also be downloaded per tissue (as read counts or TPM).

6. GTEx has also run many QTL analyses (readers unfamiliar with QTLs can refer back to the earlier post in this series introducing eQTL, cis-eQTL and trans-eQTL), including Single-Tissue cis-QTL Data, Single-Tissue trans-QTL Data, Multi-Tissue QTL Data, Single Tissue cis-RNA Editing QTL Data, and others.

Original link: https://www.huaplusmd.com/knowledge

Reference: [1] https://gtexportal.org/home/

Published 2022-12-21 10:09
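The per-tissue read-count and TPM files mentioned above are distributed in GCT format: a version line, a line giving the matrix dimensions, then a tab-separated table whose first two columns are Name and Description. The following is a minimal base-R sketch that writes a toy GCT file and reads it back. The helper name read_gct and every value in the toy file are made up for illustration; this is not code from the GTEx portal.

```r
# Hypothetical helper: read a GCT file into an expression matrix plus an ID-to-symbol map.
read_gct <- function(path) {
  # Skip the version line ("#1.2") and the dimension line
  tab <- read.delim(path, skip = 2, header = TRUE, check.names = FALSE,
                    stringsAsFactors = FALSE)
  # Columns 1-2 are Name and Description; the remaining columns are samples
  expr <- as.matrix(tab[, -(1:2), drop = FALSE])
  rownames(expr) <- tab$Name
  list(expr = expr, symbols = setNames(tab$Description, tab$Name))
}

# Toy GCT content (all IDs and counts are made up)
toy <- c(
  "#1.2",
  "2\t2",
  "Name\tDescription\tGTEX-TOY-0001\tGTEX-TOY-0002",
  "ENSG_TOY_1.1\tGENE1\t0\t3",
  "ENSG_TOY_2.1\tGENE2\t187\t109"
)
path <- tempfile(fileext = ".gct")
writeLines(toy, path)

gct <- read_gct(path)
print(dim(gct$expr))
print(gct$expr["ENSG_TOY_2.1", "GTEX-TOY-0001"])
```

Setting check.names = FALSE keeps the hyphens in the sample IDs, so they can be matched against the annotation file without any string substitution.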

2021-04-29 TCGA and GTEx data download - Jianshu

Author: 学习生信的小兔子

Database background

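TCGA, full name: The Cancer Genome Atlas

Website: https://cancergenome.nih.gov/

A cancer research programme established jointly by the National Cancer Institute (NCI) and the National Human Genome Research Institute (NHGRI), which collects and curates many types of cancer-related omics data. The Cancer Genome Atlas (TCGA) has quantified gene expression levels in >12000 samples from >33 cancer types.

TCGA generally profiles two kinds of samples. For stomach cancer, for example, these are tumor tissue and adjacent normal (paracancerous) tissue.

GTEx, full name: The Genotype-Tissue Expression (GTEx) project

The project was first described in 2013, when more than a hundred scientists jointly published a paper in Nature Genetics introducing the Genotype-Tissue Expression project and established the Genotype-Tissue Expression Consortium (hereafter "GTEx"). The GTEx has catalogued gene expression in >9,000 samples across 53 tissues from 544 healthy individuals.

In practice, however, the sequencing data for cancer tissue are usually downloaded from UCSC Xena.

Two small questions

Question 1: If the cancer sequencing data live in TCGA, why download them from UCSC rather than from the TCGA website?

1. UCSC integrates resources from several public cancer databases, so the data are easy to organize. Downloading from the TCGA website is cumbersome, both at the download step and when assembling the downloaded files afterwards.

2. TCGA's official site is hosted in the United States, and under normal circumstances we cannot even open Google, so...

3. UCSC has built many platforms for bioinformatics analysis, including the UCSC Genome Browser that you will meet later, so data downloaded from UCSC are widely accepted and reviewers are unlikely to question them when you publish.

Question 2: If TCGA already has adjacent normal data, why also download normal-tissue data from GTEx?

1. Not every tumor type has adjacent normal data; osteosarcoma, for example, has none.

2. Most importantly, TCGA has too few adjacent normal samples. Stomach cancer has only about 100 adjacent normal samples against more than 400 tumor samples.

3. Sometimes it is also done to make the figures look better.

Data download

Download the stomach cancer RNA-Seq data from TCGA and the normal-tissue RNA-Seq data from GTEx through the UCSC Xena website.

Open the UCSC Xena website: https://xenabrowser.net/

Click DATA SETS: https://xenabrowser.net/datapages/

Select GDC TCGA Stomach Cancer (STAD) (15 datasets): https://xenabrowser.net/datapages/?cohort=GDC%20TCGA%20Stomach%20Cancer%20(STAD)&removeHub=https%3A%2F%2Fxena.treehouse.gi.ucsc.edu%3A443

Download the stomach cancer RNA-Seq data from TCGA

Under gene expression RNAseq, open HTSeq - FPKM (n=407) GDC Hub.

How do we tell tumor tissue from adjacent normal tissue? For that we need to download the clinical information.

Under phenotype, open Phenotype (n=544) GDC Hub.

Under phenotype, open survival data (n=502) GDC Hub.

Download the normal-tissue RNA-Seq data from GTEx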

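Data integration

#### Show error messages in English so they are easier to search for
Sys.setenv(LANGUAGE = "en")
# Do not convert character columns to factors
options(stringsAsFactors = FALSE)
### Clear the workspace
rm(list=ls())
## Load packages; data.table provides fread() for fast file reading
library(data.table)
library(dplyr)
library(tidyverse)

### Read the stomach cancer clinical (phenotype) information
stad.phe=fread("TCGA-STAD.GDC_phenotype.tsv",header = T, sep = '\t',data.table = F)
### Check the class
class(stad.phe)#"data.frame"
### Read the expression profile (RNA-seq data)
stad.fkpm=fread("TCGA-STAD.htseq_fpkm.tsv",header = T, sep = '\t',data.table = F)
### Inspect the column names to decide which column to merge on
colnames(stad.fkpm)
#### Read the probe map, used to convert Ensembl IDs to gene names
stad.pro=fread("gencode.v22.annotation.gene.probeMap",header = T, sep = '\t',data.table = F)
## Inspect the data
colnames(stad.pro)
### Only the first two columns (id and gene) are needed for the conversion
stad.pro=stad.pro[,c(1,2)]
colnames(stad.pro)

### Merge the probe map with the expression profile
stad.fkpm.pro=merge(stad.pro,stad.fkpm,by.y ="Ensembl_ID",by.x = "id" )
dim(stad.fkpm.pro) # 60483 409
# Gene names are still duplicated here, so they cannot be used as row names yet;
# remove the duplicates first
stad.fkpm.pro=distinct(stad.fkpm.pro,gene,.keep_all = T)
dim(stad.fkpm.pro) # after removing duplicates: 58387 409
stad.fkpm.pro <- column_to_rownames(stad.fkpm.pro,"gene")# move the gene column into the row names
## The expression matrix is now built

### TCGA contains both tumor and adjacent normal samples, so separate them using the clinical information
View(stad.phe) # sample IDs in the clinical table match the column names of the expression matrix
x=stad.phe$submitter_id.samples
rownames(stad.phe)=stad.phe$submitter_id.samples
## In the sample_type.samples column, Primary Tumor is tumor and Solid Tissue Normal is adjacent normal
x2=stad.phe$sample_type.samples
table(stad.phe$sample_type.samples)# 443 tumor samples, 101 adjacent normal samples
## Extract tumor and adjacent normal samples
## (base subsetting is used so the row names are kept regardless of dplyr version)
stad.phe.t=stad.phe[which(stad.phe$sample_type.samples=="Primary Tumor"),]
stad.phe.n=stad.phe[which(stad.phe$sample_type.samples=="Solid Tissue Normal"),]
# The sample table has 544 rows but the expression matrix has 408 columns,
# so more than 100 samples have no expression data
# Intersect the clinical information with the expression matrix
z1=intersect(rownames(stad.phe.t),colnames(stad.fkpm.pro))# tumor samples with expression data: 375
z2=intersect(rownames(stad.phe.n),colnames(stad.fkpm.pro))# adjacent normal samples with expression data: 32
stad.t=stad.fkpm.pro[,z1]## expression matrix of all tumor samples
stad.n=stad.fkpm.pro[,z2]## expression matrix of all adjacent normal samples
## Rename the columns of stad.t and stad.n so they are easier to read
colnames(stad.n)=paste0("N",seq_along(z2))# adjacent normal: N1..N32
colnames(stad.t)=paste0("T",seq_along(z1))# tumor: T1..T375
stad.exp=merge(stad.n,stad.t,by.x = 0,by.y = 0)## merge by row names
colnames(stad.exp)
stad.exp <- column_to_rownames(stad.exp,"Row.names")#58387 407
# The TCGA expression matrix is complete
save(stad.exp,file='stad.exp.Rdata')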
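The tumor/normal split above relies on the sample_type.samples column. As a cross-check, the sample type is also encoded in the TCGA barcode itself: characters 14-15 hold a two-digit sample code, where 01 denotes a primary solid tumor and 11 a solid tissue normal. Below is a minimal base-R sketch of that lookup. The helper name tcga_sample_type is hypothetical and the barcodes are made up; only the two codes named here are handled.

```r
# Hypothetical helper: classify TCGA sample barcodes by their sample-type code.
tcga_sample_type <- function(barcodes) {
  # Characters 14-15 of the barcode carry the sample-type code
  code <- substr(barcodes, 14, 15)
  type <- rep("Other", length(barcodes))
  type[code == "01"] <- "Primary Tumor"
  type[code == "11"] <- "Solid Tissue Normal"
  type
}

# Made-up barcodes for illustration only
ids <- c("TCGA-AA-0001-01A", "TCGA-AA-0001-11A", "TCGA-BB-0002-06A")
sample_types <- tcga_sample_type(ids)
print(table(sample_types))
```

Comparing this classification with table(stad.phe$sample_type.samples) is a quick way to spot samples whose annotation and barcode disagree.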

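library(data.table)

### Read the GTEx expression matrix; fread() can read it compressed or decompressed
### (reading a .gz file directly requires the R.utils package)
## On machines with little memory this step can fail with an error
## memory.limit() is Windows-only and has no effect from R 4.2 onwards;
## the original author was unsure whether it helps
memory.limit(size=100000)
gtex.exp=fread("gtex_RSEM_gene_fpkm",header = T, sep = '\t',data.table = F)
save(gtex.exp,file='gtex.exp.Rdata')
dim(gtex.exp) #60498 7863
gtex.exp[1:5,1:5]

### Read the GTEx sample annotation (phenotype) information
gtex.phe=fread("GTEX_phenotype",header = T, sep = '\t',data.table = F)
## Inspect it
View(gtex.phe)

### Read the GTEx gene annotation, i.e. the probe map
gtex.pro=fread("probeMap_gencode.v23.annotation.gene.probemap",header = T, sep = '\t',data.table = F)
dim(gtex.pro) #60498 6
#### Compare v23 and v22: one has 60498 genes, the other 60483

### First merge the gene information into the GTEx matrix
### Before merging, look at the column names to find the shared key
colnames(gtex.pro)
colnames(gtex.exp)
gtex.pro=gtex.pro[,c(1,2)]
### The key is "sample" in gtex.exp and "id" in gtex.pro
head(gtex.pro)
head(gtex.exp)
gtex.exp[1:4,1:4]
gtex.fkpm.pro=merge(gtex.pro,gtex.exp,by.y ="sample",by.x = "id" )#60498 7864

### Check the overlaps to decide whether GTEx and STAD should be merged by symbol or by Ensembl ID
length(intersect(gtex.pro$id,stad.pro$id)) #42566 (probably lower because versioned Ensembl IDs differ between v22 and v23)
length(intersect(rownames(stad.exp),gtex.fkpm.pro$gene)) #57793
### Merging by gene symbol gives 57793 shared genes, merging by Ensembl ID only 42566, so merge by gene

### Now extract the expression matrix of normal stomach tissue,
### matching stomach samples through the GTEx sample annotation
colnames(gtex.phe)
rownames(gtex.phe)=gtex.phe$Sample
table(gtex.phe$`_primary_site`)# backticks are needed because the name starts with an underscore
colnames(gtex.phe)=c("Sample","body_site_detail (SMTSD)","primary_site","gender","patient","cohort")
colnames(gtex.phe)
table(gtex.phe$primary_site)#Stomach 209
## base subsetting is used so the row names are kept regardless of dplyr version
gtex.phe.s=gtex.phe[which(gtex.phe$primary_site=="Stomach"),]
x1=intersect(rownames(gtex.phe.s),colnames(gtex.fkpm.pro))
gtex.s=gtex.fkpm.pro[,c("gene",x1)]
# Gene names are duplicated, so remove duplicates before moving them into the row names
gtex.s1 <- distinct(gtex.s,gene,.keep_all = T)
gtex.s2 <- column_to_rownames(gtex.s1,"gene")
### 209 stomach samples are annotated; judging by the column count used below, 174 of them have expression data

### According to the Xena pages, GTEx values are log2(fpkm+0.001) and STAD values are log2(fpkm+1),
### so both must be put on the same scale before merging
gtex.s3=2^gtex.s2 # back to fpkm+0.001
log2(0.001) #-9.965784
gtex.s5=log2(gtex.s3-0.001+1) # now log2(fpkm+1)
colnames(gtex.s5)=paste0("G",seq_len(ncol(gtex.s5)))# G1..G174

### Both datasets are now on the same scale and comparable; merge GTEx stomach data with the TCGA data
all.data=merge(gtex.s5,stad.exp,by= 0) #57793 582
all.data <- column_to_rownames(all.data,"Row.names")
save(all.data,file = 'all.data.Rdata')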
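The rescaling step above follows from inverting the first transform: if x = log2(FPKM + 0.001), then FPKM = 2^x - 0.001, and the target value is log2(FPKM + 1). Here is a minimal sketch on toy numbers for a numeric vector or matrix. The function name rescale_log2_fpkm is hypothetical, and the pmax() clamp is an addition that guards against tiny negative values from floating-point rounding; it is not part of the original script.

```r
# Hypothetical helper: convert log2(FPKM + from_offset) into log2(FPKM + to_offset).
rescale_log2_fpkm <- function(x, from_offset = 0.001, to_offset = 1) {
  # Undo the first transform; clamp at zero to absorb rounding error
  fpkm <- pmax(2^x - from_offset, 0)
  log2(fpkm + to_offset)
}

# Toy FPKM values, first expressed on the log2(FPKM + 0.001) scale
toy_fpkm <- c(0, 1, 7)
toy_gtex <- log2(toy_fpkm + 0.001)

# After rescaling, the values should equal log2(toy_fpkm + 1)
rescaled <- rescale_log2_fpkm(toy_gtex)
print(round(rescaled, 6))
print(all.equal(rescaled, log2(toy_fpkm + 1)))
```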

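(I got a bit lost here too...)

library(limma)## between-sample normalization; note this is not batch-effect removal (limma::removeBatchEffect() exists for that)
nromalized.data=normalizeBetweenArrays(as.matrix(all.data))
# see ?normalizeBetweenArrays

## GSEA website: http://www.gsea-msigdb.org/gsea/index.jsp
############### An alternative: read gene sets from a .gmt file
BiocManager::install("GSEABase")# only needs to be run once
library(GSEABase)
library(clusterProfiler)
library("devtools")
install_github("GSEA-MSigDB/GSEA_R")# only needs to be run once
library(GSEA)
library(dplyr)
kegggmt2 <- read.gmt("c2.cp.kegg.v7.4.symbols.gmt")#12797 2
kegg_list = split(kegggmt2$gene, kegggmt2$term)

library(GSVA)
# see ?gsva
# method: gsva, zscore, ssgsea (ssgsea is commonly used to estimate immune infiltration)
# gsva() expects a numeric matrix; nromalized.data already is one
# This is the classic gsva() interface; newer GSVA releases use parameter objects
# such as gsvaParam(), so check ?gsva for the installed version
kegg2 <- gsva(nromalized.data, kegg_list, kcdf="Gaussian",method = "gsva",parallel.sz=0)
# mx.diff = FALSE: score is the maximum deviation of the random walk (bimodal distribution)
# mx.diff = TRUE: score is the difference between the largest positive and negative deviations (approximately normal)

######################### Custom gene sets
gene.set=read.table("5.4.GENE.SET.txt",header =F,sep = '\t',quote = '')
kegg.123=read.table("5.4.GENE.NAME.txt",header =F,sep = '\t',quote = '')
gene.set1=as.matrix(gene.set)
gene.set2=t(gene.set1) # after transposing, each column is one pathway
gmt=list() # gsva() needs the gene sets as a list
for (i in 1:19) {
y=as.character(gene.set2[,i])
# drop empty padding cells; y[-which(y=="")] would return nothing when there are no blanks
gmt[[i]]=y[y!=""]
}
names(gmt)=kegg.123$V1
gmt=gmt[-19]
View(gmt)
getwd()

library(GSVA)
# kcdf="Gaussian" suits continuous log-scale values; "Poisson" is meant for integer counts
es.dif.nromalized <- gsva(nromalized.data, gmt, mx.diff=TRUE,kcdf="Gaussian",parallel.sz=8)
es.max.nromalized <- gsva(nromalized.data, gmt, mx.diff=FALSE)
# Each sample now has an enrichment score for every pathway gene set

# see ?data.frame
annotation_col = data.frame(
Tissuetype =c(rep("Stomach",174),rep("Solid Tissue Normal",32),rep("Tumor",375)),
Database =c(rep("gtex",174), rep("TCGA",407))
)
rownames(annotation_col)=colnames(es.max.nromalized)
pheatmap::pheatmap(es.max.nromalized, # data for the heatmap
cluster_rows = F,# whether to cluster rows
cluster_cols =F,# whether to cluster columns; clustering shows how well the samples separate
annotation_col = annotation_col,
show_colnames=F,
scale = "row", # standardize by row
color =colorRampPalette(c("green", "black","red"))(100))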
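The custom gene-set step above turns a table with one pathway per row, padded with empty strings, into a named list. The following is a minimal sketch on toy data (pathway and gene names are made up, and the table layout is an assumption about the author's 5.4.GENE.SET.txt file). It also shows why filtering with y[y != ""] is safer than y[-which(y == "")]: when a pathway has no blank cells, which() returns an empty index and negative indexing with it drops every element.

```r
# Toy gene-set table: one pathway per row, short pathways padded with ""
gene.set <- data.frame(
  V1 = c("A1", "B1"),
  V2 = c("A2", "B2"),
  V3 = c("A3", ""),
  stringsAsFactors = FALSE
)
pathway.names <- c("PATHWAY_A", "PATHWAY_B")

# Build a named list of gene vectors, dropping the padding cells
gmt <- lapply(seq_len(nrow(gene.set)), function(i) {
  y <- as.character(unlist(gene.set[i, ]))
  y[y != ""]
})
names(gmt) <- pathway.names
print(gmt)

# The pitfall: a vector without blanks loses all its elements
y <- c("A1", "A2", "A3")
dropped <- y[-which(y == "")]
print(length(dropped))
```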

############### Analysis of specific genes
exprSet.all.r=all.data[c("RORC", "RORB", "RORA"),]
exprSet.all.r=t(exprSet.all.r)
exprSet.all.r=as.data.frame(exprSet.all.r)
x=c(rep("GTEX",174),rep("N",32),rep("T",375))
exprSet.all.r$Type=x

# Build one long table with columns: Relative Expression, Type, Gene
exprSet.rorc=exprSet.all.r[,c(1,4)]
exprSet.rorc$Gene=rep("RORC")
colnames(exprSet.rorc)[1]="Relative Expression"
exprSet.rorb=exprSet.all.r[,c(2,4)]
exprSet.rorb$Gene=rep("RORB")
colnames(exprSet.rorb)[1]="Relative Expression"
exprSet.rora=exprSet.all.r[,c(3,4)]
exprSet.rora$Gene=rep("RORA")
colnames(exprSet.rora)[1]="Relative Expression"
x.all=rbind(exprSet.rorc,exprSet.rorb,exprSet.rora)
colnames(x.all)

library(ggsignif)
library(ggpubr)
library(ggplot2)
# palette takes a palette name (e.g. "jco") and add takes a plot element (e.g. "jitter"), not a column name
p <- ggboxplot(x.all, x = "Gene", y = "Relative Expression",
color = "Type", palette = "jco",
add = "jitter")
p + stat_compare_means(aes(group = Type))
table(x.all$Gene)
my_comparisons <- list(c("RORA","RORB"), c("RORA","RORC"),c("RORB", "RORC"))
p +geom_signif(comparisons = my_comparisons,
step_increase = 0.2,map_signif_level = F,
test = t.test,size=0.8,textsize =4)
# see ?geom_signif

# Possible follow-up analyses: GSEA, GSVA, GO/KEGG

# Correlation between RORC and RORB expression
x.c.b=cbind(exprSet.rorc,exprSet.rorb)
colnames(x.c.b)=c("RORC","Type","Gene", "RORB","Type","Gene" )
x.c.b=x.c.b[,c(1,4)]
library(ggplot2)
library(ggpubr)
## Loading required package: magrittr
p1 <- ggplot(data = x.c.b, mapping = aes(x = RORC, y = RORB)) +
geom_point(colour = "red", size = 2) +
geom_smooth(method = lm, colour='blue', fill='gray') # add the fitted regression line
p1
p1 + stat_cor(method = "pearson", label.x = -0.4, label.y = 0.2) # add the Pearson correlation coefficient

Last edited: 2021.05.21 16:27:55

© Copyright belongs to the author. For reposting or content partnerships, please contact the author.

GTEx - Database Commons

Database Commons: a catalog of worldwide biological databases

Database Profile: GTEx

General information

URL: https://www.gtexportal.org
Full name: Genotype-Tissue Expression
Description: GTEx established a data resource and tissue bank to study the relationship between genetic variation and gene expression in multiple human tissues. This release includes genotype data from approximately 714 donors and approximately 11688 RNA-seq samples across 53 tissue sites and 2 cell lines, with adequate power to detect Expression Quantitative Trait Loci in 48 tissues.
Year founded: 2013
Last update: 2019-7-24
Version: v8
Accessibility (manual check): Accessible
Country/Region: United States

Classification & Tag

Data type: DNA, RNA
Data object: Animal
Database category: Expression; Genotype phenotype and variation
Major species: Homo sapiens
Keywords: normal tissue, tissue site, eQTL, RNA-seq

Contact information

University/Institution: Broad Institute
Address: 9000 Rockville Pike, Bethesda, Maryland 20892
City: Bethesda
Province/State: Maryland
Country/Region: United States
Contact name (PI/Team): GTEx consortium
Contact email (PI/Helpdesk): volpis@mail.nih.gov

Publications

GTEx project maps wide range of normal human genetic variation: A unique catalog and follow-up effort associate variation with gene expression across dozens of body tissues. [PMID: 29334591]

Am J Med Genet A. 2018:176(2) | 4 Citations (from Europe PMC, 2024-03-09)

Enhancing GTEx by bridging the gaps between genotype, gene expression, and disease. [PMID: 29019975]

eGTEx Project.

Abstract

Genetic variants have been associated with myriad molecular phenotypes that provide new insight into the range of mechanisms underlying genetic traits and diseases. Identifying any particular genetic variant's cascade of effects, from molecule to individual, requires assaying multiple layers of molecular complexity. We introduce the Enhancing GTEx (eGTEx) project that extends the GTEx project to combine gene expression with additional intermediate molecular measurements on the same tissues to provide a resource for studying how genetic differences cascade through molecular phenotypes to impact human health.

Nat Genet. 2017:49(12) | 92 Citations (from Europe PMC, 2024-03-09)

Human genomics. The Genotype-Tissue Expression (GTEx) pilot analysis: multitissue gene regulation in humans. [PMID: 25954001]

GTEx Consortium.

Abstract

Understanding the functional consequences of genetic variation, and how it affects complex human disease and quantitative traits, remains a critical challenge for biomedicine. We present an analysis of RNA sequencing data from 1641 samples across 43 tissues from 175 individuals, generated as part of the pilot phase of the Genotype-Tissue Expression (GTEx) project. We describe the landscape of gene expression across tissues, catalog thousands of tissue-specific and shared regulatory expression quantitative trait loci (eQTL) variants, describe complex network relationships, and identify signals from genome-wide association studies explained by eQTLs. These findings provide a systematic understanding of the cellular and biological consequences of human genetic variation and of the heterogeneity of such effects among a diverse set of human tissues.

Science. 2015:348(6235) | 2871 Citations (from Europe PMC, 2024-03-09)

Human genomics. The human transcriptome across tissues and individuals. [PMID: 25954002]

Melé M, Ferreira PG, Reverter F, DeLuca DS, Monlong J, Sammeth M, Young TR, Goldmann JM, Pervouchine DD, Sullivan TJ, Johnson R, Segrè AV, Djebali S, Niarchou A, GTEx Consortium, Wright FA, Lappalainen T, Calvo M, Getz G, Dermitzakis ET, Ardlie KG, Guigó R.

Abstract

Transcriptional regulation and posttranscriptional processing underlie many cellular and organismal phenotypes. We used RNA sequence data generated by Genotype-Tissue Expression (GTEx) project to investigate the patterns of transcriptome variation across individuals and tissues. Tissues exhibit characteristic transcriptional signatures that show stability in postmortem samples. These signatures are dominated by a relatively small number of genes—which is most clearly seen in blood—though few are exclusive to a particular tissue and vary more across tissues than individuals. Genes exhibiting high interindividual expression variation include disease candidates associated with sex, ethnicity, and age. Primary transcription is the major driver of cellular specificity, with splicing playing mostly a complementary role; except for the brain, which exhibits a more divergent splicing program. Variation in splicing, despite its stochasticity, may play in contrast a comparatively greater role in defining individual phenotypes.

Science. 2015:348(6235) | 697 Citations (from Europe PMC, 2024-03-09)

A Novel Approach to High-Quality Postmortem Tissue Procurement: The GTEx Project. [PMID: 26484571]

Carithers LJ, Ardlie K, Barcus M, Branton PA, Britton A, Buia SA, Compton CC, DeLuca DS, Peter-Demchok J, Gelfand ET, Guan P, Korzeniewski GE, Lockhart NC, Rabiner CA, Rao AK, Robinson KL, Roche NV, Sawyer SJ, Segrè AV, Shive CE, Smith AM, Sobin LH, Undale AH, Valentino KM, Vaught J, Young TR, Moore HM, GTEx Consortium.

Abstract

The Genotype-Tissue Expression (GTEx) project, sponsored by the NIH Common Fund, was established to study the correlation between human genetic variation and tissue-specific gene expression in non-diseased individuals. A significant challenge was the collection of high-quality biospecimens for extensive genomic analyses. Here we describe how a successful infrastructure for biospecimen procurement was developed and implemented by multiple research partners to support the prospective collection, annotation, and distribution of blood, tissues, and cell lines for the GTEx project. Other research projects can follow this model and form beneficial partnerships with rapid autopsy and organ procurement organizations to collect high quality biospecimens and associated clinical data for genomic studies. Biospecimens, clinical and genomic data, and Standard Operating Procedures guiding biospecimen collection for the GTEx project are available to the research community.

Biopreserv Biobank. 2015:13(5) | 423 Citations (from Europe PMC, 2024-03-09)

The Genotype-Tissue Expression (GTEx) project. [PMID: 23715323]

GTEx Consortium.

Abstract

Genome-wide association studies have identified thousands of loci for common diseases, but, for the majority of these, the mechanisms underlying disease susceptibility remain unknown. Most associated variants are not correlated with protein-coding changes, suggesting that polymorphisms in regulatory regions probably contribute to many disease phenotypes. Here we describe the Genotype-Tissue Expression (GTEx) project, which will establish a resource database and associated tissue bank for the scientific community to study the relationship between genetic variation and gene expression in human tissues.

Nat Genet. 2013:45(6) | 4159 Citations (from Europe PMC, 2024-03-09)

Ranking

All databases: 13/6000 (99.8%)
Genotype phenotype and variation: 4/852 (99.648%)
Expression: 3/1143 (99.825%)
Total Rank: 13
Citations: 8,246
z-index: 749.636


Record metadata

Created on: 2019-07-30
Curated by: Lina Ma [2019-07-31], Lina Ma [2019-07-30]

GTEx database - Jianshu

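GTEx database

Author: Hayley笔记

The GTEx project performed both transcriptome sequencing and genotyping on samples from multiple human tissues and organs, building a database of tissue-specific gene expression and regulation: Genotype-Tissue Expression (GTEx)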

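1. Background

Phase 1

In 2015 GTEx released its first stage of results, publishing three papers at once in Science, where the work was also selected as the cover story. The pilot study collected 1,641 post-mortem samples from 175 deceased donors, covering 43 tissues, and examined expression patterns for nearly all transcribed genes, which made it possible to identify specific regions of the genome that influence gene expression. Of the other two papers, one described gene expression profiles across human tissues and showed that a relatively small number of genes dominate each tissue's characteristic expression signature; the other explained how protein-truncating variants affect gene expression in tissues.

The Genotype-Tissue Expression (GTEx) pilot analysis: Multitissue gene regulation in humans

The human transcriptome across tissues and individuals

Effect of predicted protein-truncating genetic variants on the human transcriptome

Phase 2

In 2017 four papers were published at once in Nature. The GTEx Consortium collected and studied more than 7,000 post-mortem samples from 449 donors who were healthy in life, covering 44 tissues (42 distinct tissue types): 31 solid-organ tissues, 10 brain subregions, whole blood, and two cell lines derived from donor blood and skin. The authors used these samples to study how gene expression differs across tissues and individuals. The papers titled "Landscape of X chromosome inactivation across human tissues" and "Dynamic landscape and regulation of RNA editing in mammals" used GTEx data to explore how genetic variation associated with gene expression can regulate RNA editing and X chromosome inactivation.

Three main analyses were run on all samples:

RNA seq

PolyA+ libraries were built with the Illumina TruSeq kit and sequenced on HiSeq 2000/2500. Reads were aligned with STAR, using the GENCODE v19 GTF file as the reference, and quantified at the following three levels:

gene-level: raw counts and TPM for each gene, computed with RNA-SeQC

exon-level: raw counts for each exon

transcript-level: transcript-level quantification with RSEM

genotype

Samples were genotyped by WGS using the GATK germline variant calling workflow, with the following steps:

bwa-mem alignment

picard markduplicate

BQSR

indel realign

haplotypeCaller

eQTL

cis-eQTL analysis was run with FastQTL, associating genotypes with gene expression.

Gene expression and eQTL results can be browsed on the portal. Taking TP53 as an example, each gene page also reports expression at the following three levels:

Isoform Expression

Exon Expression

Junction Expression

2. Database contents and data download

The usual route is to go straight to https://gtexportal.org/ and find the datasets available for download.

The portal has now been updated to v8; v9 contains the single-cell (single-nucleus) data.

The most important file for us is the expression matrix: the gene read counts file, about 496M in size. The sample IDs in the expression matrix are defined by the database maintainers, so we also need the sample ID annotation file.

3. Data analysis

3.1 Reading the matrix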

GTEx<-read.table("GTEx_Analysis_2017-06-05_v8_RNASeQCv1.1.9_gene_reads.gct", skip = 2, header = TRUE, sep = "\t")

save(GTEx,file = 'GTEx.Rdata')

GTEx[1:4,1:4] ## rows are genes, columns are samples

# Name Description GTEX.1117F.0226.SM.5GZZ7 GTEX.1117F.0426.SM.5EGHI

# 1 ENSG00000223972.5 DDX11L1 0 0

# 2 ENSG00000227232.5 WASH7P 187 109

# 3 ENSG00000278267.1 MIR6859-1 0 0

# 4 ENSG00000243485.5 MIR1302-2HG 1 0

colnames(GTEx)

3.2 Reading the annotation

SAMPID: sample name, matching the columns of the GTEx matrix

SMTS: Tissue Type, area from which the tissue sample was taken.

SMTSD: Tissue Type, more specific detail of tissue type

a=read.table('GTEx_Analysis_v8_Annotations_SampleAttributesDS.txt',header = T,sep = '\t',quote = '')

table(a$SMTS)

3.3 Extracting the tissue of interest

Taking heart as an example:

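# read.table() turned the hyphens in sample IDs into dots, so convert them back before matching
heart_gtex=GTEx[,gsub('[.]','-',colnames(GTEx)) %in% a[a$SMTS=='Heart',1]]
rownames(heart_gtex)=GTEx[,1]
dat=heart_gtex

This simply picks out the sample names that belong to the Heart tissue and takes the corresponding subset of the expression matrix above.

Note that at this point the gene identifiers in the expression matrix are not symbols, so an ID conversion is needed.

ids=GTEx[,1:2]
head(ids)
colnames(ids)=c('probe_id','symbol')
dat=dat[ids$probe_id,]
dat[1:4,1:4]
# Median expression of each gene across samples
ids$median=apply(dat,1,median)
# Sort so that, for each symbol, the row with the highest median comes first
ids=ids[order(ids$symbol,ids$median,decreasing = T),]
# Keep one row per symbol
ids=ids[!duplicated(ids$symbol),]
dat=dat[ids$probe_id,]
rownames(dat)=ids$symbol
dat[1:4,1:4]
heart_gtex=dat
save(heart_gtex,file = 'heart_gtex_counts.Rdata')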
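The ID conversion above keeps, for every gene symbol, the Ensembl row with the highest median count. Here is a minimal sketch of the same order-then-deduplicate pattern on a toy matrix, so the effect is easy to inspect; all IDs, symbols and counts are made up.

```r
# Toy count matrix: rows are Ensembl-like IDs, columns are samples
dat <- data.frame(
  S1 = c(5, 50, 7),
  S2 = c(9, 70, 3),
  row.names = c("ENSG_A.1", "ENSG_B.1", "ENSG_C.1")
)
# Two IDs map to the same symbol (GENE1)
ids <- data.frame(
  probe_id = rownames(dat),
  symbol = c("GENE1", "GENE1", "GENE2"),
  stringsAsFactors = FALSE
)

# Median expression of each row across samples
ids$median <- apply(dat, 1, median)
# Sort so that, within each symbol, the highest median comes first
ids <- ids[order(ids$symbol, ids$median, decreasing = TRUE), ]
# Keep only the first (highest-median) row per symbol
ids <- ids[!duplicated(ids$symbol), ]

dedup <- dat[ids$probe_id, ]
rownames(dedup) <- ids$symbol
print(dedup)
```

For GENE1 the row ENSG_B.1 is retained because its median is higher than that of ENSG_A.1.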

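This gives the expression matrix for normal heart tissue samples, ready for downstream analysis.

4. Gene expression across tissues

Compare S100A8 expression in heart, lung and blood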

organ_gtex=GTEx[,gsub('[.]','-',colnames(GTEx)) %in% a[a$SMTS %in% c('Heart','Blood','Lung'),1]]

rownames(organ_gtex)=GTEx[,1]

dat=organ_gtex

ids=GTEx[,1:2]

head(ids)

colnames(ids)=c('probe_id','symbol')

dat=dat[ids$probe_id,]

dat[1:4,1:4]

ids$median=apply(dat,1,median)

ids=ids[order(ids$symbol,ids$median,decreasing = T),]

ids=ids[!duplicated(ids$symbol),]

dat=dat[ids$probe_id,]

rownames(dat)=ids$symbol

dat[1:4,1:4]

organ_gtex=dat

#save(organ_gtex,file = 'organ_gtex_counts.Rdata')

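b=a[a$SMTS %in% c('Heart','Blood','Bone Marrow','Lung'),c(1,6)] # columns SAMPID and SMTS

colnames(dat) <- gsub('[.]','-',colnames(dat))

dat <- t(dat)

dat <- as.data.frame(dat)

# Look up each sample's tissue by ID rather than assuming the annotation rows are in the same order as dat
dat$group <- b$SMTS[match(rownames(dat), b$SAMPID)]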

library(dplyr)

d <- group_by(dat,group)

summarise(d,median=median(S100A8),n=n())

## A tibble: 3 x 3

# group median n

#

# 1 Blood 52504 929

# 2 Heart 730 861

# 3 Lung 10942. 578
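When tissue labels are attached to an expression table, the annotation rows are not guaranteed to be in the same order as the samples. Below is a minimal sketch of aligning the two by sample ID with match() before computing per-tissue medians; the sample IDs and values are made up, and base R's tapply() stands in for the dplyr summary used above.

```r
# Toy annotation table, deliberately in a different order from the expression table
annot <- data.frame(
  SAMPID = c("S-3", "S-1", "S-2"),
  SMTS = c("Lung", "Heart", "Blood"),
  stringsAsFactors = FALSE
)
# Toy expression table: one row per sample, one column per gene
expr <- data.frame(S100A8 = c(10, 500, 40), row.names = c("S-1", "S-2", "S-3"))

# Look up each sample's tissue by ID instead of relying on row order
expr$group <- annot$SMTS[match(rownames(expr), annot$SAMPID)]

# Median expression per tissue
med <- tapply(expr$S100A8, expr$group, median)
print(med)
```

Assigning annot$SMTS directly would have labeled sample S-1 as Lung; the match() lookup labels it Heart, as intended.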

References

GTEx database: a good companion for TCGA data mining

GTEx: a database linking genotype and gene expression

© Copyright belongs to the author. Reposting is prohibited; to request permission, contact the author by private message or comment.

TCGA + GTEx combined analysis done in 2 hours, or I lose if it takes a minute more - Zhihu

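TCGA + GTEx combined analysis done in 2 hours, or I lose if it takes a minute more

Author: 益加医 (a WeChat public account focused on sharing training videos on medical research and clinical skills)

Introduction

When mining the TCGA database, we usually find that the project includes very few sequencing results from normal tissue; for many patients there is no transcriptome data from their normal tissue at all. Take breast cancer: of roughly 1,200 transcriptome profiles, about 1,100 come from tumor tissue and only about 100 are normal controls. In that situation we need a way to enlarge the normal-tissue sample size, and since TCGA does not have the samples, we turn to other databases. The one strongly recommended here is the GTEx database, Genotype-Tissue Expression (GTEx).

1 Data preparation

The GTEx (Genotype-Tissue Expression) project studied more than 7,000 post-mortem samples from 449 donors who were healthy in life, covering 44 tissues (42 distinct tissue types): 31 solid-organ tissues, 10 brain subregions, whole blood, and 2 cell lines derived from donor blood and skin. The authors used these samples to study how gene expression differs across tissues and individuals.

The data can be downloaded directly from the GTEx website, but the site can be hard to reach, so GTEx and TCGA data can instead be downloaded from the UCSC Xena website. First click Launch Xena to reach the data page, then click DATA SETS at the top to open the dataset page. The dataset page lists several databases, including TCGA, TARGET and GTEx. Click GTEX to open the GTEx download page. The FPKM file and the phenotype file are needed. For the FPKM file, click TOIL RSEM fpkm to open its page and then click the link in the download field to start the download. The phenotype file is downloaded the same way.

For TCGA, the data page offers two entries, GDC TCGA and TCGA; GDC TCGA is generally chosen. Inside GDC TCGA the layout is similar to GTEX, except that there are two phenotype-type files, the phenotype file and the clinical data, both of which are used in later analysis. The expression file to download is again the FPKM file.

Once the downloads finish, the data can be organized. Start with the ID conversion for GTEx: decompress the downloaded archive, then run the conversion script with human.gtf as the annotation file. The method is similar to the earlier TCGA ID conversion, and the script is run from the command prompt. When it finishes, a new file, GTExSymbol, appears in the folder; this is the converted file.

Because GTEx sequenced samples from all tissues, the expression data for the tissue of interest must be extracted. The sample information comes from the sample file downloaded earlier. Decompress and open it, find the tissue of interest in the site column, and put the selected sample IDs into a new TXT file. Then run the script to extract the expression data for those samples. The script also reports the number of samples; note this number, because the differential analysis needs it later.

Next, organize the TCGA file. Decompress the TCGA FPKM archive and process it with perl. When the script finishes it reports the number of normal and tumor samples. Unlike for GTEx, the name of the file to be processed must be given after the perl script name. In the organized TCGA file, normal and tumor samples are separated.

Then convert the IDs of the TCGA data, in essentially the same way as the earlier TCGA conversion. Prepare the annotation file human.gtf and the script GTEx.symbol.pl, and run the script from the command prompt. This script has the same name as the GTEx ID conversion script but different content: the TCGA FPKM values do not need the +1 adjustment, whereas the original GTEx FPKM values had not had 1 added, so the GTEx conversion step applies FPKM+1.

Once both datasets are organized, GTEx and TCGA can be merged. There are two input files: the data extracted from GTEx and the converted TCGA file. The merge is done in R; change the path and run the script. When it finishes, a new merge file appears in the folder, containing the combined GTEx and TCGA data. With this file the downstream steps, such as differential analysis, can begin.

2 GTEx plots

Because GTEx holds expression data for many human tissues, we can summarize a gene's expression in each tissue and draw anatomical plots, box plots and similar figures.

First summarize expression per tissue. Prepare the GTEx phenotype file and the gene expression file. From the phenotype file, copy the patient ID, tissue and sex into a new txt file named site (the later scripts look for this file name). With the site file and the expression file ready, run the script to merge them and write out the files needed for plotting. An anatomical plot can only be drawn for one gene at a time, so a gene name must be supplied when merging. The gene must be present in the expression file and spelled exactly as it appears there, for example TP53. The script generates one file for males, one for females, and one merged expression-and-site file.

The anatomical plots and box plots can then be drawn. There are two anatomical plots, one male and one female. Change the working path in the script and run it to see TP53 expression across tissues. A box plot of TP53 expression across tissues is drawn the same way: change the path and run the script.

3 Differential analysis

The differential analysis uses the merge file prepared earlier. Take care to set the numbers of normal and tumor samples; the number of normal samples is the TCGA normal count plus the GTEx normal count. The output matches that of the earlier differential analyses: a differential expression file, a file of expression values for the differential genes, and so on. The usual expression heatmap can then be drawn by changing the path and sample numbers and running the script.

After the differential analysis, survival analysis can be run, producing results for all differential genes in one pass. First prepare the files it needs. The survival file is taken from the TCGA survival data downloaded earlier. Keep only survival status, survival time and sample ID, then edit the header and move survival time to the second column. For survival status, 1 means dead and 0 means alive. Copy the organized table into a new file, time.txt. The files for survival analysis are now ready.

Next, merge the clinical data with the expression data by running the GTEx.mergeExpTime.pl script from the command prompt. With the merged file, survival analysis can be run in batch on the differential genes to obtain their survival curves. Figures are generated only for genes with survival p<0.05, and a survival file listing the survival p-value of each differential gene is also generated.

4 Functional analysis

Start with ID conversion, using the same method shared earlier: paste the gene symbols and logFC values into a new txt file and run the R script. With the converted IDs, GO and KEGG enrichment analyses can be run and their enrichment plots generated. Two kinds of plotting script are provided. One outputs the usual bar charts and bubble charts, using the GO and KEGG scripts directly. The other outputs GO and KEGG circle plots. The circle plot script does not include the enrichment step but needs the enrichment results, so run the GO and KEGG enrichment first to generate the required files; the enrichment code in the bar chart and bubble chart script can be used for this.

This article is original content from the WeChat public account 益加医; to repost it, reply "转载" in the account's message box.

Edited 2020-08-10 16:23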
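The survival-table preparation described above (keep sample ID, survival time and survival status, with time in the second column) can be sketched in base R. The column names sample, OS, _PATIENT and OS.time are assumed to match the survival file downloaded from UCSC Xena, the output column names id, futime and fustat are hypothetical, and the rows are made up; check the header of your own file before reusing this.

```r
# Toy survival table in the assumed Xena layout (values are made up)
surv <- data.frame(
  sample = c("TCGA-AA-0001-01A", "TCGA-BB-0002-01A"),
  OS = c(1, 0),                      # 1 = dead, 0 = alive
  `_PATIENT` = c("TCGA-AA-0001", "TCGA-BB-0002"),
  OS.time = c(320, 1450),            # survival time in days
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Keep sample ID, time and status, with time as the second column
time.tab <- surv[, c("sample", "OS.time", "OS")]
colnames(time.tab) <- c("id", "futime", "fustat")

# Write the table as a tab-separated time.txt (here into a temporary folder)
out.file <- file.path(tempdir(), "time.txt")
write.table(time.tab, out.file, sep = "\t", quote = FALSE, row.names = FALSE)
print(time.tab)
```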

Genotype-Tissue Expression Project (GTEx)

National Human Genome Research Institute

An NIH Common Fund Project

The aim of the Genotype-Tissue Expression (GTEx) Project is to increase our understanding of how changes in our genes contribute to common human diseases, in order to improve health care for future generations.

GTEx Publishes Final Dataset (V8)

On Sept. 11, 2020, the final set of analyses from the GTEx Consortium was published in Science. The latest GTEx data release represents the largest atlas of human gene expression and catalog of trait loci to date.

Overview

Launched by the National Institutes of Health (NIH) in September 2010 (See: NIH launches Genotype-Tissue Expression project), GTEx will create a resource that researchers can use to study how inherited changes in genes lead to common diseases. It will establish a database and a tissue bank that can be used by many researchers around the world for future studies.

GTEx researchers are studying genes in different tissues obtained from many different people. Thus every donor's generous gift of tissues and medical information to the GTEx project makes possible research that will help improve our understanding of diseases, giving hope that we will find better ways to prevent, diagnose, treat and eventually cure these diseases in the future.

In addition, the GTEx project includes a study to explore the effectiveness of the GTEx donor consent process. We hope to better understand how participating in the study might affect the attitudes, beliefs and feelings of donors and the families of deceased donors using interviews and surveys of participants and their families. This study will help ensure that the consent process and other aspects of the project effectively address the concerns and expectations of participants in the study.

GTEx is a pioneering project that uses state-of-the-art protocols for obtaining and storing a large range of organs and tissues and for testing them in the lab. These tissues and organs are collected and stored through the National Cancer Institute's cancer Human Biobank initiative on behalf of GTEx. Until now, no project has analyzed genetic variation and expression in as many tissues in such a large population as planned for GTEx.

GTEx is funded through the NIH Common Fund, which supports innovative projects involving multiple NIH Institutes. GTEx is managed by the NIH Office of the Director, in partnership with the National Human Genome Research Institute, National Institute of Mental Health, National Cancer Institute, and numerous other NIH institutes. Additional information about the NIH Common Fund can be found at http://commonfund.nih.gov.

To learn more about the science behind the GTEx project, we invite you to visit: http://commonfund.nih.gov/GTEx.

Donors

The generosity of donors and donor families makes this project possible. The goal of GTEx is to increase our understanding of how changes in genes contribute to common human diseases. This knowledge will improve health care for future generations.

GTEx will create information that will be useful to many researchers, studying many different diseases. The gift of your tissue or your loved one's tissue may lead to research which could help improve treatment for many people in the future.

There are two types of donor groups that participate in the GTEx project: 1) organ and tissue donors, and 2) surgical donors.

Organ and tissue donors include individuals who have agreed to donate organs (like kidneys, heart, and liver) and/or tissues (like bone and cornea) for use as medical transplants after they die. Family members may also make the decision to give consent for organ or tissue donation after their loved one has passed on. These donors or their family members have the opportunity to indicate whether any organs or tissues ineligible for transplants may be donated to benefit research studies like GTEx. Donating to GTEx would not interfere with the use of the organ or tissues for transplantation, which takes priority. Compared to surgical donors, many more types of tissues can be obtained for research studies from organ and tissue donors. People who may not qualify to donate organs or tissue for transplants may still qualify to donate tissues to GTEx for research.

 

Surgical tissue donors include people who undergo certain kinds of surgery. If a surgery patient agrees ahead of time, tiny amounts of tissue removed during surgery, such as fat, skin, or muscle, can be donated for use in the GTEx project. Only tissue which needs to be removed for medical reasons can be donated to the GTEx project. Donating to the GTEx project will not cause any additional tissue to be removed.

 

GTEx Findings

It has been said that someone has "good genes" when they are particularly healthy, but what does that mean? How does understanding of genetics translate into better health? NIH designed the Genotype Tissue Expression (GTEx) project to start to answer this question. The project is looking at the differences in people's genes.

Genes are made up of DNA and DNA is made up of different pieces too. One of GTEx's goals is to identify the pieces of DNA that control how genes behave. These pieces of DNA are called expression quantitative trait loci or eQTLs. These eQTLs control the behavior of genes like a thermostat regulates the temperature of a home. GTEx studies found that the number of eQTLs varies from person to person and from tissue to tissue. Researchers also discovered eQTLs act in different ways. Some eQTLs may affect a set of genes in one tissue, while other eQTLs affect genes in many tissues.

The GTEx consortium has also built an eQTL web-browser (http://www.gtexportal.org/home/) to help visualize and discover new relationships between genes and the DNA that affects them. This website provides a resource for the many researchers who are exploring the human genome. Understanding how the eQTLs change gene behavior in different tissues can help us understand how diseases develop in people. This knowledge, in turn, may help us develop new therapies and treatments, improving our health overall.

Progress

As of December 2015, GTEx finished enrollment of the additional donors, for a total of 961 donors. Analysis of the samples and data will continue for another 18 months. Over 30,000 samples have been collected.

In fall of 2015, information on gene expression for over 450 donors was released to the scientific community through the database of Genotype and Phenotype (dbGaP). Additionally, the new version of the GTEx Genome Browser has been launched and features new visualization tools.

In 2014, the National Institutes of Health awarded eight new grants to researchers to use tissues donated to GTEx to explore how human genes are expressed and regulated in different tissues.

In 2020, the GTEx Consortium published its final set of studies analyzing genotype data from approximately 948 post-mortem donors and approximately 17,382 RNA-seq samples across 54 tissue sites and 2 cell lines, with adequate power to detect Expression Quantitative Trait Loci in 48 tissues.

Program Staff

Simona Volpi, Ph.D.

Program Director

Division of Genomic Medicine

Related Projects

Research Funding

Developmental Genotype-Tissue Expression (dGTEx)

Last updated: September 24, 2020

35. Step-by-step tutorial: downloading and organizing GTEx data (bilibili video)

Part 35 of 53 in the video series "TCGA and GEO data mining essentials / R in practice / bioinformatics analysis / data analysis / transcriptomics / bioinformatics", uploaded by 生信幻想家 on 2023-10-15.

Introduction to the GTEx database (1)

Author: HuaMD医学大数据 (sharing knowledge on medical big data). Series: Medical big data and its integrated analysis (4). Produced by Hua+医学大数据; please cite the source link when reposting. HuaPlusMD integrates multiple human and animal databases into a large curated database and offers integrated analysis of disease animal models and clinical big data: https://www.huaplusmd.com

Preface: The concept of "big data" has been around for a long time, but how much do we actually know about (medical) big data? This platform gives a systematic introduction to medical big data and shares integrated analyses, updated weekly. The content mainly covers large databases (gene and protein databases, etc.) and biobanks (UK Biobank, Finnish Biobanks, China Kadoorie Biobank, BioBank Japan, TCGA, GWAS Catalog, GTEx, etc.), disease animal model databases (such as GeneNetwork and BXD), combined use of large databases (such as Mendelian randomization), and omics data analysis. Other posts in the series: https://www.huaplusmd.com/knowledge

Every organ and tissue of an individual carries the same genes, so why does one become liver tissue that supports metabolism and another become muscle tissue that supports movement? The reason is that different human tissues express different genes. The GTEx project collects samples of different tissues from healthy individuals in order to understand tissue- and organ-specific gene expression in humans.

Starting with this post, we introduce the GTEx database, which is well worth studying in depth. GTEx stands for Genotype-Tissue Expression. The project was funded mainly by the Common Fund of the US National Institutes of Health (NIH) for ten consecutive years (2010-2019). (We very much hope that China will also support this kind of long-term, large-cohort basic research on humans, with non-sensitive data made open and subject to international peer review. It would benefit both the present and generations to come.)

The GTEx project studies tissue-specific gene expression and regulation in humans. Its final database (release 8, V8) includes:

· DNA data from 838 donors who were healthy before death, comprising Whole Genome Sequencing (WGS) and Whole Exome Sequencing (WES);
· 17,382 RNA-seq samples from nearly 1,000 individuals, covering 54 tissue and organ sites (currently the only resource in the world with such a complete collection of healthy human tissue samples);
· 2 cell lines derived from donor blood and skin.

Applications of the database:

· evaluating tissue-specific gene expression and regulation;
· supporting genome-wide association studies (GWAS);
· exploring the effect of genetic variation on complex diseases and traits.

Example application: using the GTEx database, researchers designed a statistical method called PrediXcan. The method predicts a gene's activity, or expression level, from genotype data, and then associates the predicted expression with observed disease traits in order to identify genes linked to disease. PrediXcan has successfully found specific genes associated with several diseases, including coronary artery disease, Crohn's disease, rheumatoid arthritis, type 1 diabetes, and bipolar disorder.

The project created the GTEx Portal (https://gtexportal.org/home/), which provides open-access data including gene expression, QTLs, and histology images.

The GTEx project also established its own biobank (https://gtexportal.org/home/biobank), which holds tissue specimens from about 960 donors who were healthy before death, including lung, brain, pancreas, skin, and more. Retained biospecimens can be requested if needed.

Representative papers published by the GTEx Consortium in Science and Nature:

· 2015: GTEx released its first-phase results, with three papers published together in Science:

The Genotype-Tissue Expression (GTEx) pilot analysis: Multitissue gene regulation in humans. The GTEx Consortium. Science. 8 May 2015. 348(6235):648-660. doi:10.1126/science. PMID: 25954001

The human transcriptome across tissues and individuals. Melé M, Ferreira PG, Reverter F, DeLuca DS, Monlong J et al. Science. 8 May 2015. 348(6235):660-665. doi:10.1126/science.aaa0355

Effect of predicted protein-truncating genetic variants on the human transcriptome. Rivas MA, Pirinen M, Conrad DF, Lek M, Tsang EK et al. Science. 8 May 2015. 348(6235):666-669. doi:10.1126/science.1261877

· 2017: GTEx released further results, with four papers published together in Nature:

Genetic effects on gene expression across human tissues. The GTEx Consortium. Nature. 12 Oct 2017. 550:204-213. Epub 11 Oct 2017. doi:10.1038/nature24277

The impact of rare variation on gene expression across tissues. Li X, Kim Y, Tsang EK, Davis JR, Damani FN et al. Nature. 12 Oct 2017. 550:239-243. Epub 11 Oct 2017. doi:10.1038/nature24267

Landscape of X chromosome inactivation across human tissues. Tukiainen T, Villani AC, Yen A, Rivas MA, Marshall JL et al. Nature. 12 Oct 2017. 550:244-248. Epub 11 Oct 2017. doi:10.1038/nature24265

Dynamic landscape and regulation of RNA editing in mammals. Tan MH, Li Q, Shanmugam R, Piskol R, Kohler J et al. Nature. 12 Oct 2017. 550:249-254. Epub 11 Oct 2017. doi:10.1038/nature24041

· 2019-2022: GTEx continued to publish results, with seven papers in Science:

2022

Single-nucleus cross-tissue molecular reference maps toward understanding disease gene function. Eraslan G, et al. Science. 376 (abl4290), 13 May 2022. doi:10.1126/science.abl4290

2020

The GTEx Consortium atlas of genetic regulatory effects across human tissues. The GTEx Consortium. Science. 369 (1318-1330), 10 Sep 2020. doi:10.1126/science.aaz1776

Cell type specific genetic regulation of gene expression across human tissues. Kim-Hellmuth* S, Aguet* F, Oliva M, Muñoz-Aguirre M, Kasela S, et al. Science. 369 (eaaz8528), 10 Sep 2020. doi:10.1126/science.aaz8528

Transcriptomic signatures across human tissues identify functional rare genetic variation. Ferraro* NM, Strober* BJ, Einson J, Abell NS, Aguet F, et al. Science. 369 (aaz5900), 10 Sep 2020. doi:10.1126/science.aaz5900

Determinants of telomere length across human tissues. Demanelis K, Jasmine F, Chen LS, Chernoff M, Tong L, et al. Science. 369 (aaz6876), 10 Sep 2020. doi:10.1126/science.aaz6876

The impact of sex on gene expression across human tissues. Oliva* M, Muñoz-Aguirre* M, Kim-Hellmuth* S, Wucher V, Gewirtz ADH, et al. Science. 369 (aba3066), 10 Sep 2020. doi:10.1126/science.aba3066

2019

RNA sequence analysis reveals macroscopic somatic clonal expansion across normal tissues. Yizhak K, Aguet F, Kim J, Hess JM, Kübler K et al. Science. 07 June 2019. 364(6444). doi:10.1126/science.aaw0726

If you can access YouTube, see the short introduction to GTEx by Prof. Eric Lander (founding director, Broad Institute) and others: https://www.youtube.com/watch?v=PhK186A7Ryo

Original link: https://www.huaplusmd.com/knowledge

Published 2022-10-24 04:27
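The two-stage idea behind PrediXcan described above (predict expression from genotype, then associate the prediction with a trait) can be illustrated with a deliberately simplified sketch. This is not the PrediXcan software or its API. The SNP identifiers, weights, dosages, and trait values below are made up, the function names are hypothetical, and a plain Pearson correlation stands in for the association test.

```python
import math


def predict_expression(dosages, weights):
    """Stage 1: predicted expression as a weighted sum of allele dosages.

    dosages maps SNP id -> allele dosage (0, 1, or 2) for one person;
    weights maps SNP id -> per-allele effect on the gene's expression.
    SNPs missing from dosages contribute nothing.
    """
    return sum(w * dosages.get(snp, 0.0) for snp, w in weights.items())


def pearson(x, y):
    """Stage 2 stand-in: Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical weights for one gene and genotypes for four people.
weights = {"snp_a": 0.5, "snp_b": -0.2}
cohort = [
    {"snp_a": 0, "snp_b": 2},
    {"snp_a": 1, "snp_b": 1},
    {"snp_a": 2, "snp_b": 1},
    {"snp_a": 2, "snp_b": 0},
]
trait = [1.0, 2.1, 2.9, 4.2]  # made-up quantitative trait values

predicted = [predict_expression(person, weights) for person in cohort]
association = pearson(predicted, trait)
```

In this sketch a strong correlation between the predicted expression and the trait would flag the gene as a candidate. A real analysis would use weights trained on reference data with both genotype and expression measurements, and a regression with a significance test rather than a bare correlation.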

GTEx: a database linking genotype and gene expression

Author: 生信修炼手册. Published 2019-12-19 10:50:50 in the Tencent Cloud Developer Community; originally published 2019-08-13 on the 生信修炼手册 WeChat public account.

GTEx stands for Genotype-Tissue Expression. The project performed both transcriptome sequencing and genotyping on samples from multiple human tissues and organs, and built a database of tissue-specific gene expression and regulation. The website is https://gtexportal.org/home/

[Figure: tissue types and number of samples per tissue]

Three main analyses were run on all samples.

1. RNA seq

PolyA+ libraries were built with the Illumina TruSeq kit and sequenced on HiSeq 2000/2500. Reads were aligned with STAR, using the GENCODE v19 GTF file as the reference annotation. Quantification was done at three levels:

gene-level: RNA-SeQC, reporting both raw counts and TPM per gene
exon-level: raw counts per exon
transcript-level: RSEM

2. genotype

Samples were genotyped by WGS following the GATK germline variant calling workflow, with these steps:

bwa-mem alignment
Picard MarkDuplicates
BQSR
indel realignment
HaplotypeCaller

3. eQTL

cis-eQTL analysis was run with FastQTL, associating genotype with gene expression.

Gene expression and eQTL results can be browsed on the website. Taking TP53 as an example, each gene has expression at three levels:

Isoform Expression
Exon Expression
Junction Expression

These correspond to expression of transcripts, exons, and splice junctions respectively. Expression across tissues is displayed as a heatmap.

[Figure: expression heatmap across tissues]

The gene structure is also visualized.

[Figure: gene structure]

[Figure: eQTL results]

eQTL results come with two kinds of visualization. The first is a violin plot within a single tissue, the eQTL violin plot.

[Figure: eQTL violin plot]

The second compares multiple tissues, the Multi-tissue eQTL plot.

[Figure: Multi-tissue eQTL plot]

All analysis results can be downloaded from the website. GTEx is more than a database of gene expression in normal tissues; its eQTL analysis strategy is also well worth learning from.
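Gene-level quantification above is reported both as raw counts and as TPM. As a reminder of how the two relate, here is a minimal sketch of the standard TPM definition: divide each count by the gene length in kilobases, then rescale so the values sum to one million. It is not RNA-SeQC's implementation, the function name is hypothetical, and the counts and lengths are made-up numbers.

```python
def counts_to_tpm(counts, lengths_bp):
    """Convert raw read counts to TPM (transcripts per million).

    counts maps gene -> raw read count for one sample;
    lengths_bp maps gene -> gene (or effective) length in base pairs.
    """
    # Step 1: length-normalize, giving reads per kilobase.
    rates = {gene: count / (lengths_bp[gene] / 1000.0)
             for gene, count in counts.items()}

    # Step 2: rescale so the sample sums to one million.
    total = sum(rates.values())
    if total == 0:
        return {gene: 0.0 for gene in rates}
    return {gene: rate / total * 1e6 for gene, rate in rates.items()}


# Made-up example: the second gene has 3x the reads at half the length,
# so its TPM is 6x that of the first gene.
example_counts = {"GENE_A": 100, "GENE_B": 300}
example_lengths = {"GENE_A": 2000, "GENE_B": 1000}
example_tpm = counts_to_tpm(example_counts, example_lengths)
```

Because every sample is rescaled to the same total, TPM values are comparable across samples in a way raw counts are not, which is why the portal's cross-tissue plots use them.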