The human tissue specific proteome

All approximately 20000 human genes have been classified according to their expression across all major organs and tissue types in the human body. Few genes are strictly tissue specific; however, genes with elevated expression in particular tissues are an interesting starting point for understanding their biology and function, as well as the underlying mechanisms of disease.

A total of 11069 genes are elevated in at least one of the analyzed tissues, of which:

  • 2845 are tissue enriched genes
  • 1637 are group enriched genes
  • 6587 are tissue enhanced genes


Transcriptome analysis of all major organs and tissue types in the human body can be visualized with regard to the specificity and distribution of transcribed mRNA molecules across all 19670 putative protein coding genes (Figure 1). Specificity illustrates the number of genes with elevated or non-elevated expression in a particular tissue compared to other tissues. The analysis identifies 11069 genes with elevated expression and 8385 genes with low tissue specificity (read more in The housekeeping proteome). Elevated expression is divided into three subcategories:

  • Tissue enriched: At least four-fold higher mRNA level in a particular tissue compared to any other tissue.
  • Group enriched: At least four-fold higher average mRNA level in a group of 2-5 tissues compared to any other tissue.
  • Tissue enhanced: At least four-fold higher mRNA level in a particular tissue compared to the average level in all other tissues.
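The three definitions above can be sketched as a small classifier. This is a minimal illustration, not the pipeline used to produce the numbers on this page: the function name, the input format (a dictionary of NX values per tissue), the order in which the rules are tested, and the use of the NX≥1 detection cutoff as a guard are all assumptions made for the example.

```python
from statistics import mean

FOLD = 4.0          # four-fold cutoff from the definitions above
DETECTION_NX = 1.0  # NX >= 1 detection cutoff; applying it as a guard is an assumption


def classify_specificity(nx_by_tissue):
    """Return a specificity label for one gene, given {tissue: NX value}."""
    if len(nx_by_tissue) < 2:
        raise ValueError("need expression values for at least two tissues")

    # Expression values sorted from highest to lowest.
    values = sorted(nx_by_tissue.values(), reverse=True)

    # Guard (assumption): a gene below the cutoff everywhere is not classified.
    if values[0] < DETECTION_NX:
        return "not detected"

    # Tissue enriched: the top tissue is >= 4x any other tissue.
    if values[0] >= FOLD * values[1]:
        return "tissue enriched"

    # Group enriched: the mean of the top k tissues (k = 2..5) is >= 4x
    # the highest tissue outside that group.
    for k in range(2, 6):
        if k >= len(values):
            break
        if mean(values[:k]) >= FOLD * values[k]:
            return "group enriched"

    # Tissue enhanced: the top tissue is >= 4x the mean of all other tissues.
    if values[0] >= FOLD * mean(values[1:]):
        return "tissue enhanced"

    return "low tissue specificity"


if __name__ == "__main__":
    example = {"liver": 100.0, "kidney": 5.0, "lung": 2.0}
    print(classify_specificity(example))
```

Only the highest-expressed tissue (or the top k tissues) needs to be tested in each rule, because if any tissue or group satisfies a four-fold criterion, the highest-expressed one does as well.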

Distribution, on the other hand, visualizes how many genes have, or do not have, detectable levels (NX≥1) of transcribed mRNA molecules. As shown in Table 1, all elevated genes are categorized as:

  • Detected in single: Detected in a single tissue
  • Detected in some: Detected in more than one but fewer than one third of tissues
  • Detected in many: Detected in at least one third of tissues but not all
  • Detected in all: Detected in all tissues
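The distribution categories above reduce to counting the tissues in which a gene reaches the detection cutoff. The sketch below assumes a dictionary of NX values per tissue; the function name and the extra "not detected" label for genes below the cutoff in every tissue are assumptions made for the example.

```python
def classify_distribution(nx_by_tissue, cutoff=1.0):
    """Return a distribution label for one gene, given {tissue: NX value}."""
    n_total = len(nx_by_tissue)
    # Count tissues where the gene is detected (NX >= cutoff).
    n_detected = sum(1 for nx in nx_by_tissue.values() if nx >= cutoff)

    if n_detected == 0:
        return "not detected"  # assumed label, not one of the four listed above
    if n_detected == 1:
        return "detected in single"
    if n_detected == n_total:
        return "detected in all"
    # Fewer than one third of tissues; integer comparison avoids rounding issues.
    if 3 * n_detected < n_total:
        return "detected in some"
    return "detected in many"


if __name__ == "__main__":
    # With 37 tissues, 2-12 detected tissues count as "some", 13-36 as "many".
    tissues = [f"tissue_{i}" for i in range(37)]
    gene = {t: (5.0 if i < 12 else 0.0) for i, t in enumerate(tissues)}
    print(classify_distribution(gene))
```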

A. Specificity

B. Distribution

Figure 1. (A) The distribution of all genes across the five categories based on transcript specificity in all 37 analyzed tissues. (B) The distribution of all genes across the six categories based on transcript detection (NX≥1) in all 37 analyzed tissues.


Table 1. Number of genes in the subdivided categories of elevated expression in all 37 analyzed tissues.

Distribution in the 37 tissues

Specificity        Detected in single   Detected in some   Detected in many   Detected in all   Total
Tissue enriched    586                  1363               739                157               2845
Group enriched     0                    922                643                72                1637
Tissue enhanced    149                  1101               3371               1966              6587
Total              735                  3386               4753               2195              11069
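Table 1 can be checked for internal consistency by reading each row as four cells followed by its total. The short sketch below uses the one split of each row that agrees with the row totals, and recomputes the row, column and grand totals; the variable names are illustrative only.

```python
# Cells per specificity category, in column order:
# detected in single, some, many, all.
table1 = {
    "Tissue enriched": [586, 1363, 739, 157],
    "Group enriched": [0, 922, 643, 72],
    "Tissue enhanced": [149, 1101, 3371, 1966],
}

# Sum across each row, down each column, and over the whole table.
row_totals = {name: sum(cells) for name, cells in table1.items()}
column_totals = [sum(column) for column in zip(*table1.values())]
grand_total = sum(row_totals.values())

print(row_totals)
print(column_totals)
print(grand_total)
```

The recomputed margins match the Total row and Total column of the table, and the grand total matches the 11069 elevated genes reported in the text.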

The number of tissue elevated genes varies widely between the analyzed tissue types (see Table 2 below). Testis shows the highest number of tissue enriched genes (n=950), followed by the brain (n=488) and liver (n=242). When all tissue elevated genes are taken into consideration, however, the brain has a slightly higher number than the testis (2587 versus 2274). The large number of enriched genes in testis is considered to be due to the highly specialized processes occurring during spermatogenesis. Many of these genes likely have a shared expression with oocytes in the female ovaries. Oocytes are, however, difficult to analyze because of the complex kinetics of female germ cell development, including the first rounds of meiosis, which in females occur at the embryonic stage. As expected, tissues that have similar functions and morphology often have higher numbers of shared group enriched genes.

In addition to previously known proteins, the analysis also identified a large number of genes with tissue elevated expression patterns that were previously poorly characterized and with no or only scarce evidence of existence at protein level. The combined RNA and antibody-based profiling can thus be used to confirm the physiological functions of such protein coding genes lacking previous annotation. These proteins are interesting starting points for further in-depth studies to gain a better understanding of the molecular mechanisms of the various cellular phenotypes that define the function of each respective tissue and organ.


Table 2. Tissue elevated genes.

Tissue   Tissue enriched   Group enriched   Tissue enhanced   Total elevated
Brain 488 496 1603 2587
Retina 87 79 144 310
Pituitary gland 26 111 216 353
Thyroid gland 13 32 154 199
Parathyroid gland 22 34 168 224
Adrenal gland 9 84 135 228
Lung 13 61 165 239
Salivary gland 42 78 199 319
Esophagus 5 57 249 311
Tongue 11 165 199 375
Stomach 17 52 90 159
Intestine 122 200 442 764
Liver 242 177 517 936
Gallbladder 3 41 117 161
Pancreas 64 93 265 422
Kidney 53 131 229 413
Urinary bladder 1 18 80 99
Testis 950 399 925 2274
Epididymis 77 104 231 412
Prostate 11 25 84 120
Seminal vesicle 2 36 142 180
Ductus deferens 0 34 81 115
Breast 19 51 117 187
Vagina 3 17 71 91
Cervix, uterine 0 27 104 131
Endometrium 2 14 69 85
Fallopian tube 6 75 231 312
Ovary 2 26 145 173
Placenta 91 99 304 494
Heart muscle 29 133 225 387
Skeletal muscle 111 202 594 907
Smooth muscle 0 14 98 112
Adipose tissue 2 50 160 212
Skin 113 125 309 547
Bone marrow 29 135 370 534
Lymphoid tissue 123 333 963 1419
Blood 57 397 944 1398
Total 2845 1637 6587 11069
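The ranking described in the text can be reproduced from Table 2. The sketch below copies a handful of rows from the table (not all 37 tissues) and sorts them two ways; the selection of rows and the variable names are choices made for this example.

```python
# (tissue enriched, group enriched, tissue enhanced) for selected rows of Table 2.
rows = {
    "Brain": (488, 496, 1603),
    "Liver": (242, 177, 517),
    "Testis": (950, 399, 925),
    "Skeletal muscle": (111, 202, 594),
    "Lymphoid tissue": (123, 333, 963),
    "Blood": (57, 397, 944),
}

# Total elevated genes per tissue is the sum of the three categories.
total_elevated = {tissue: sum(counts) for tissue, counts in rows.items()}

# Rank by tissue enriched genes only, then by all elevated genes.
by_enriched = sorted(rows, key=lambda t: rows[t][0], reverse=True)
by_total = sorted(rows, key=lambda t: total_elevated[t], reverse=True)

print(by_enriched[:3])
print(by_total[:3])
```

Among these rows, testis ranks first by tissue enriched genes, while brain ranks first once group enriched and tissue enhanced genes are included.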


Tissue elevated genes

The comprehensive analysis presented here has identified 11069 human genes that display a tissue elevated expression pattern across the human body. By combining the analysis with antibody-based protein profiling using immunohistochemistry, the location of the corresponding proteins can be determined at a cellular and subcellular level. Examples of protein expression patterns of tissue elevated genes are presented below.

Brain

  • GFAP (Glial fibrillary acidic protein) - astrocyte intermediate filament protein
  • MBP (Myelin basic protein) - a major constituent of the myelin sheath
  • ELAVL3 (ELAV like RNA binding protein 3) - neural-specific RNA-binding protein


GFAP - cerebral cortex

MBP - hippocampus

ELAVL3 - cerebral cortex

Retina

  • RHO (Rhodopsin) - involved in phototransduction in rod photoreceptors
  • ARR3 (Arrestin 3) - involved in phototransduction in cone photoreceptors


RHO - retina

ARR3 - retina

Endocrine tissues

  • FSHB (Follicle stimulating hormone beta subunit) - hormone inducing egg and sperm production
  • TG (Thyroglobulin) - substrate for the synthesis of thyroid hormones
  • HSD3B2 (Hydroxy-delta-5-steroid dehydrogenase, 3 beta- and steroid delta-isomerase 2) - involved in the biosynthesis of hormonal steroids


FSHB - pituitary gland

TG - thyroid gland

HSD3B2 - adrenal gland

Lung

  • SFTPA1 (Surfactant protein A1) - involved in surfactant homeostasis and the defense against respiratory pathogens
  • SFTPB (Surfactant protein B) - involved in surfactant homeostasis and the defense against respiratory pathogens


SFTPA1 - lung

SFTPB - lung

Proximal digestive tract

  • STATH (Statherin) - inhibits precipitation of calcium phosphate salts in the saliva
  • KRT4 (Keratin 4) - expressed in differentiated layers of mucosal and esophageal epithelia


STATH - salivary gland

KRT4 - esophagus

Gastrointestinal tract

  • PGA4 (Pepsinogen 4, group I (pepsinogen A)) - enzyme for digestion of dietary proteins
  • DEFA5 (Defensin alpha 5) - antimicrobial and cytotoxic peptide involved in host defense
  • KRT20 (Keratin 20) - maintains keratin filament organization in intestinal epithelia


PGA4 - stomach

DEFA5 - duodenum

KRT20 - colon

Liver & gallbladder

  • ALB (Albumin) - plasma protein
  • CYP2A13 (Cytochrome P450 member) - involved in drug metabolism, cholesterol and steroid synthesis
  • CHST4 (Carbohydrate sulfotransferase 4) - enzyme involved in the modification of glycan structures


ALB - liver

CYP2A13 - liver

CHST4 - gallbladder

Pancreas

  • AMY2A (Amylase, alpha 2A) - an enzyme that digests carbohydrates, secreted by exocrine cells
  • INS (Insulin) - involved in lowering of blood glucose, secreted by beta cells
  • GCG (Glucagon) - involved in the elevation of blood glucose, secreted by alpha cells


AMY2A - pancreas

INS - pancreas

GCG - pancreas

Kidney & urinary bladder

  • SLC22A13 (Solute carrier family 22 member 13) - membrane-bound organic anion transporter
  • NPHS2 (Podocin) - involved in the regulation of glomerular permeability
  • UPK2 (Uroplakin 2) - membrane protein preventing cell rupture during bladder distention


SLC22A13 - kidney

NPHS2 - kidney

UPK2 - urinary bladder

Male tissues

  • DMRT1 (Doublesex- and mab-3-related transcription factor 1) - involved in meiosis
  • SEMG1 (Semenogelin I) - predominant protein in semen
  • KLK3 (Kallikrein related peptidase 3) - also called PSA, used clinically to diagnose prostate cancer


DMRT1 - testis

SEMG1 - seminal vesicle

KLK3 - prostate

Female tissues

  • CSH1 (Chorionic somatomammotropin hormone 1) - hormone important for growth control during pregnancy
  • OVGP1 (Oviductal glycoprotein 1) - mucus protein important in mucociliary transport of the fertilized ovum
  • MUM1L1 (MUM1 like 1) - a protein with a mutated melanoma-associated antigen 1 domain, associated with cancer


CSH1 - placenta

OVGP1 - fallopian tube

MUM1L1 - ovary

Muscle tissues

  • TNNI3 (Troponin I3, cardiac type) - mediates muscle relaxation
  • TNNT2 (Troponin T2, cardiac type) - mediates muscle contraction
  • MYH7 (Myosin heavy chain 7) - expressed in slow type I muscle fibers


TNNI3 - heart muscle

TNNT2 - heart muscle

MYH7 - skeletal muscle

Adipose & soft tissue

  • FABP4 (Fatty acid binding protein 4) - involved in fatty acid uptake, transport, and metabolism
  • PLIN1 (Perilipin 1) - coats lipid storage droplets in adipocytes


FABP4 - adipose tissue (soft tissue)

PLIN1 - adipose tissue (breast)

Skin

  • KRT1 (Keratin 1) - involved in squamous differentiation and skin barrier function
  • KRT27 (Keratin 27) - plays a role in hair formation
  • CASP14 (Caspase 14) - involved in keratinocyte differentiation and cornification


KRT1 - skin

KRT27 - hair

CASP14 - skin

Bone marrow & lymphoid tissues

  • MPO (Myeloperoxidase) - major component of neutrophil azurophilic granules
  • CD8B (CD8b molecule) - plays a critical role in thymic selection of CD8+ T-cells
  • CD22 (CD22 molecule) - mediates interactions between B-cells


MPO - bone marrow

CD8B - thymus

CD22 - lymph node


Group enriched proteins

The 1637 genes identified as group enriched reflect genes with shared expression in 2-5 tissues. Many of these genes encode proteins that are expressed in cell types that have similar functions across several tissues, such as proteins expressed in immune cells (present in many organs but especially lymphoid tissues and the gastrointestinal tract), proteins involved in squamous cell differentiation (e.g. cervix, esophagus and skin), glandular cell function in the gastrointestinal tract (duodenum, small intestine and colon) or cilia movement (testis and fallopian tube). The schematic network plot below shows how group enriched genes are distributed among the different tissues.

Figure 2. An interactive network plot of the tissue enriched and group enriched genes connected to their respective enriched tissues (grey circles). Red nodes represent the number of tissue enriched genes and orange nodes represent the number of genes that are group enriched. The sizes of the red and orange nodes are related to the number of genes displayed within the node. Each node is clickable and results in a list of all enriched genes connected to the highlighted edges. The network is limited to group enriched genes in combinations of up to 3 tissues, but the resulting lists show the complete set of group enriched genes in the particular tissue.


Immune cells can be found in both lymphoid organs and organs infiltrated by immune cells, such as the intestine. Consequently, genes important for immune cell function are often enriched in both lymphoid tissues and the intestine. One such gene is MS4A1, which encodes CD20, an activated-glycosylated phosphoprotein expressed on the surface of B-cells beginning at the pro-B phase with progressively increasing concentrations until maturity.


MS4A1 - lymph node

MS4A1 - appendix

MS4A1 - small intestine

Squamous epithelia are found in many parts of the body as dry skin or wet mucosa, acting as a robust barrier against various chemical and mechanical stresses. DSC3 (Desmocollin 3), which encodes a protein important in cell-cell junctions and cellular adhesion, is group enriched in squamous epithelia, such as the esophagus and skin, as exemplified below.


DSC3 - esophagus

DSC3 - skin

Mucus serves several transport and barrier functions in the body. The function of mucus in the salivary gland is related to food and pathogens, while mucus in the cervix is involved in, for example, transportation and blockage of sperm during sexual reproduction. MUC16 is a mucus component and is group enriched in both the mucus-producing salivary gland and cervix.


MUC16 - salivary gland

MUC16 - cervix

The fallopian tube shares many elevated genes with testis. The common denominator is the utilization of cilia, or the structurally similar flagellum, for essential organ functions. DNAI2, a dynein protein, is a motor protein component of the motile cilia of multiciliated cells as well as of the flagellum (tail) of the sperm. By pulling on the microtubule structure of the cilium or flagellum, the motor protein creates motion, which in the case of the sperm results in sperm motility. In the immunohistochemistry images below, expression of DNAI2 can be seen in a subset of cilia in the fallopian tube (left and middle image), as well as in the flagellum of spermatids and the cytoplasm of differentiating spermatocytes (right image).


DNAI2 - fallopian tube

DNAI2 - fallopian tube ciliated cells

DNAI2 - testis

