This page contains links to custom annotation tracks contributed by the UCSC
Genome Bioinformatics group and by the research community. Click on a track to
display it in the UCSC Genome Browser. To access custom annotation tracks built
on archived assemblies, see the Genome Browser
archives. Please check the Genome Browser standard track set for
additional contributed annotation tracks.
For information on how to create a custom annotation track, see
Displaying Your Own Annotations in the Genome Browser. If you would
like to submit your own custom tracks to this list, contact
genome@soe.ucsc.edu.
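As a rough illustration of what a simple custom annotation track can look like, the sketch below builds a small BED-style track as text: one `track` line followed by tab-separated features. The function name, track name, description, and coordinates are all hypothetical and chosen only for illustration; the documentation referenced above remains the authority on the supported formats and track-line attributes.

```python
def format_custom_track(name, description, features):
    """Return custom-track text: one 'track' line followed by BED lines.

    `features` is a list of (chrom, start, end, label) tuples, where start
    is 0-based and end is exclusive, as in the BED format.
    """
    lines = [f'track name="{name}" description="{description}" visibility=2']
    for chrom, start, end, label in features:
        if not 0 <= start < end:
            raise ValueError(f"invalid interval for {label}: {start}-{end}")
        lines.append(f"{chrom}\t{start}\t{end}\t{label}")
    return "\n".join(lines) + "\n"


# Hypothetical regions, used only to show the layout of the output.
example_track = format_custom_track(
    "myRegions",
    "Hypothetical example regions",
    [("chr1", 1000, 2000, "regionA"), ("chr1", 5000, 5600, "regionB")],
)
print(example_track)
```

The resulting text could be saved to a file or pasted into the browser's custom track input; which attributes are appropriate on the `track` line depends on the data being displayed.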
Human Genome
Phased haplotypes
of 'Max Planck One' (MP1) genome in hg18 as described in Suk et al.
A comprehensively molecular haplotype-resolved genome of a European individual
Genome Res 2011. RefSeq genes are shown in the first track for reference purposes.
The second track shows the extent of each molecularly phased segment within the genome of MP1.
The two haplotypes of MP1 are shown in two separate tracks (MP1_haplotype_1 and MP1_haplotype_2)
and are colored by base. Phased indels are also included in these haplotypes. All SNPs from MP1
are shown in the fifth track (MP1_all_SNPs). These SNPs are annotated with their dbSNP rs numbers
(or are annotated as novel). Non-synonymous SNPs are colored bright pink if they cause a
potentially damaging mutation and dark pink if they are not predicted to be damaging.
Thanks to the Max Planck Institute for Molecular Genetics for contributing these data.
DNA binding sites in hg18 for nuclear receptor HNF4alpha (NR2A1). The
PBM track
shows in vitro validated sites as determined by protein binding
microarrays (PBMs) (number after sequence indicates relative binding score). The
SVM track
shows predicted sites by support vector machine (SVM) analysis (number after sequence indicates
predicted relative binding score). For more information, see Bolotin E et al. in
Integrated approach for the identification of human hepatocyte nuclear factor 4α
target genes using protein binding microarrays. Hepatology. 2010 Feb;51(2):642-653.
Thanks to the Sladek lab,
University of California Riverside for contributing these data.
Transcribed ultraconserved regions (T-UCRs) reblatted to hg18.
The first track shows intragenic T-UCRs (red); the second one displays
intergenic T-UCRs (blue) (intragenic and intergenic relative to the RefSeq
Genes track). For more information, see
Mestdagh, P. et al. An integrative genomics screen uncovers ncRNA T-UCR functions in
neuroblastoma tumours. Oncogene.
2010 Apr 12. [Epub ahead of print]
Thanks to Erik Fredlund, Pieter Mestdagh, Filip
Pattyn and Jo Vandesompele of the Center for Medical Genetics, Ghent University
Hospital, Ghent, Belgium for contributing these tracks.
Vervet monkey gene expression data (hg18) providing
mean expression differences for 8-16 samples per tissue type (publication
pending). See the
UCLA vervet gene expression atlas project website for more
information. Thanks to Dmitriy Skvortsov from the laboratory of Stanley F.
Nelson in the Department of Human Genetics and Psychiatry at the David Geffen
School of Medicine, UCLA, for contributing this track. The work is a
collaboration with Zugen Chen, Barry Merriman, Lynn Fairbanks, Roger Woods,
and Nelson Freimer.
Nucleosome Exclusion Prediction data sets (hg18) accompanying the paper
Radwan A et al.
Prediction and analysis of nucleosome exclusion regions in the human genome.
BMC Genomics. 2008 Apr 22;9(1):186.
View the Nucleosome Regions tracks to
see the whole genome annotation for nucleosome exclusion regions. View the
Nucleosome Scores tracks to see the nucleosome exclusion scores which were
calculated individually for each nucleotide. This annotation was contributed
by Ahmed Radwan, Akmal Younis, Peter Luykx, and Sawsan Khuri,
at the University of Miami, Miami, FL, USA. Contact Sawsan Khuri at
skhuri@med.miami.edu.
Click on the chromosome you wish to display.
Nucleosome Regions: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, X, Y, M.
Nucleosome Scores: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, X, Y, M.
Results of a
genome-wide association study of bipolar disorder (hg17)
published in Baum AE et al.
A genome-wide association study implicates diacylglycerol kinase
eta (DGKH) and several other genes in the etiology of bipolar disorder.
Mol Psychiatry. 2007 May 8; [Epub ahead of print].
The track shows the results of a two-stage study performed using the
Illumina HumanHap 550K chip. SNPs that replicated in both of two
independent case-control samples are shown, filtered for p-value and odds ratio.
Many thanks for this contribution to Amber Baum, Francis McMahon, and the
Unit on the Genetic
Basis of Mood and Anxiety Disorders, Mood and Anxiety Disorders
Program, U.S. Department of Health and Human Services, National Institute of
Mental Health, National Institutes of Health, Bethesda, MD, USA, and the
Central
Institute for Mental Health, Mannheim, Germany.
Compare data from locus-specific databases with the genotypic and functional
data in the Genome Browser using PhenCode, which consolidates variants from many
curated locus-specific databases and one genome-wide database. Click
here to access the PhenCode query page that lets you select
and display a filtered set of locus variants data in the Genome Browser.
Thanks to Belinda Giardine,
Ross Hardison, Webb Miller, and Cathy Riemer at the
Center for Comparative
Genomics and Bioinformatics, Penn State University, University Park,
PA, USA, for contributing these data.
DISCLAIMER: PhenCode is intended for research purposes only. Although the
data are freely available to all, users should treat the reported mutations with
extreme caution in clinical settings or for any diagnostic or population
screening purpose. This information requires expertise to interpret properly;
clinical diagnosis and/or treatment recommendations should be made only by
medical professionals.
HOX microarray expression data from John Rinn et al.
(hg18
and hg17),
as described in the publication Rinn JL, et al.
Functional demarcation of active and silent chromatin domains in
human HOX loci by noncoding RNAs. Cell. 2007 Jun 29;129(7):1311-23.
See the track description page for more information about the data methods and
verification techniques used. Thanks to John Rinn in the
Chang Lab at Stanford University for contributing these data.
Tracks showing increases and decreases in copy number variants across
five hominoid species (human, bonobo, chimpanzee, gorilla, and
orangutan)
(hg17,
hg16).
We would like to thank the University of Colorado Health
Sciences Center, Dr. Jim Sikela, and Michael Cox for their
contributions. For more information, see
Fortna, A., Kim, Y., MacLaren, E., Marshall, K., Hahn, G., Meltesen,
L., Brenton, M., Hink, R., Burgers, S., Hernandez-Boussard, T.,
Karimpour-Fard, A., Glueck, D., McGavran, L., Berry, R., Pollack, J.R.
and Sikela, J.M. Lineage-specific gene duplication and loss in human
and great ape evolution. PLoS Biology Jul 2(7):E207, 2004.
The Intronic EST hotspots track (hg17) highlights non-coding
genomic regions that have a high degree of EST coverage (EST hotspot) within
"consensus intronic regions", i.e. regions that are intronic in all
RefSeq transcript variants of a given gene. These hotspots can help
identify novel coding and non-coding elements within the genome.
This annotation was contributed by Xitong Li and Christina Zheng at
Genomic Health, Inc.
Tracks providing CpG island strength predictions and mapping of bona fide
CpG islands for the human genome (hg17/hg18). The tracks are based on
large-scale epigenome predictions, which give rise to an improved and
quantitative annotation of CpG islands. Additional information on these tracks
is available from the
supplementary website and from the corresponding paper
Bock, C. et al.
CpG island mapping by epigenome prediction to appear in
PLoS Comput Biol. For prioritization of candidate regions,
the quantitative CpG island strength predictions are recommended
(hg17/hg18).
For genome annotation, three maps of bona fide CpG islands are provided: (i) a
highly specific map
(hg17/hg18), (ii) a balanced map recommended for most applications
(hg17/hg18)
and (iii) a highly sensitive map
(hg17/hg18).
Finally, all tracks can be viewed simultaneously
(hg17/hg18),
which may take longer to load.
Three tracks (hg17) accompanying the paper Nakaya HI et al.
Genome mapping and
expression analyses of human intronic noncoding RNAs reveal tissue-specific
patterns and enrichment in genes related to regulation of transcription.
Genome Biol. 2007 Mar 26;8(3):R43.
The TIN_RNAs track shows the genomic mapping coordinates of all
55,139 Totally Intronic Noncoding RNA (TIN RNA) transcripts identified in the
human genome. The
PIN_RNAs track shows the mapping coordinates of all 12,592
Partially Intronic Noncoding RNA (PIN RNA) transcripts. The
TIN_PIN_probes track shows the genomic coordinates
of all TIN and PIN sense and antisense intronic probes plus the exonic
probes in a custom-designed 44K intron-exon oligoarray. This array
was used for gene expression experiments with human prostate, kidney and
liver tissues. Thanks to Sergio Verjovski-Almeida, Eduardo M. Reis, and
Helder I. Nakaya from Instituto de Quimica - Universidade de Sao Paulo for
contributing these data sets.
Copy-Number Variants (hg17) accompanying the paper Wong, K.
et al.
A Comprehensive Analysis of Common Copy-Number Variations in the Human Genome.
American Journal of Human Genetics 80:91-104 (2007). The following color scheme is used to indicate the frequency
with which clones were seen: blue (1 or 2), red (3), green (4 or 5), black (6 or more).
Thanks to Kendy Wong and Ronald deLeeuw for contributing these data.
Data sets (hg17) accompanying the paper Carroll, J.S.
et al.
Genome-wide analysis of estrogen receptor binding sites.
Nature Genet. 38(10) 2006. The set of six custom tracks
shows ER and RNA Pol2 ChIP-chip data at two cutoffs (low and high),
upregulated genes, and downregulated genes. Thanks to the
Myles Brown
lab at the Dana-Farber Cancer Institute, Harvard Medical School, Boston,
MA, USA for contributing these data.
Sliding window analysis of Tajima's D across the human genome
(hg17) and
(hg16).
This track identifies regions putatively subject to strong, recent, selective
sweeps and identifies Contiguous Regions of Tajima's D Reduction (CRTRs) in each
of three populations. For details, see the Tajima's D SNPs track on the hg17
and hg16 Genome Browsers, as well as Christopher S. Carlson et al.
Genomic regions exhibiting positive selection identified from dense
genotype data. Genome Res. 15:1553-1565 (2005).
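To make the idea of a sliding-window Tajima's D analysis concrete, here is a minimal sketch that computes the standard Tajima's D statistic on aligned sequences and then repeats the calculation over fixed-size windows. The function names, window size, step, and toy sequences are hypothetical; the genotype data, window parameters, and CRTR criteria used for the track itself are described in the cited paper, not here.

```python
import math
from itertools import combinations


def tajimas_d(sequences):
    """Tajima's D for equal-length aligned sequences, or None if undefined."""
    n = len(sequences)
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned to the same length")
    # S: number of segregating (polymorphic) alignment columns.
    seg_sites = sum(1 for column in zip(*sequences) if len(set(column)) > 1)
    if n < 4 or seg_sites == 0:
        return None  # the variance term is zero or D is uninformative
    # pi: mean number of pairwise differences.
    pairs = list(combinations(sequences, 2))
    pi = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs) / len(pairs)
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / math.sqrt(variance)


def sliding_tajimas_d(sequences, window, step):
    """Return (window_start, D) pairs for windows along the alignment."""
    length = len(sequences[0])
    return [
        (start, tajimas_d([s[start:start + window] for s in sequences]))
        for start in range(0, length - window + 1, step)
    ]


# Toy alignment: the first four columns carry rare variants, the last four
# carry intermediate-frequency variants.
toy = ["AAAAAAAA", "TAAAAAAA", "ATAATTTT", "AATATTTT"]
for start, d in sliding_tajimas_d(toy, window=4, step=4):
    print(start, d)
```

Windows dominated by rare variants give negative values, which is the kind of signal a reduction in Tajima's D refers to; windows with intermediate-frequency variants give positive values.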
Structural RNAs predicted by RNAz (hg17).
This track displays putative
functional RNA elements with exceptionally stable and/or evolutionarily conserved
secondary structure. For a description of the RNAz program, see Washietl, S.,
Hofacker, I.L. and Stadler, P.F.
Fast and reliable prediction of noncoding RNAs.
Proc. Natl. Acad. Sci. USA 102(7), 2454-2459 (2005). Additional
information on how this track has been generated can be found
here.
Thanks to Stefan Washietl, Ivo Hofacker and Peter F. Stadler for contributing
this annotation.
Alternative conserved exons predicted by ACEScan (hg17).
This track displays human exons (from Known Genes with an exonic alignment
to mouse) that have a positive ACEScan score. For a description of
the methods used to generate this annotation, see Yeo, G.W. et al.
Identification and analysis of alternative splicing events
conserved in human and mouse. Proc. Natl. Acad. Sci. USA
102(8), 2850-2855 (2005). The ACEScan online webtool is available at
http://genes.mit.edu/acescan. Thanks to Gene Yeo and Chris
Burge at MIT for contributing this annotation.
Perfect LINEs identified by
GPS
(hg17). This track displays regions in the chromosome in which all the
components have at least 10% identity to the query (Retroid Agent) and no
frame shifts or stop codons in the gene coding regions.
Click on the chromosome you wish to display:
1,
2,
3,
4,
5,
6,
7,
8,
9,
10,
11,
12,
13,
14,
15,
16,
17,
18,
20,
22,
X,
Y.
Thanks to Dr. Marcella McClure and Vijay A. Raghavan
at Montana State University for providing this annotation.
Database of Transcribed Sequences (DoTS) Genes (hg17)
generated using BLAT alignments of DoTS RNAs.
Click on the chromosome you wish to display:
1,
2,
3,
4,
5,
6,
7,
8,
9,
10,
11,
12,
13,
14,
15,
16,
17,
18,
19,
20,
21,
22,
X,
Y,
M.
Thanks to Y. Thomas Gan for creating this track.
DoTS Genes (hg16).
Click on the chromosome you wish to display:
1,
2,
3,
4,
5,
6,
7,
8,
9,
10,
11,
12,
13,
14,
15,
16,
17,
18,
19,
20,
21,
22,
X,
Y,
M.
Thanks to Y. Thomas Gan for creating this track.
Isochore track (hg17) generated using
IsoFinder, a segmentation algorithm
developed by
Grupo de Bioinformatica, Universidad de Granada, Spain.
Click on the chromosome you wish to display:
1,
2,
3,
4,
5,
6,
7,
8,
9,
10,
11,
12,
13,
14,
15,
16,
17,
18,
19,
20,
21,
22,
X,
Y. Data on older human assemblies are also available
(hg16,
hg15,
hg13,
hg12).
Thanks to Dr. Jose L. Oliver for contributing this track.
Stanford Human Promoters
(hg16,
hg15,
hg13).
The Stanford Human Promoters data sets were generated by the Richard M. Myers lab at
Stanford University and are described in
Trinklein, N., Force Aldred, S., Saldanha, A., and Myers, R.M. (2003).
Identification and Functional Analysis of Human Transcriptional
Promoters. Genome Res., 13:308-312. Thanks to
Nathan Trinklein at Stanford School of Medicine for contributing this track,
and to Daryl Thomas of UCSC for lifting the hg15 data to the hg16 assembly.
Mouse Ortholog (hg12). Human and Mouse gene predictions based
on fgenesh++ clustered using a BLAT protein alignment and the reciprocal best
matches retained. Thanks to Robert Baertsch for creating this track.
Penn State University Known Regulatory Regions Set 1 (hg12).
This set contains a collection of known regulatory regions gathered from the
literature. Set 1 is limited to the smallest recognized segment containing full
function, and was used as the data set for Elnitski L, Hardison RC, Li J,
Yang S, Kolbe D, Eswara P, O'Connor MJ, Schwartz S, Miller W, and Chiaromonte F.
(2003).
Distinguishing Regulatory DNA From Neutral Sites.
Genome Res., 13:64-72. For more information, see
http://bio.cse.psu.edu/mousegroup/Reg_annotations/. Thanks to
Robert Baertsch for creating this track.
Penn State University Known Regulatory Regions Set 2 (hg12).
This set of functional regions contains names and coordinates of an additional
set of regulatory regions that were not trimmed (as in Set 1) to show the
smallest possible functional element with maximum activity. The regions range in
size from 300-4000 bp. For more information, see
http://bio.cse.psu.edu/mousegroup/Reg_annotations/. Thanks to
Robert Baertsch for creating this track.
Mouse Genome
Transcriptome-wide monoallelic expression in CNS-derived stem cells for four clonal
hybrid (B6 x JF1) cell lines is displayed in mm9. The
track shows the allelic preference for cell lines 2A1, 2A5, 3A1 and 4A5 at JF1
cSNP locations. The allelic preference is denoted by the proportion of the B6 allele
vs JF1 allele. For more information, see: Li SM et al.
Transcriptome-wide survey of CNS-derived cells reveals monoallelic
expression within novel gene families. PLoS ONE. 2012 Feb;7(2):e31751.
Genome-wide DNase hypersensitivity in male and female mouse liver mapped
by DNase I treatment of pooled livers from male and female mice coupled
with high-throughput sequencing (DNase-seq). The tracks here are BED
files representing (1)
Liver_DHS_peaks: peaks identified using PeakSeq
(Rozowsky et al.
PeakSeq enables systematic scoring of ChIP-seq experiments relative to controls.
Nat Biotechnol. 2009;27(1):66-75), and (2)
Liver_DHS_regions:
broader regions of hypersensitivity identified using SICER (Zang et al.
A clustering approach for identification of enriched domains from histone modification ChIP-Seq data.
Bioinformatics. 2009;25(15):1952-8), that are sex-independent (gray) and sex-specific
(blue for male-specific, pink for female-specific; darker shade for
higher stringency for sex-specificity). For more information, see Ling
G, Sugathan A, Mazor T, Fraenkel E, Waxman DJ.
Unbiased, genome-wide in vivo mapping of transcriptional regulatory elements reveals sex
differences in chromatin structure associated with sex-specific liver gene expression.
Mol Cell Biol. 2010 Dec;30(23):5531-44.
DMRT1 is a transcription factor that is expressed in germ cells and Sertoli cells and
plays multiple roles in testis development. This study analyzed DMRT1 genome wide
promoter occupancy in the mouse testis at postnatal day 9 as determined by ChIP-chip
on Nimblegen mouse promoter arrays. The three WIG traces
[1]
[2]
[3]
are from three
independent biological replicates and displayed on mouse genome assembly mm8. The WIG
traces represent the enrichment for each probe on the array calculated as the
log-ratio of the intensities of the DMRT1 ChIP product (Cy5) to control input
chromatin (Cy3). More details and gene expression analysis can be found on
the associated interactive web site: www.dmrt1.umn.edu and in the publication: Murphy MW, Sarver AL,
Rice D, Hatzi K, Ye K, Melnick A, Heckert LL, Zarkower D, Bardwell VJ.
Genome wide analysis of DNA binding and transcriptional regulation by the
mammalian Doublesex homolog DMRT1 in the juvenile testis. PNAS.
2010 July 2. [Epub ahead of print] PNAS:1006243107.
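The per-probe enrichment described above can be sketched as a log-ratio of the two channel intensities. The example below assumes a base-2 logarithm, which the description does not specify, and uses made-up probe names and intensities; any normalization applied in the actual study is not shown.

```python
import math


def log_ratio(chip_cy5, input_cy3):
    """Return log2(ChIP Cy5 intensity / input Cy3 intensity) for one probe.

    Base 2 is an assumption; the track description only says "log-ratio".
    """
    if chip_cy5 <= 0 or input_cy3 <= 0:
        raise ValueError("intensities must be positive")
    return math.log2(chip_cy5 / input_cy3)


# Hypothetical probe intensities as (Cy5, Cy3) pairs.
probes = {
    "probe_1": (1200.0, 300.0),  # enriched in the ChIP product
    "probe_2": (250.0, 500.0),   # depleted relative to input
    "probe_3": (400.0, 400.0),   # no difference
}
enrichment = {name: log_ratio(cy5, cy3) for name, (cy5, cy3) in probes.items()}
for name, value in enrichment.items():
    print(name, value)
```

Positive values indicate probes where the ChIP signal exceeds the input, negative values the reverse, and zero indicates equal intensities.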
Farnesoid X receptor (FXR) is a bile acid-activated transcription factor
belonging to the nuclear receptor superfamily. FXR is highly expressed in liver
and intestine, and crosstalk mediated by FXR in these two organs is critical in
maintaining bile acid homeostasis. This study analyzed genome-wide FXR binding
in liver and intestine of mice treated with a synthetic FXR ligand (GW4064) by
chromatin immunoprecipitation coupled to massively parallel sequencing
(ChIP-seq). The
Fxr Liver
and
Fxr Intestine
tracks shown here are WIG files that represent the number of
times a particular 35bp fragment of DNA was sequenced in the reaction. More
details can be found in the publication
Thomas AM, Hart SN, Kong B, Fang J, Zhong XB, Guo GL.
Genome-wide tissue-specific farnesoid X receptor binding in mouse
liver and intestine.
Hepatology. 2009 Nov 30. [Epub ahead of print] PMID: 20091679.
Thanks to Ann Thomas and Steven Hart in the Department of Pharmacology,
Toxicology, and Therapeutics at the University of Kansas Medical Center,
Kansas City, KS, for contributing these tracks.
An experiment examining mouse liver at four different ages to observe
how different epigenetic modifications (DNA methylation, H3K4me2, and
H3K27) change across postnatal development (mm9).
A ChIP-on-chip tiling array for three mouse chromosomes
(chr5,
chr12,
chr15)
was used. The tracks show three types of data: 1) a genomic region with a
sequence of >800 bp and an average signal increase greater than the
threshold, defined as an interval, 2) a genomic region with one or more
enriched intervals in close proximity to each other (at least one base overlap)
at any given age, defined as an active region, and 3) peak values for each
interval. For more information, see Li Y, Cui Y, Hart SN, Klaassen CD, Zhong X.
Dynamic patterns of histone methylation are associated with
ontogenic expression of the Cyp3a genes during mouse liver maturation.
Mol Pharmacol. 2009 May;75(5):1171-1179.
Thanks to Steven Hart in the Department of Pharmacology, Toxicology, and
Therapeutics at the University of Kansas Medical Center, Kansas City, KS, for
contributing these tracks.
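One way to picture how enriched intervals might be grouped into active regions is to pool the intervals from all ages and merge any that overlap by at least one base. The sketch below does this under stated assumptions: the function name and coordinates are hypothetical, intervals are treated as half-open (start inclusive, end exclusive), and the 800 bp length and signal threshold filters are assumed to have been applied already. It is not the contributors' pipeline.

```python
def merge_into_active_regions(intervals):
    """Merge half-open (start, end) intervals that share at least one base."""
    merged = []
    for start, end in sorted(intervals):
        # With half-open coordinates, sharing a base means start < previous end.
        if merged and start < merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Hypothetical enriched intervals pooled across ages on one chromosome.
pooled_intervals = [(900, 2000), (100, 1000), (2000, 3000), (5000, 6000)]
active_regions = merge_into_active_regions(pooled_intervals)
print(active_regions)
```

In this toy input the first two intervals overlap and are merged, while the interval starting at 2000 only touches the previous one without sharing a base and therefore stays separate.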
BayGenomics mouse
knockout gene tags (mm7), generated using BLAT alignments of sequences derived
from gene-trap vector insertions into thousands of genes in mouse embryonic
stem cells. Click on the chromosome you wish to display:
1,
2,
3,
4,
5,
6,
7,
8,
9,
10,
11,
12,
13,
14,
15,
16,
17,
18,
19,
X,
Y,
M.
Thanks to the BayGenomics bioinformatics group at UCSF for providing this track.
BayGenomics mouse
knockout gene tags (mm6), generated using BLAT alignments of sequences derived
from gene-trap vector insertions into thousands of genes in mouse embryonic
stem cells. Click on the chromosome you wish to display:
1,
2,
3,
4,
5,
6,
7,
8,
9,
10,
11,
12,
13,
14,
15,
16,
17,
18,
19,
X,
Y,
M.
Thanks to the BayGenomics bioinformatics group at UCSF for providing this track.
Locations of known, suspected, and imputed SNPs generated by BLAT alignment
of 3 million Celera associated sequences to the May 2004 mouse genome
assembly (mm5), provided by The GeneNetwork and WebQTL. Only those SNPs that
distinguish strains C57BL/6J from DBA/2J (1.75 million) or that distinguish
C57BL/6J from A/J (1.80 million) are displayed in the custom track.
Due to the proprietary nature of these data, only low resolution
position data (SNP density per 100,000 to 300,000 bp) are currently provided.
This custom track is available on any PHYSICAL and GENETIC maps in WebQTL for
the BXD and AXB/BXA genetic reference panels simply by clicking on
interval maps.
Click on the chromosome you wish to display:
1,
2,
3,
4,
5,
6,
7,
8,
9,
10,
11,
12,
13,
14,
15,
16,
17,
18,
19,
X.
Thanks to Celera Genomics (Richard Mural and
Paul Thomas) for this level of access to CDS data and to Christopher
Vincent (Georgia Tech), Alex G. Williams (UCSC), Robert Crowell
(UTHSC and MIT), Gary Churchill and Natalie Blades (The Jackson
Laboratory), and the WebQTL group at UTHSC (Jintao Wang, Yanhua Qu,
Yan Cui, Robert Williams, and Kenneth Manly) for contributing this
track.
Isochore track (mm5) generated using
IsoFinder, a segmentation algorithm
developed by
Grupo de Bioinformatica, Universidad de Granada, Spain.
Click on the chromosome you wish to display:
1,
2,
3,
4,
5,
6,
7,
8,
9,
10,
11,
12,
13,
14,
15,
16,
17,
18,
19,
X.
Data are also available for the
mm3 assembly.
Click
here for more information about this annotation.
Thanks to Dr. Jose L.
Oliver for contributing this track.
Rat Genome
Isochore track (rn3) generated using
IsoFinder, a segmentation algorithm developed by
Grupo de Bioinformatica, Universidad de Granada, Spain.
Click on the chromosome you wish to display:
1,
2,
3,
4,
5,
6,
7,
8,
9,
10,
11,
12,
13,
14,
15,
16,
17,
18,
19,
20,
X.
Data are also available for the
rn2 and
rn1 assemblies.
Click
here for more information about this annotation.
Thanks to Dr. Jose L. Oliver for contributing this track.
Yeast Genome
A compiled and systematic reference map of
nucleosome positions
across the Saccharomyces genome. Thanks to Cizhong Jiang and B. Franklin
Pugh of the Center for Eukaryotic Gene Regulation, Department of
Biochemistry and Molecular Biology, The Pennsylvania State University,
University Park, PA for contributing this annotation. This work was supported
by a grant from NIH (HG004160). The contributors would like to thank members of
the Pugh lab for their numerous helpful comments.
Multi-Species Annotations
A Hidden Markov Model (HMM) based method was used to detect CpG
islands (CGIs) in DNA sequences. Two HMMs were fitted, one for GC content and
one for the observed-to-expected ratio of CpG counts. The CGIs were detected by jointly
thresholding the resulting posterior probabilities. Unlike the current CGI
definition which was derived from studying promoters of known human genes,
this method is data-driven and can be applied to species with different
sequence compositions. For details please see Wu H, Caffo B, Jaffee HA, Feinberg AP, Irizarry RA.
Redefining CpG Islands Using Hidden Markov Models. Biostatistics
2010 July 3;11(3):499-514.
H. sapiens (human) hg19
H. sapiens (human) hg18
P. troglodytes (chimpanzee) panTro2
P. abelii (orangutan) ponAbe2
M. mulatta (rhesus macaque) rheMac2
M. musculus (mouse) mm9
M. musculus (mouse) mm8
C. familiaris (dog) canFam2
E. caballus (horse) equCab2
D. melanogaster (fruit fly) dm3
C. elegans (worm) ce2
Lists of CpG islands and an R software package can be downloaded from
http://rafalab.jhsph.edu/CGI/.
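The two quantities the HMM-based method works with, GC content and the observed-to-expected ratio of CpG counts, can be illustrated for a single sequence segment. The sketch below uses the commonly cited form of the ratio (CpG count times segment length, divided by the product of the C and G counts); this may differ in detail from the calculation in the cited paper, and the HMM fitting and posterior thresholding are not shown. The function names and the example sequence are hypothetical.

```python
def gc_content(seq):
    """Fraction of bases in `seq` that are G or C."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq):
    """Observed-to-expected CpG ratio: (#CpG * length) / (#C * #G)."""
    seq = seq.upper()
    c_count = seq.count("C")
    g_count = seq.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c_count * g_count)


segment = "ACGCGTTACG"  # made-up 10 bp segment
print(gc_content(segment), cpg_obs_exp(segment))
```

In practice these values would be computed over many short segments along a chromosome and then passed to the models; the segment length is a choice this sketch leaves open.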