Search PolyA_DB
Enter a Gene Symbol (e.g. 'Cstf3'), an Ensembl Gene ID (e.g. 'ENSG00000176102'), or a RefSeq Gene ID (e.g. '1479').
Release version: v4.1 (September 15, 2025)

PolyA_DB version 4 catalogs polyA sites (PAS) in several mammalian genomes using both 3' end short-read RNA-seq (3'READS+) data and long-read RNA-seq (LR-RNA-seq) data from the PacBio and Oxford Nanopore platforms. Gene annotations are based on the NCBI Gene and RefSeq databases. LR-RNA-seq data are used to connect PAS identified by 3'READS+ to genes and to assign PAS types based on gene and splicing structures. PolyA_DB v4 also contains PAS conservation and strength information and is linked to the UCSC Genome Browser for data visualization.

Version 4.1 contains ~1.4 million PAS for human and mouse, based on ~360 human and ~450 mouse samples and over 2.3 billion PAS-supporting reads for each species. About 20% of those PAS match current, publicly available LR-RNA-seq data.


References

  1. Yu S, Chen W, Wang L, Jewell S, Mammedova A, Han S, Wickramasinghe J, Barash Y, Tian B. (2025). PolyA_DB v4: systematic polyA site identification and isoform annotation in human and mouse genomes using 3' end and long-read sequencing data.
  2. Zheng D, Liu X, Tian B. (2016). 3'READS+, a sensitive and accurate method for 3' end sequencing of polyadenylated RNA. RNA 22:1631-9.
  3. Hoque M*, Ji Z*, Zheng D, Luo W, Li W, You B, Park JY, Yehia G, Tian B. (2013). Analysis of alternative cleavage and polyadenylation by 3' region extraction and deep sequencing. Nature Methods 10:133-9.

Please contact Shan Yu/Bin Tian for questions/comments/suggestions.