PolyA_DB version 4 catalogs polyA sites (PAS) in several mammalian genomes using both 3' end short-read RNA-seq (3'READS+) data and long-read RNA-seq (LR-RNA-seq) data from PacBio and Oxford Nanopore. Gene annotations are based on the NCBI Gene and RefSeq databases. LR-RNA-seq data are used to connect PAS identified by 3'READS+ to genes and to assign PAS types based on gene and splicing structures. PolyA_DB v4 also contains PAS conservation and strength information and is linked to the UCSC Genome Browser for data visualization.
Version 4.1 contains ~1.4 million PAS for human and mouse, based on ~360 human and ~450 mouse samples and over 2.3 billion PAS-supporting reads for each species. About 20% of those PAS match current, publicly available LR-RNA-seq data.
Please contact Shan Yu or Bin Tian with questions, comments, or suggestions.