Post-doctoral Fellow, Howard Hughes Medical Institute / Indiana University, Bloomington, Indiana
Body of Abstract: In eukaryotes, the 45S ribosomal RNA genes occur in large repeat arrays called nucleolus organizer regions (NORs), which remain among the most difficult regions of genomes to assemble. Arabidopsis thaliana has two such regions, NOR2 and NOR4, whose sequences have remained undefined. Using ultra-long DNA sequencing combined with an unconventional variant-calling-based assembly approach, we completed 5.5 Mbp and 3.9 Mbp sequences for NOR2 and NOR4, respectively, revealing their structure and effectively finishing the A. thaliana genome for the reference strain, Col-0. NOR2 is epigenetically silenced, and this assembly has allowed analysis of NOR silencing at individual gene copies, demonstrating correlations between NOR structure and activity. Identification of active genes through flow-sorting of nucleoli and RNA sequencing demonstrates that only the central region of NOR4 is active in adult plants, whereas most, but not all, NOR2 genes are epigenetically silenced. These active regions overlap with long intervals of low cytosine methylation and 45S gene copy homogenization. Collectively, the data reveal the genetic and epigenetic landscapes of the NORs and implicate transcription in 45S rRNA gene concerted evolution.