Post-doctorate, King Abdullah University of Science and Technology, Thuwal, Makkah, Saudi Arabia
Recent advancements in sequencing techniques and platforms continue to generate larger amounts of sequencing data per lane, enabling the sequencing of many individuals together in a multiplexed library. We developed a skim-sequencing (skim-seq) method using low-coverage whole-genome sequencing that can be efficiently scaled up to thousands of samples, the population sizes required by plant breeding and research programs. The library preparation uses a low-volume multiplexed Nextera library, which is cost-effective and high-throughput. With dual indexes of 96 samples by 32 plates or more, enough samples can be multiplexed to remain cost-effective even on the highest-output sequencing platforms available. Along with the library preparation method, we developed a bioinformatics pipeline to process the data and apply it in genetics and genomics studies. The pipeline adopts several existing software tools designed for sequence data processing and mapping. We demonstrate several applications of skim-sequencing, including genotyping of segregating populations, introgression mapping, and dosage estimation. Using the method, we successfully genotyped wheat doubled haploids (DHs), wild einkorn recombinant inbred lines (RILs), wheat backcross lines (BCs), wheat-barley Robertsonian and interstitial translocation lines, and aneuploids. More information can be found in a recently published article (https://www.nature.com/articles/s41598-022-19858-2), with data scripts available at https://github.com/sandeshsth/SkimSeq_Method. Overall, skim-seq is a cost-effective approach that can complement and replace current high-throughput genotyping approaches to enable large-scale genetics research and genomics-assisted breeding.
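The multiplexing arithmetic behind the dual-index design can be illustrated with a minimal sketch. The function names are hypothetical, and the lane output (3 Tb) and genome size (16 Gb, roughly that of hexaploid wheat) are illustrative assumptions rather than values reported in the abstract; only the 96 by 32 index layout comes from the text.

```python
def index_combinations(samples_per_plate: int, plates: int) -> int:
    """Number of uniquely dual-indexed samples that can share one run."""
    return samples_per_plate * plates


def mean_coverage(run_output_bp: float, n_samples: int, genome_size_bp: float) -> float:
    """Expected average coverage per sample, assuming reads are split evenly."""
    return run_output_bp / (n_samples * genome_size_bp)


# 96 samples per plate x 32 plates, as described in the abstract.
n_samples = index_combinations(96, 32)

# Assumed values for illustration only (not from the source).
run_output_bp = 3.0e12   # hypothetical sequencing output in base pairs
genome_size_bp = 16e9    # approximate hexaploid wheat genome size

coverage = mean_coverage(run_output_bp, n_samples, genome_size_bp)
print(f"{n_samples} samples, ~{coverage:.3f}x mean coverage per sample")
```

Under these assumptions, each sample receives well below 1x coverage, which is the low-coverage regime that skim-sequencing is designed for; in practice, uneven pooling would make per-sample coverage vary around this average.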