Whole Genome Sequencing
WGS
Also known as: full genome sequencing
A sequencing method that determines the complete DNA sequence of an organism's genome, providing comprehensive variant and structural information.
Whole Genome Sequencing (WGS) is the process of determining the complete nucleotide sequence of an organism’s genome in a single experiment [1].
How It Works
Genomic DNA is extracted, fragmented, and prepared into sequencing libraries with platform-specific adapters. Short-read platforms such as Illumina produce reads of 150-300 base pairs with high accuracy and throughput. Long-read platforms such as PacBio and Oxford Nanopore generate reads exceeding 10 kilobases, which improves assembly across repetitive regions and the resolution of structural variants.
Reads are either aligned to a reference genome for resequencing applications or assembled de novo when no reference exists. Sufficient depth of coverage (typically 30x or more for variant calling) ensures reliable detection of single-nucleotide variants, insertions, deletions, and structural rearrangements.
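The relationship between read count, read length, and depth of coverage can be sketched with simple arithmetic: mean depth is the total number of sequenced bases divided by the genome size. The example below is a minimal sketch; the 4.6 Mb genome size and 150 bp read length are illustrative values, and the uncovered-fraction estimate assumes reads land uniformly at random (a Poisson approximation that real libraries only roughly follow).

```python
import math


def mean_coverage(num_reads: int, read_length: int, genome_size: int) -> float:
    """Expected mean depth: total sequenced bases divided by genome size."""
    return num_reads * read_length / genome_size


def reads_needed(target_depth: float, read_length: int, genome_size: int) -> int:
    """Minimum number of reads required to reach the target mean depth."""
    return math.ceil(target_depth * genome_size / read_length)


def fraction_uncovered(depth: float) -> float:
    """Expected fraction of bases with zero coverage, assuming reads are
    placed uniformly at random (Poisson approximation)."""
    return math.exp(-depth)


# Illustrative values: a 4.6 Mb bacterial genome sequenced with 150 bp reads.
genome_size = 4_600_000
n_reads = reads_needed(30, 150, genome_size)
print(n_reads)                                   # 920000
print(mean_coverage(n_reads, 150, genome_size))  # 30.0
print(fraction_uncovered(30))
```

For paired-end data, each read pair contributes two reads, so the number of fragments to sequence is half the value returned by `reads_needed`.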
In synthetic biology, WGS verifies engineered strain genotypes, detects off-target mutations introduced during genome editing, confirms pathway integration sites, and monitors genomic stability during adaptive laboratory evolution experiments.
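One way to picture off-target detection is as a set comparison: variants called in the engineered strain, minus those already present in the parent strain, minus the edits that were intended. The sketch below assumes variants have already been called and normalized to (chromosome, position, ref, alt) tuples; the `Variant` type, the function name, and the example coordinates are all hypothetical.

```python
from typing import Iterable, List, NamedTuple


class Variant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str


def unexpected_variants(
    parent: Iterable[Variant],
    engineered: Iterable[Variant],
    intended: Iterable[Variant],
) -> List[Variant]:
    """Variants in the engineered strain that are neither inherited from
    the parent nor part of the intended design."""
    return sorted(set(engineered) - set(parent) - set(intended))


# Hypothetical example data.
parent_calls = [Variant("chr1", 1050, "A", "G")]
intended_edits = [Variant("chr1", 20400, "C", "T")]
engineered_calls = [
    Variant("chr1", 1050, "A", "G"),    # inherited from parent
    Variant("chr1", 20400, "C", "T"),   # intended edit
    Variant("chr2", 731, "G", "GA"),    # candidate off-target insertion
]

for v in unexpected_variants(parent_calls, engineered_calls, intended_edits):
    print(v)
```

In practice, exact tuple matching only works if both call sets were produced against the same reference and normalized the same way; candidates flagged this way still need manual review or orthogonal confirmation.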
Computational Considerations
WGS data analysis pipelines typically follow the GATK Best Practices workflow: read alignment with BWA-MEM, duplicate marking, base quality score recalibration, and variant calling with HaplotypeCaller [2]. Structural variant detection requires specialized tools such as Manta or Delly. Annotation pipelines (SnpEff, VEP) predict functional consequences of identified variants. For engineered organisms, custom pipelines map reads against designed construct sequences to verify correct assembly and integration.
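The alignment-to-variant-calling steps above can be sketched as a sequence of command lines. The following is a minimal sketch, not a production pipeline: it only builds and prints the commands rather than running them, the sample and file names are hypothetical, and the flags shown are the commonly documented ones for BWA, samtools, and GATK4, which should be checked against the installed versions. Reference indexing, BAM indexing, and error handling are omitted, and base quality score recalibration assumes a known-sites VCF is available, which is often not the case for non-model or engineered organisms.

```python
import shlex
from typing import List, Tuple


def build_wgs_commands(
    sample: str, ref: str, fq1: str, fq2: str, known_sites: str, threads: int = 8
) -> List[Tuple[str, str]]:
    """Return (step name, shell command) pairs; nothing is executed here."""
    # Read group string; bwa converts the literal "\t" sequences to tabs.
    read_group = f"@RG\\tID:{sample}\\tSM:{sample}\\tPL:ILLUMINA"
    sorted_bam = f"{sample}.sorted.bam"
    dedup_bam = f"{sample}.dedup.bam"
    recal_table = f"{sample}.recal.table"
    recal_bam = f"{sample}.recal.bam"
    vcf = f"{sample}.vcf.gz"

    align = shlex.join(
        ["bwa", "mem", "-t", str(threads), "-R", read_group, ref, fq1, fq2]
    )
    sort = shlex.join(
        ["samtools", "sort", "-@", str(threads), "-o", sorted_bam, "-"]
    )
    return [
        # Align reads and coordinate-sort the output in one piped step.
        ("align_and_sort", f"{align} | {sort}"),
        ("mark_duplicates", shlex.join(
            ["gatk", "MarkDuplicates", "-I", sorted_bam, "-O", dedup_bam,
             "-M", f"{sample}.dup_metrics.txt"])),
        ("base_recalibrator", shlex.join(
            ["gatk", "BaseRecalibrator", "-R", ref, "-I", dedup_bam,
             "--known-sites", known_sites, "-O", recal_table])),
        ("apply_bqsr", shlex.join(
            ["gatk", "ApplyBQSR", "-R", ref, "-I", dedup_bam,
             "--bqsr-recal-file", recal_table, "-O", recal_bam])),
        ("haplotype_caller", shlex.join(
            ["gatk", "HaplotypeCaller", "-R", ref, "-I", recal_bam, "-O", vcf])),
    ]


# Hypothetical inputs for illustration.
steps = build_wgs_commands(
    "strainA", "ref.fa", "strainA_R1.fastq.gz", "strainA_R2.fastq.gz",
    "known_sites.vcf.gz",
)
for name, cmd in steps:
    print(f"[{name}] {cmd}")
```

Keeping command construction separate from execution makes each step easy to inspect and log before anything runs; a workflow manager would normally take over scheduling and file tracking from there.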
Assembly and variant-calling pipelines process raw sequencing reads into annotated genomes, using tools such as BWA, GATK, and de novo assemblers to detect mutations and structural changes.
References
[1] Goodwin S, McPherson JD, McCombie WR. Coming of age: ten years of next-generation sequencing technologies. Nature Reviews Genetics (2016).
[2] McKenna A, Hanna M, Banks E, et al. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Research (2010).