How to Detect Point Mutations in Sanger Sequences with CodonCode Aligner
This guide walks you through the process of analyzing Sanger sequences to identify point mutations using CodonCode Aligner. Topics covered on this page include setup requirements, running the analysis, interpreting results, and working with reference sequences.
Prerequisites
To detect point mutations in CodonCode Aligner, you’ll need a project with aligned sequences. You can get one by aligning traces to a reference sequence, or by assembling sequences if no reference sequence is used.
- Create a new project and import your Sanger sequencing traces.
- Add a reference sequence to the project if available. While optional, a reference sequence with coding sequence annotation allows Aligner to report the biological effect of mutations, such as amino acid changes. Reference sequences are typically text sequences in GenBank or EMBL format.
- Use "Make Reference Sequence" from the Sample menu to tell CodonCode Aligner to use the selected text sequence as the reference sequence.
- Build a contig by assembling the imported sequences or aligning them to the reference. Mutation detection only works on contigs.
Running Mutation Detection
After your contig has been assembled or aligned, you can run mutation detection to identify potential point mutations across the aligned sequences. CodonCode Aligner will scan each sample in the contig and generate a summary table of the results.
- In the project view, select the contig or contigs you want to analyze.
- From the Contig menu, choose Find Mutations.
The analysis typically takes just a few seconds for small to medium-sized projects. A progress dialog will appear and can be canceled if needed.
When finished, Aligner will open a new window showing a table that lists each position where mutations were found. For each sample, the table indicates whether the base is homozygous or heterozygous, and what the effect of the mutation is.
For contigs that include a designated reference sequence, the predicted amino acid change will be based on the coding sequence annotation of the reference. If no reference or no coding sequence annotation is present, the prediction will be based on translation starting at the first base in the contig.
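To make the amino acid prediction more concrete, here is a minimal Python sketch of how a single-base substitution can be translated into a codon and amino acid change. It is an illustration of the general idea, not CodonCode Aligner's implementation; the function name `amino_acid_effect` and the `cds_start` parameter are hypothetical. With the default `cds_start=1`, translation starts at the first base, which mirrors the fallback described above when no coding sequence annotation is present.

```python
# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def amino_acid_effect(sequence, position, variant_base, cds_start=1):
    """Describe the effect of substituting variant_base at a 1-based position.

    cds_start is the 1-based position of the first base of the first codon
    (hypothetical parameter; 1 means "translate from the first base").
    """
    offset = position - cds_start
    if offset < 0:
        raise ValueError("position lies before the coding sequence")
    codon_index = offset // 3                       # 0-based codon number
    codon_begin = cds_start - 1 + 3 * codon_index   # 0-based string index
    ref_codon = sequence[codon_begin:codon_begin + 3].upper()
    if len(ref_codon) < 3:
        raise ValueError("incomplete codon at end of sequence")
    within = offset % 3                             # 0, 1, or 2 inside the codon
    var_codon = ref_codon[:within] + variant_base.upper() + ref_codon[within + 1:]
    ref_aa = CODON_TABLE[ref_codon]
    var_aa = CODON_TABLE[var_codon]
    kind = "silent" if ref_aa == var_aa else "amino acid change"
    return f"{ref_codon}>{var_codon} {ref_aa}{codon_index + 1}{var_aa} ({kind})"


print(amino_acid_effect("ATGGCCTGA", 5, "T"))  # GCC>GTC A2V (amino acid change)
print(amino_acid_effect("ATGGCCTGA", 6, "T"))  # GCC>GCT A2A (silent)
```

The second call shows why the reading frame matters: the same C to T substitution one base further along leaves the amino acid unchanged.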
Reviewing Results
The mutation results table lists all positions where CodonCode Aligner found a discrepancy between the consensus sequence and one or more sample sequences. For each sample, the table shows whether the base is homozygous or heterozygous, along with the observed variant.
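As a rough mental model of what such a table contains, the following Python sketch lists positions where aligned sample sequences differ from a consensus. It is a simplification under stated assumptions: it compares base calls only, whereas the actual analysis also uses trace data, and it treats `-` as "not covered". The function name `find_discrepancies` and the input layout are hypothetical.

```python
def find_discrepancies(consensus, samples):
    """Return rows of (position, consensus_base, {sample_name: observed_base}).

    consensus: consensus sequence as an uppercase string.
    samples:   dict mapping sample names to aligned sequences of the same
               length as the consensus, with '-' where a sample has no data.
    Positions are 1-based.
    """
    rows = []
    for index, consensus_base in enumerate(consensus):
        differing = {
            name: seq[index]
            for name, seq in samples.items()
            if seq[index] not in ("-", consensus_base)
        }
        if differing:
            rows.append((index + 1, consensus_base, differing))
    return rows


samples = {
    "sample1": "ACGTAC",
    "sample2": "ACYTAC",   # Y marks a possible C/T heterozygote
    "sample3": "--GTAT",
}
for row in find_discrepancies("ACGTAC", samples):
    print(row)
```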
To review a specific mutation in detail, double-click any row in the table. This opens the selected position in the contig view and trace view, based on your double-click preferences.
In the contig view, mutation tags are shown as color-coded boxes:
- Blue boxes indicate homozygous bases
- Pink boxes indicate heterozygous bases
In the example shown here, the analysis classified three samples as heterozygous, and one sample is homozygous.
The trace view lets you confirm the mutation visually. At heterozygous positions, you’ll typically see a clear secondary peak alongside a primary peak, and the affected base typically has a peak of reduced intensity compared to homozygous calls. In the example shown here, the red T peak for the lower three heterozygous samples is significantly weaker than in the homozygous sample on top.
In this example, the Find Mutations function added tags to the bases, but left the original base calls unchanged. Alternatively, you can select to have heterozygous bases replaced by the corresponding ambiguity codes in the settings, as explained on the "Fine tuning mutation detection" page.
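For reference, the ambiguity codes for two-base mixtures follow the standard IUPAC nucleotide notation. The short Python sketch below shows the mapping; the helper name `ambiguity_code` is hypothetical and is only meant to illustrate which code corresponds to which pair of bases.

```python
# IUPAC ambiguity codes for mixtures of two bases.
IUPAC_TWO_BASE_CODES = {
    frozenset("AG"): "R",
    frozenset("CT"): "Y",
    frozenset("CG"): "S",
    frozenset("AT"): "W",
    frozenset("GT"): "K",
    frozenset("AC"): "M",
}


def ambiguity_code(base1, base2):
    """Return the IUPAC code for a heterozygous pair of bases.

    If both bases are the same, that base is returned unchanged.
    """
    pair = frozenset((base1.upper(), base2.upper()))
    if len(pair) == 1:
        return next(iter(pair))
    return IUPAC_TWO_BASE_CODES[pair]


print(ambiguity_code("C", "T"))  # Y
print(ambiguity_code("A", "G"))  # R
```

A base tagged heterozygoteCT, for example, would correspond to the ambiguity code Y.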
To track manual verification of SNPs, you can mark tags as "Confirmed". Select the base in question, then do one of the following:
- Right-click the tag and choose Confirm tag from the popup menu.
- Click the Confirm button in the toolbar (which may require customizing the toolbar first).
- Choose Confirm Tag from the Tag submenu in the Sample menu.
- Use a keyboard shortcut.
Fixing Classification Errors
CodonCode Aligner's automatic classification may occasionally miss mutations or misclassify bases. You can correct these results directly in the trace or contig view.
False positives
If a base is incorrectly marked as mutated, you can change its tag to indicate a false positive. Right-click the tagged base and choose a menu item such as "Mark heterozygoteCT as False Positive".
False positive tags are preserved when you re-run mutation detection.
Wrong classification
To change an incorrect classification (e.g., heterozygoteCT → homozygoteCC), right-click the tag and select "Edit Tag...". In the tag dialog, click "Change..." to select the correct classification from a dropdown menu. Aligner will update the tag and recalculate any amino acid effect shown in the notes.
Missed mutations
If a mutation was not detected, you can add it manually. Right-click the base in the trace or contig view, choose "Add Tag...", and select the correct tag type (e.g., heterozygoteCT). Aligner will automatically add amino acid information based on the defined coding sequence.
Excluding low-quality regions
The accuracy of the automatic SNP detection depends on the quality of the sequences. When sequences contain low-quality sections with irregular peak spacing or double peaks, these often cause wrong results. You can exclude such low-quality regions from mutation detection as follows:
- Select the poor-quality region
- Go to the Tag submenu in the Sample menu, and select Add Don't Genotype Tag.
Note: Tags you add or edit (except "False Positive" and "Don't Genotype" tags) may be deleted if you run "Find Mutations" again on the same contig, depending on your settings, as described on the "Fine tuning mutation detection" page.
If you prefer using buttons, you can customize the toolbars for the contig and trace views to include buttons for the "Confirm", "False positive", and "Don't genotype" actions.
Using Reference Sequences
While a reference sequence is not required to detect point mutations, it is highly recommended. Reference sequences allow CodonCode Aligner to describe the biological effect of each mutation, such as codon changes and amino acid substitutions.
When using a reference, you should:
- Align your samples to the reference instead of assembling. This ensures that annotations from the reference sequence, such as coding regions, are used in the analysis.
- Use a GenBank-formatted reference sequence if possible. Aligner will automatically convert simple "CDS" annotations into codingSequence and codonStart tags. Other formats are supported, but annotations may need to be added manually.
Manually Adding codingSequence and codonStart Tags
If your reference sequence is not annotated, you can manually add tags in the contig or base view:
- To define the coding region, select the relevant bases in the reference sequence, right-click, and choose "Add Tag...". Then select codingSequence from the tag type menu.
- If the coding sequence does not start at a codon boundary, you can add a codonStart tag to one of the first three bases. Use the Notes field to assign a custom base number (e.g., basenumber=3301).
Mutation detection will use these tags the next time you run "Find Mutations" on the contig.
When correctly aligned and annotated, mutation tags will include codon-level effects where applicable.
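The arithmetic behind a codon start and a custom base number can be sketched as follows. This is an illustration only: it assumes that the custom base number counts bases from the start of the full coding sequence and that the tagged base is the first base of a codon. Both are assumptions about how the numbering is interpreted, not a description of Aligner's internals, and the function name `coding_position` is hypothetical.

```python
def coding_position(position, codon_start_position, base_number=1):
    """Map a reference position to (coding_base, codon_number, position_in_codon).

    position, codon_start_position: 1-based positions in the reference.
    base_number: coding-sequence number assigned to the tagged base
                 (assumed here to be the first base of a codon).
    """
    if (base_number - 1) % 3 != 0:
        raise ValueError("base_number is expected to be the first base of a codon")
    offset = position - codon_start_position
    if offset < 0:
        raise ValueError("position lies before the codon start base")
    coding_base = base_number + offset
    codon_number = (coding_base - 1) // 3 + 1
    position_in_codon = (coding_base - 1) % 3 + 1
    return coding_base, codon_number, position_in_codon


# The tagged base is at reference position 10 and carries basenumber=3301.
print(coding_position(10, 10, base_number=3301))  # (3301, 1101, 1)
print(coding_position(14, 10, base_number=3301))  # (3305, 1102, 2)
```

Under these assumptions, a reference fragment that covers only part of a gene can still report mutations with the numbering of the full coding sequence.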
Calling Secondary Peaks Without Contigs
In some situations, you may not be able to assemble a contig or align sequences to a reference — for example, when working with a single sequence or when the reference is unavailable. In these cases, you can still identify possible heterozygous positions by calling secondary peaks directly in individual trace files.
To call secondary peaks without a contig:
- In the project view, select one or more trace sequences.
- From the Edit menu, open the Change Bases submenu and choose Call Second Bases.
Aligner will examine each selected sequence and change base calls to ambiguities at all positions where secondary peaks suggest a possible heterozygous base. You can limit the analyzed region by selecting a range of bases in the trace or contig view; only the selected bases will be analyzed.
To remove ambiguities, choose Remove Ambiguities or Undo Auto Edits from the Change Bases submenu.
To adjust how sensitive this detection is, choose Change Bases Options... from the same submenu. This allows you to modify the threshold used to detect secondary peaks.
Note that Call Second Bases is more prone to both false positives and false negatives than the standard Find Mutations method. Because it analyzes each sequence in isolation, it lacks key information such as peak height reduction, which is critical for accurately identifying heterozygous mutations.
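To illustrate what a secondary peak threshold does, here is a minimal Python sketch of threshold-based calling at a single base position. It is not Aligner's algorithm: the function name `call_second_base`, the input layout, and the default threshold of 0.25 are all hypothetical. It simply reports an IUPAC ambiguity code when the second-highest peak reaches a given fraction of the highest one.

```python
# IUPAC ambiguity codes for mixtures of two bases.
IUPAC_TWO_BASE_CODES = {
    frozenset("AG"): "R",
    frozenset("CT"): "Y",
    frozenset("CG"): "S",
    frozenset("AT"): "W",
    frozenset("GT"): "K",
    frozenset("AC"): "M",
}


def call_second_base(peak_heights, threshold=0.25):
    """Call one base position from the four channel intensities.

    peak_heights: dict mapping 'A', 'C', 'G', 'T' to signal intensity.
    threshold:    minimum secondary/primary height ratio for an ambiguity call
                  (hypothetical default).
    """
    ranked = sorted(peak_heights.items(), key=lambda item: item[1], reverse=True)
    (primary, primary_height), (secondary, secondary_height) = ranked[0], ranked[1]
    if primary_height <= 0:
        return "N"  # no usable signal at this position
    if secondary_height / primary_height >= threshold:
        return IUPAC_TWO_BASE_CODES[frozenset((primary, secondary))]
    return primary


print(call_second_base({"A": 1000, "C": 30, "G": 600, "T": 10}))  # R
print(call_second_base({"A": 1000, "C": 30, "G": 100, "T": 10}))  # A
```

Lowering the threshold makes the call more sensitive but also more likely to flag background noise. Because the sketch looks at one trace only, it cannot use the drop in primary peak height relative to other samples, which is the limitation described above.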
Related Resources
📚 Learning Center: Using CodonCode Aligner
🏔️ Overview: Mutation Detection in CodonCode Aligner
🎬 Video Tutorial: Mutation Detection in CodonCode Aligner
⚙️ Fine-Tuning: Mutation Detection Settings
🔬 In Depth: Mutation Detection Algorithms
🏔️ Overview: Heterozygous Indel Analysis
🛠️ How-To: Analyze Heterozygous Indels