
fix trimming of MSA

Workum, Dirk-Jan van requested to merge fix_trimming_count_msa into develop

Even though we are planning to restructure the MultipleSequenceAlignment.java class soon, there is a serious bug in this class: the trimming of protein sequences is too aggressive. To be precise, three times too much sequence is cut off. I discovered that this is caused by old code in the trimSequence() method that was not removed when the trimming of nucleotide and protein sequences was split into two separate modes (!142 (merged)).
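To illustrate the kind of mistake described above, here is a minimal sketch of how a leftover conversion step could cause threefold over-trimming. It is not the actual code from MultipleSequenceAlignment.java: the class `TrimSketch`, its methods, and the assumption that the stale code scales the trim counts by three (one codon = three nucleotides) are all hypothetical and only serve as an example.

```java
public class TrimSketch {
    // Hypothetical helper: removes `start` positions from the front and
    // `end` positions from the back of an aligned sequence, clamping so
    // that the result is never negative in length.
    static String trim(String aligned, int start, int end) {
        int from = Math.min(start, aligned.length());
        int to = Math.max(from, aligned.length() - end);
        return aligned.substring(from, to);
    }

    // Assumed buggy variant: a stale factor of 3 is still applied, even
    // though the counts are already expressed in protein positions.
    static String trimProteinBuggy(String aligned, int start, int end) {
        return trim(aligned, start * 3, end * 3);
    }

    // Assumed fixed variant: protein mode uses the counts as given.
    static String trimProteinFixed(String aligned, int start, int end) {
        return trim(aligned, start, end);
    }

    public static void main(String[] args) {
        String protein = "MKTAYIAKQRQISFVK"; // 16 residues, made-up example
        System.out.println(trimProteinFixed(protein, 2, 2)); // TAYIAKQRQISF (4 removed)
        System.out.println(trimProteinBuggy(protein, 2, 2)); // AKQR (12 removed)
    }
}
```

In this sketch the buggy variant removes 12 residues where 4 were requested, which matches the threefold over-trimming reported here.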

This merge request is ready for review! It was on draft for a while because I wanted to rigorously check that trimming is now done correctly in all cases. I have checked all combinations of pangenome/panproteome and nucleotide/variant/protein alignment to confirm this. The pipeline fails for panproteome, as expected, so the reference commit will have to be updated after this merge request is merged.

Edited by Workum, Dirk-Jan van
