Ampseer examines reads in fastq format and identifies which multiplex PCR primer set was used to generate the SARS-CoV-2 sequencing library they are read from.

New England Biolabs Inc.

Last update: Nov 2, 2022

ampseer

Ampseer examines reads in fastq format and identifies which multiplex PCR primer set was used to generate the SARS-CoV-2 sequencing library they are read from. It is intended to differentiate between the ARTIC v3, ARTIC v4, ARTIC v4.1, VarSkip 1a, VarSkip 2, Midnight and VarSkip Long primer sets. When compiled with --release optimizations, Ampseer keeps pace with samtools fastq, processing reads as fast as samtools can supply them (under 4 s for a 155M BAM file on a 2019 MacBook Pro):

time samtools fastq tests/fixtures/vss2_large.bam \
| target/release/ampseer --reads /dev/stdin \
          --primer-sets primer_sets/*.fasta
...
samtools fastq tests/fixtures/vss2_large.bam  3.69s user 0.12s system 99% cpu 3.839 total
target/release/ampseer --reads /dev/stdin --primer-sets primer_sets/*.fasta  3.44s user 0.11s system 92% cpu 3.841 total

Note: Ampseer will produce "unknown" unless one primer set can be clearly separated from the other candidates. It cannot distinguish between related sets unless both are included as candidates. For example, Ampseer will identify an ARTIC v4.1 library as ARTIC v4 unless both primer sets are included as candidates.
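The "clearly separated" rule in the note above can be pictured with a small sketch. This is not Ampseer's actual scoring code, which is not shown here: the function name call_primer_set, the per-set read counts and the min_ratio threshold are all assumptions made for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical decision rule: report the primer set with the most matching
/// reads, or "unknown" when it is not clearly ahead of the runner-up.
/// `min_ratio` is an illustrative threshold, not a value taken from Ampseer.
fn call_primer_set(counts: &HashMap<String, u32>, min_ratio: f64) -> String {
    let mut ranked: Vec<(&str, u32)> = counts
        .iter()
        .map(|(name, n)| (name.as_str(), *n))
        .collect();
    // Highest count first; ties are ordered by name so the result is deterministic.
    ranked.sort_by(|a, b| b.1.cmp(&a.1).then(a.0.cmp(b.0)));

    let (best_name, best) = match ranked.first() {
        Some(&(name, n)) if n > 0 => (name, n),
        _ => return "unknown".to_string(),
    };
    let second = ranked.get(1).map_or(0, |&(_, n)| n);

    // The leader must beat the runner-up by the required factor.
    if best > second && f64::from(best) >= min_ratio * f64::from(second) {
        best_name.to_string()
    } else {
        "unknown".to_string()
    }
}

fn main() {
    let mut counts = HashMap::new();
    counts.insert("ARTIC_v3".to_string(), 90);
    counts.insert("VarSkip_1a".to_string(), 3);
    println!("{}", call_primer_set(&counts, 2.0));
}
```

Under a rule like this, a candidate set that is absent from the input can never be reported, which is consistent with the note: an ARTIC v4.1 library can only be told apart from ARTIC v4 when both sets are supplied.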

Ampseer can be called like this:

Example Commands:

run the program:

cargo build --release
samtools fastq tests/fixtures/vss2_small.bam \
| target/release/ampseer --reads /dev/stdin \
          --primer-sets primer_sets/*.fasta

run the tests

cargo test

make a flamegraph (--root needed on macOS)

samtools fastq tests/fixtures/vss2_small.bam \
| CARGO_PROFILE_RELEASE_DEBUG=true cargo flamegraph --root \
  -- --reads /dev/stdin \
     --primer-sets primer_sets/*.fasta

Comments

Failing tests

After running cargo test:

running 10 tests
test test_cli ... ok
test test_help_exit_code ... ok
test missing_reads ... ok
test ont_amps_find_both_orientations ... FAILED
test differentiate_artic_v3_from_vss ... FAILED
test test_version ... ok
test test_insufficient_arguments ... ok
test non_matching_primer_sets ... ok
test differentiate_vss_from_artic_v3 ... ok
test differentiate_vss2_from_vss1a ... FAILED

failures:

---- ont_amps_find_both_orientations stdout ----
thread 'ont_amps_find_both_orientations' panicked at 'Unexpected stdout, failed fn(var)
└── var as str:

command="/Users/mcampbell/Desktop/src/ampseer/target/debug/ampseer" "--primer-sets" "/Users/mcampbell/Desktop/src/ampseer/tests/fixtures/vss_18_28.bed.fasta" "--reads" "/Users/mcampbell/Desktop/src/ampseer/tests/fixtures/ont_vss_full_length_amp18rev_amp28for.fastq"
code=1
stdout=""
stderr="2022-01-25 17:12:39,767 ERROR [ampseer] Could not find reads at "/Users/mcampbell/Desktop/src/ampseer/tests/fixtures/ont_vss_full_length_amp18rev_amp28for.fastq"\nError: Invalid input\n"
', /rustc/db9d1b20bba1968c1ec1fc49616d4742c1725b4b/library/core/src/ops/function.rs:227:5
note: run with RUST_BACKTRACE=1 environment variable to display a backtrace

---- differentiate_artic_v3_from_vss stdout ----
thread 'differentiate_artic_v3_from_vss' panicked at 'Unexpected stdout, failed fn(var)
└── var as str:

command="/Users/mcampbell/Desktop/src/ampseer/target/debug/ampseer" "--primer-sets" "/Users/mcampbell/Desktop/src/ampseer/tests/fixtures/primer_sets/ARTIC_v3.bed.fasta" "/Users/mcampbell/Desktop/src/ampseer/tests/fixtures/primer_sets/neb_vss1a.bed.fasta" "--reads" "/Users/mcampbell/Desktop/src/ampseer/tests/fixtures/artic_v3.fastq"
code=1
stdout=""
stderr="2022-01-25 17:12:39,767 ERROR [ampseer] Could not find reads at "/Users/mcampbell/Desktop/src/ampseer/tests/fixtures/artic_v3.fastq"\nError: Invalid input\n"
', /rustc/db9d1b20bba1968c1ec1fc49616d4742c1725b4b/library/core/src/ops/function.rs:227:5

---- differentiate_vss2_from_vss1a stdout ----
thread 'differentiate_vss2_from_vss1a' panicked at 'Unexpected stdout, failed fn(var)
└── var as str: unknown

opened by mattsoup 2
Doesn't compile on linux

I'm sure my Rust on Linux isn't up to date, but I got this:

error: failed to parse manifest at /mnt/home/mcampbell/src/ampseer/Cargo.toml

Caused by: feature edition2021 is required

this Cargo does not support nightly features, but if you switch to nightly channel you can add cargo-features = ["edition2021"] to enable this feature

opened by mattsoup 1
Make some code suggestions

This PR doesn't change the functionality of the code but rather makes some suggestions in line with Rust idioms. The biggest change is in populate_primer_count_hash, where let key = if replaces the double assignment of key, and is_some is used at the insert step. The other large change is a rewrite of import_primer_sets to reduce the use of mut, preferring a more "functional" style over the more "C-like" approach of initialising and then updating a hashmap.
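To make the two idioms concrete, here is a minimal sketch. It does not reproduce the code in populate_primer_count_hash or import_primer_sets; the functions lengths_imperative, lengths_functional and orientation_key are hypothetical stand-ins that only show the shape of each pattern.

```rust
use std::collections::HashMap;

// "C-like" style: create an empty map, then mutate it in a loop.
fn lengths_imperative(names: &[&str]) -> HashMap<String, usize> {
    let mut map = HashMap::new();
    for name in names {
        map.insert(name.to_string(), name.len());
    }
    map
}

// "Functional" style: build the same map in one expression, with no `mut`.
fn lengths_functional(names: &[&str]) -> HashMap<String, usize> {
    names
        .iter()
        .map(|name| (name.to_string(), name.len()))
        .collect()
}

// `let key = if ...` binds the value once instead of declaring a mutable
// `key` and assigning it in each branch.
fn orientation_key(primer: &str, is_reverse: bool) -> String {
    let key = if is_reverse {
        format!("{}_rev", primer)
    } else {
        format!("{}_fwd", primer)
    };
    key.to_uppercase()
}

fn main() {
    let names = ["amp18", "amp28"];
    println!("{:?}", lengths_functional(&names));
    println!("{}", orientation_key("amp18", true));
}
```

Both map builders return the same result; the second simply leaves no window in which the map is half-populated.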

Further down the code, identify_primer_sets has an if-chain replaced with a match, based on a clippy suggestion. I'm less sure that this is an improvement: it uses match rather than an if-chain, which is more idiomatic for Rust, but the use of Ordering::Equal etc. might make things less readable to some users.
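For readers weighing that trade-off, the two forms look roughly like this. The functions below are illustrative only and are not taken from identify_primer_sets; the clippy lint involved is assumed to be comparison_chain.

```rust
use std::cmp::Ordering;

// If-chain form: three comparisons spelled out by hand.
fn leader_if(a: u32, b: u32) -> &'static str {
    if a > b {
        "first"
    } else if a < b {
        "second"
    } else {
        "tie"
    }
}

// Match form: one comparison, and the compiler checks that every
// Ordering variant is handled.
fn leader_match(a: u32, b: u32) -> &'static str {
    match a.cmp(&b) {
        Ordering::Greater => "first",
        Ordering::Less => "second",
        Ordering::Equal => "tie",
    }
}

fn main() {
    println!("{} {}", leader_if(3, 1), leader_match(3, 1));
}
```

The match version cannot silently miss a case, at the cost of the reader needing to know the Ordering variants.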

opened by pvanheus 0
bioconda package

Hi!

This is a very nice tool.

Once you have enough confidence in it, it would be nice to create a package on bioconda to make it easier to integrate into workflows (e.g. I would like to add it to V-pipe, as we need primer autodetection for several components).

I can give a hand with that if needed.

opened by DrYak 0
handle nextera fragmented libraries

Nextera transposases cannot insert at the extreme ends of amplicons, so reads from these libraries do not begin with the full primer sequence. To handle these libraries, it would be necessary to allow partial matches against the expected start sequences.
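One way such partial matching could work is sketched below. It is an assumption about a possible approach, not a description of Ampseer's matching code: the function partial_primer_match and its max_trim and min_overlap parameters are hypothetical, and only exact matches against a trimmed primer are considered.

```rust
/// Hypothetical partial match: the read may have lost up to `max_trim` bases
/// from the 5' end of the primer. Returns the number of missing primer bases
/// when the read starts with the remaining primer suffix, which must be at
/// least `min_overlap` bases long.
fn partial_primer_match(
    read: &[u8],
    primer: &[u8],
    max_trim: usize,
    min_overlap: usize,
) -> Option<usize> {
    (0..=max_trim.min(primer.len())).find(|&offset| {
        let tail = &primer[offset..];
        tail.len() >= min_overlap && read.starts_with(tail)
    })
}

fn main() {
    let primer = b"ACGTACGTAC";
    // This read starts two bases into the primer.
    let read = b"GTACGTACTTTT";
    println!("{:?}", partial_primer_match(read, primer, 5, 6));
}
```

The min_overlap guard matters because very short suffixes would match many reads by chance and blur the separation between primer sets.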

opened by bwlang 1

