Bam Error Stats Tool (best): analysis of error types in aligned reads.

Overview

best

best is used to assess the quality of reads after aligning them to a reference assembly.

Features

  • Collect overall and per-alignment stats
  • Distribution of indel lengths
  • Yield at different empirical Q-value thresholds
  • Bin per-read stats to easily examine the distribution of errors for certain types of reads
  • Stats for regions specified by intervals (BED file, homopolymer regions, windows, etc.)
  • Stats for quality scores vs empirical Q-values
  • Multithreading for speed
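
The empirical Q-values in the feature list are conventionally Phred-scaled error rates, Q = -10 * log10(errors / aligned bases). The following is a minimal sketch under that assumption and is not taken from best's source: the function names, the (errors, bases) pairs, and the choice to count yield in aligned bases are all hypothetical.

```rust
/// Phred-scaled empirical quality: Q = -10 * log10(errors / bases).
/// Returns None when there are no aligned bases; zero errors give +infinity.
fn empirical_q(errors: u64, bases: u64) -> Option<f64> {
    if bases == 0 {
        return None;
    }
    let rate = errors as f64 / bases as f64;
    Some(-10.0 * rate.log10())
}

/// Yield at a threshold, counted here (as an assumption) as the total number
/// of aligned bases in reads whose empirical Q is at least `min_q`.
/// Each element of `reads` is a hypothetical (errors, aligned bases) pair.
fn yield_at(reads: &[(u64, u64)], min_q: f64) -> u64 {
    reads
        .iter()
        .filter(|&&(errors, bases)| empirical_q(errors, bases).map_or(false, |q| q >= min_q))
        .map(|&(_, bases)| bases)
        .sum()
}

fn main() {
    // Made-up reads for illustration only.
    let reads = [(1, 1_000), (10, 1_000), (0, 500)];
    for q in [15.0, 25.0, 35.0] {
        println!("Q{q}: {} bases", yield_at(&reads, q));
    }
}
```

A read with no errors maps to an infinite Q-value in this sketch, so it passes every threshold; a real tool may cap or smooth that value instead.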

Usage

The best Usage Guide gives an overview of how to use best.

Development

Running

  1. Install Rust.
  2. Run cargo build --release.
  3. Run either of the following commands:
       cargo run --release -- input.bam reference.fasta prefix/path
       target/release/best input.bam reference.fasta prefix/path

This generates stats files whose paths begin with prefix/path.

The built binary is located at target/release/best.

Formatting

Run cargo fmt to format the code.

Comparing

When comparing outputs between runs, pass the -t 1 option so that only one thread is used. best generally tries to keep the order of its outputs deterministic with multiple threads, but the order of per-alignment stats is arbitrary unless only one thread is used.
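
If multi-threaded runs have to be compared anyway, one option is to treat the per-alignment output as an unordered collection of lines. The sketch below assumes one record per line, which is an assumption about the file format rather than something the README states. The function name and the sample records are hypothetical.

```rust
/// Returns true when two texts contain exactly the same lines
/// (including duplicates), ignoring the order in which they appear.
fn same_lines_ignoring_order(a: &str, b: &str) -> bool {
    let mut left: Vec<&str> = a.lines().collect();
    let mut right: Vec<&str> = b.lines().collect();
    left.sort_unstable();
    right.sort_unstable();
    left == right
}

fn main() {
    // Hypothetical per-alignment records; the real column layout may differ.
    let run_a = "read\terrors\nr1\t3\nr2\t0\n";
    let run_b = "read\terrors\nr2\t0\nr1\t3\n";
    println!("equivalent: {}", same_lines_ignoring_order(run_a, run_b));
}
```

Sorting both sides keeps duplicate lines significant, so a record that appears twice in one run and once in the other is still reported as a difference.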

Disclaimer

This is not an official Google product.

The code is not intended for use in any clinical settings. It is not intended to be a medical device and is not intended for clinical use of any kind, including but not limited to diagnosis or prognosis.

No representations or warranties are made with regard to the accuracy of results generated. User or licensee is responsible for verifying and validating accuracy when using this tool.

Comments
  • Issue with 'index out of bounds: the len is 0 but the index is 0'

    Hello,

    Thanks for sharing your work, the tool looks great. I have the following issue on a MacBook Air M2. I have little experience with Rust, so I can't diagnose this further. It could be caused by an M2 incompatibility or by my command or input files.

    $ target/release/best --threads 1 ~/Desktop/mappings.bam ~/References/reference.fa ~/Desktop/best_test
    thread '<unnamed>' panicked at 'index out of bounds: the len is 0 but the index is 0', /Users/<user>/.cargo/registry/src/github.com-1ecc6299db9ec823/noodles-core-0.8.0/src/position/sequence_index.rs:42:10
    note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
    thread '<unnamed>' panicked at 'index out of bounds: the len is 0 but the index is 0', /Users/<user>/.cargo/registry/src/github.com-1ecc6299db9ec823/noodles-core-0.8.0/src/position/sequence_index.rs:42:10
    
    $ export RUST_BACKTRACE=full
    $ target/release/best --threads 1 ~/Desktop/mappings.bam ~/References/reference.fa ~/Desktop/best_test
    thread '<unnamed>' panicked at 'index out of bounds: the len is 0 but the index is 0', /Users/<user>/.cargo/registry/src/github.com-1ecc6299db9ec823/noodles-core-0.8.0/src/position/sequence_index.rs:42:10
    stack backtrace:
       0:        0x102f8d9e4 - std::backtrace_rs::backtrace::libunwind::trace::hbe97f4303076f99a
                                   at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/../../backtrace/src/backtrace/libunwind.rs:93:5
       1:        0x102f8d9e4 - std::backtrace_rs::backtrace::trace_unsynchronized::hb462a91ac605b16b
                                   at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5
       2:        0x102f8d9e4 - std::sys_common::backtrace::_print_fmt::h555c176cf2344151
                                   at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/sys_common/backtrace.rs:65:5
       3:        0x102f8d9e4 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::he84d5e26d26c914d
                                   at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/sys_common/backtrace.rs:44:22
       4:        0x102fa5538 - core::fmt::write::h07c7e4739f78f3b0
                                   at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/fmt/mod.rs:1209:17
       5:        0x102f8b104 - std::io::Write::write_fmt::h4a3c5f094ffc65e8
                                   at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/io/mod.rs:1682:15
       6:        0x102f8d7f8 - std::sys_common::backtrace::_print::h03a1c9ac6001091a
                                   at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/sys_common/backtrace.rs:47:5
       7:        0x102f8d7f8 - std::sys_common::backtrace::print::h6fc2cc8be2392a8e
                                   at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/sys_common/backtrace.rs:34:9
       8:        0x102f8f038 - std::panicking::default_hook::{{closure}}::h3bb44ba6ff437e23
                                   at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/panicking.rs:267:22
       9:        0x102f8ed90 - std::panicking::default_hook::h92c6668a1a98ae8c
                                   at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/panicking.rs:286:9
      10:        0x102f8f674 - std::panicking::rust_panic_with_hook::h9b8acc949e1fb807
                                   at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/panicking.rs:688:13
      11:        0x102f8f484 - std::panicking::begin_panic_handler::{{closure}}::hfc38b2cc56e3623c
                                   at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/panicking.rs:579:13
      12:        0x102f8de4c - std::sys_common::backtrace::__rust_end_short_backtrace::hbafd151207fc448f
                                   at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/sys_common/backtrace.rs:137:18
      13:        0x102f8f1d8 - rust_begin_unwind
                                   at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/panicking.rs:575:5
      14:        0x102fb6df4 - core::panicking::panic_fmt::h8c7f3d540a75578f
                                   at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/panicking.rs:65:14
      15:        0x102fb6ed8 - core::panicking::panic_bounds_check::h7a933820625ca023
                                   at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/panicking.rs:151:5
      16:        0x102ee16f8 - best::stats::AlnStats::from_record::h891032a21f46ba53
      17:        0x102f0ada4 - <rayon::iter::for_each::ForEachConsumer<F> as rayon::iter::plumbing::Folder<T>>::consume::hc5fa0a0571f7ebb2
      18:        0x102ee3db0 - <rayon::iter::par_bridge::IterParallelProducer<Iter> as rayon::iter::plumbing::UnindexedProducer>::fold_with::h68648791be95e8e1
      19:        0x102f028d0 - rayon::iter::plumbing::bridge_unindexed_producer_consumer::h0a6c957847b1d4da
      20:        0x102f03800 - <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once::ha25a46a5a0aba9a6
      21:        0x102f09d6c - <rayon_core::job::StackJob<L,F,R> as rayon_core::job::Job>::execute::h9a3407efc1e3497d
      22:        0x102fb0bb8 - rayon_core::registry::WorkerThread::wait_until_cold::h693aa9682dd39976
      23:        0x102f2a340 - rayon_core::registry::ThreadBuilder::run::h5448aedadc297cad
      24:        0x102f28b40 - std::sys_common::backtrace::__rust_begin_short_backtrace::h0f8e6945afdf1c1f
      25:        0x102f2d174 - core::ops::function::FnOnce::call_once{{vtable.shim}}::he294d1cfe1066b57
      26:        0x102f9170c - <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once::h3fa970363cd53bf7
                                   at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/alloc/src/boxed.rs:1987:9
      27:        0x102f9170c - <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once::h59336ad52adbfea0
                                   at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/alloc/src/boxed.rs:1987:9
      28:        0x102f9170c - std::sys::unix::thread::Thread::new::thread_start::h6a9bbfaea09dc826
                                   at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/sys/unix/thread.rs:108:17
      29:        0x1a798826c - __pthread_deallocate
    thread '<unnamed>' panicked at 'index out of bounds: the len is 0 but the index is 0', /Users/<user>/.cargo/registry/src/github.com-1ecc6299db9ec823/noodles-core-0.8.0/src/position/sequence_index.rs:42:10
    stack backtrace:
       0:        0x102f8d9e4 - std::backtrace_rs::backtrace::libunwind::trace::hbe97f4303076f99a
                                   at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/../../backtrace/src/backtrace/libunwind.rs:93:5
       1:        0x102f8d9e4 - std::backtrace_rs::backtrace::trace_unsynchronized::hb462a91ac605b16b
                                   at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5
       2:        0x102f8d9e4 - std::sys_common::backtrace::_print_fmt::h555c176cf2344151
                                   at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/sys_common/backtrace.rs:65:5
       3:        0x102f8d9e4 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::he84d5e26d26c914d
                                   at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/sys_common/backtrace.rs:44:22
       4:        0x102fa5538 - core::fmt::write::h07c7e4739f78f3b0
                                   at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/fmt/mod.rs:1209:17
       5:        0x102f8b104 - std::io::Write::write_fmt::h4a3c5f094ffc65e8
                                   at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/io/mod.rs:1682:15
       6:        0x102f8d7f8 - std::sys_common::backtrace::_print::h03a1c9ac6001091a
                                   at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/sys_common/backtrace.rs:47:5
       7:        0x102f8d7f8 - std::sys_common::backtrace::print::h6fc2cc8be2392a8e
                                   at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/sys_common/backtrace.rs:34:9
       8:        0x102f8f038 - std::panicking::default_hook::{{closure}}::h3bb44ba6ff437e23
                                   at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/panicking.rs:267:22
       9:        0x102f8ed90 - std::panicking::default_hook::h92c6668a1a98ae8c
                                   at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/panicking.rs:286:9
      10:        0x102f8f674 - std::panicking::rust_panic_with_hook::h9b8acc949e1fb807
                                   at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/panicking.rs:688:13
      11:        0x102f8f484 - std::panicking::begin_panic_handler::{{closure}}::hfc38b2cc56e3623c
                                   at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/panicking.rs:579:13
      12:        0x102f8de4c - std::sys_common::backtrace::__rust_end_short_backtrace::hbafd151207fc448f
                                   at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/sys_common/backtrace.rs:137:18
      13:        0x102f8f1d8 - rust_begin_unwind
                                   at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/panicking.rs:575:5
      14:        0x102fb6df4 - core::panicking::panic_fmt::h8c7f3d540a75578f
                                   at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/panicking.rs:65:14
      15:        0x102fb6ed8 - core::panicking::panic_bounds_check::h7a933820625ca023
                                   at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/panicking.rs:151:5
      16:        0x102ee16f8 - best::stats::AlnStats::from_record::h891032a21f46ba53
      17:        0x102f0ada4 - <rayon::iter::for_each::ForEachConsumer<F> as rayon::iter::plumbing::Folder<T>>::consume::hc5fa0a0571f7ebb2
      18:        0x102ee3db0 - <rayon::iter::par_bridge::IterParallelProducer<Iter> as rayon::iter::plumbing::UnindexedProducer>::fold_with::h68648791be95e8e1
      19:        0x102f028d0 - rayon::iter::plumbing::bridge_unindexed_producer_consumer::h0a6c957847b1d4da
      20:        0x102f09fac - <rayon_core::job::StackJob<L,F,R> as rayon_core::job::Job>::execute::haac1f5ff8a193dbb
      21:        0x102fb0bb8 - rayon_core::registry::WorkerThread::wait_until_cold::h693aa9682dd39976
      22:        0x102fb1134 - rayon_core::join::join_recover_from_panic::hd9813d47c1f54ca4
      23:        0x102f03904 - <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once::ha25a46a5a0aba9a6
      24:        0x102f09d6c - <rayon_core::job::StackJob<L,F,R> as rayon_core::job::Job>::execute::h9a3407efc1e3497d
      25:        0x102fb0bb8 - rayon_core::registry::WorkerThread::wait_until_cold::h693aa9682dd39976
      26:        0x102f2a340 - rayon_core::registry::ThreadBuilder::run::h5448aedadc297cad
      27:        0x102f28b40 - std::sys_common::backtrace::__rust_begin_short_backtrace::h0f8e6945afdf1c1f
      28:        0x102f2d174 - core::ops::function::FnOnce::call_once{{vtable.shim}}::he294d1cfe1066b57
      29:        0x102f9170c - <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once::h3fa970363cd53bf7
                                   at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/alloc/src/boxed.rs:1987:9
      30:        0x102f9170c - <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once::h59336ad52adbfea0
                                   at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/alloc/src/boxed.rs:1987:9
      31:        0x102f9170c - std::sys::unix::thread::Thread::new::thread_start::h6a9bbfaea09dc826
                                   at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/sys/unix/thread.rs:108:17
      32:        0x1a798826c - __pthread_deallocate
    

    Best,

    Alan

    opened by touala 4
  • Release and compile problems

    Hi,

    • Is this just for short reads, or can I analyze long-read BAMs too?
    • I had problems with the released binary complaining about GLIBs on two different systems: one Ubuntu 18.04 (too old, fair enough), but also one Ubuntu 20.04.
    • So I tried compiling it myself, and this doesn't look good. Is my rustc/cargo too old?
    rustc -V
    rustc 1.64.0 (a55dd71d5 2022-09-19)
    
    
    git clone ....
    cd best
    cargo install --path .
    
       Compiling noodles-fasta v0.13.0
       Compiling noodles-csi v0.9.1
       Compiling noodles-sam v0.18.0
       Compiling rayon v1.6.1
    error[E0308]: mismatched types
       --> /home/hpc/davenpor/.cargo/registry/src/github.com-1ecc6299db9ec823/noodles-sam-0.18.0/src/reader/query.rs:70:62
        |
    70  | ...                   if let Err(e) = self.reader.seek(chunk.start()) {
        |                                                   ---- ^^^^^^^^^^^^^ expected struct `VirtualPosition`, found struct `noodles_bgzf::virtual_position::VirtualPosition`
        |                                                   |
        |                                                   arguments to this function are incorrect
        |
        = note: perhaps two different versions of crate `noodles_bgzf` are being used?
    note: associated function defined here
       --> /home/hpc/davenpor/.cargo/registry/src/github.com-1ecc6299db9ec823/noodles-sam-0.18.0/src/reader.rs:263:12
        |
    263 |     pub fn seek(&mut self, pos: bgzf::VirtualPosition) -> io::Result<bgzf::VirtualPosition> {
        |            ^^^^ ---------  --------------------------
    
    error[E0308]: mismatched types
      --> /home/hpc/davenpor/.cargo/registry/src/github.com-1ecc6299db9ec823/noodles-sam-0.18.0/src/reader/query.rs:74:41
       |
    74 | ...                   State::Read(chunk.end())
       |                       ----------- ^^^^^^^^^^^ expected struct `VirtualPosition`, found struct `noodles_bgzf::virtual_position::VirtualPosition`
       |                       |
       |                       arguments to this enum variant are incorrect
       |
       = note: perhaps two different versions of crate `noodles_bgzf` are being used?
    note: tuple variant defined here
      --> /home/hpc/davenpor/.cargo/registry/src/github.com-1ecc6299db9ec823/noodles-sam-0.18.0/src/reader/query.rs:15:5
       |
    15 |     Read(bgzf::VirtualPosition),
       |     ^^^^
    
    For more information about this error, try `rustc --explain E0308`.
    error: could not compile `noodles-sam` due to 2 previous errors
    warning: build failed, waiting for other jobs to finish...
    error: failed to compile `best v0.1.0 (/home/hpc/davenpor/programs/best)`, intermediate artifacts can be found at `/home/hpc/davenpor/programs/best/target`
    
    
    opened by colindaven 3
Releases(0.1.0)
Owner
Google