PGR-TK: Pangenome Research Tool Kit

PGR-tk: A PanGenomic Research Tool Kit

This repository provides Python and Rust libraries to facilitate pangenomics analysis. Several algorithms and data structures used in the Peregrine Genome Assembler are also useful for pangenomics analysis. This repo takes those algorithms and data structures, combines them with other handy third-party tools, and exposes them as a Python library (with Rust code for the compute-intensive parts that need performance).

What is PGR-tk?

Research Preprint:

Multiscale Analysis of Pangenome Enables Improved Representation of Genomic Diversity For Repetitive And Clinically Relevant Genes

PGR-TK provides pangenome assembly management, query, and Minimizer Anchored Pangenome (MAP) graph generation.

Pangenome Data Management and Minimizer Anchored Pangenome Graph Generation

With the MAP graph, we can use the "principal bundle decomposition" to study complex structural variants and genome rearrangements in human populations.
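
To give a feel for the minimizer idea behind the MAP graph, here is a minimal Python sketch. It is not PGR-TK's implementation or API: the function names (kmer_hash, minimizers, shared_anchors), the toy hash, and the parameters k and w are all assumptions made for illustration, and reverse complements are ignored. In each window of w consecutive k-mers it picks the k-mer with the smallest hash, then reports the minimizers shared by two sequences as candidate anchors.

```python
from typing import List, Set, Tuple


def kmer_hash(kmer: str) -> int:
    """Toy deterministic 32-bit hash (illustrative only, not PGR-TK's hash)."""
    h = 0
    for ch in kmer:
        h = (h * 131 + ord(ch)) & 0xFFFFFFFF
    return h


def minimizers(seq: str, k: int, w: int) -> List[Tuple[int, str]]:
    """Return (position, k-mer) pairs: the smallest-hash k-mer of every
    window of w consecutive k-mers, with ties broken by leftmost position."""
    n_kmers = len(seq) - k + 1
    if n_kmers < w:
        return []
    hashes = [kmer_hash(seq[i:i + k]) for i in range(n_kmers)]
    picked: List[Tuple[int, str]] = []
    for start in range(n_kmers - w + 1):
        best = min(range(start, start + w), key=lambda i: (hashes[i], i))
        # Adjacent windows often select the same k-mer; record it once.
        if not picked or picked[-1][0] != best:
            picked.append((best, seq[best:best + k]))
    return picked


def shared_anchors(seq_a: str, seq_b: str, k: int, w: int) -> Set[str]:
    """Minimizer k-mers present in both sequences (candidate anchors)."""
    kmers_a = {kmer for _, kmer in minimizers(seq_a, k, w)}
    kmers_b = {kmer for _, kmer in minimizers(seq_b, k, w)}
    return kmers_a & kmers_b


if __name__ == "__main__":
    a = "ACGTACGTGGCATTGACC"
    b = "TTACGTGGCATTGAGG"
    print(minimizers(a, k=3, w=4))
    print(sorted(shared_anchors(a, b, k=3, w=4)))
```

Sequences that share long stretches tend to select the same minimizers inside those stretches, which is why shared minimizers can serve as anchors when comparing genomes.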

AMY1A Example

Documentation, Usage and Examples

The API documentation is at https://sema4-research.github.io/pgr-tk/

A collection of Jupyter Notebooks is available at https://github.com/sema4-Research/pgr-tk-notebooks/

Built Binaries

Check https://github.com/Sema4-Research/pgr-tk/releases

Build

See docker/Dockerfile.build_env-20.04 for a build environment under Ubuntu 20.04. With the proper build environment, just run bash build.sh to build everything.

For example, on macOS with Docker installed, you can clone the repository and build a Linux binary inside an Ubuntu 20.04 container as follows:

  1. Build the Docker image for a build environment:

git clone --recursive git@github.com:cschin/pgr-tk.git # clone the repo
cd pgr-tk/docker
ln -s Dockerfile.build_env-20.04 Dockerfile
docker build -t pgr-tk-build .

  2. In the root directory of the pgr-tk repo, start a container from the image:

docker run -it --rm -v $PWD:/wd/pgr-tk pgr-tk-build /bin/bash

  3. Build pgr-tk inside the Docker container started from the pgr-tk-build image:

cd /wd/pgr-tk
bash build.sh

The built Python wheels will be in target/wheels and can be installed on an Ubuntu 20.04 Python 3.8 distribution. You can also install them in the pgr-tk-build image to test them out.
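
Because the wheels target a specific Python version, it can help to compare a wheel's filename tags with the interpreter you plan to install into. The sketch below is a hedged illustration, not part of pgr-tk: the helper names (parse_wheel_filename, interpreter_tag, python_tag_matches) are hypothetical, it assumes CPython, and it checks only the Python tag, whereas pip also checks the ABI and platform tags.

```python
import sys
from typing import NamedTuple, Optional, Sequence


class WheelTags(NamedTuple):
    name: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename: str) -> WheelTags:
    """Split a wheel filename into its name, version, and compatibility tags."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename}")
    parts = filename[:-len(".whl")].split("-")
    if len(parts) == 6:
        # A wheel filename may carry an optional build tag after the version.
        name, version, _build, py, abi, plat = parts
    elif len(parts) == 5:
        name, version, py, abi, plat = parts
    else:
        raise ValueError(f"unexpected wheel filename layout: {filename}")
    return WheelTags(name, version, py, abi, plat)


def interpreter_tag(version_info: Optional[Sequence[int]] = None) -> str:
    """Return a CPython-style tag such as 'cp38' (assumes CPython)."""
    vi = version_info if version_info is not None else sys.version_info
    return f"cp{vi[0]}{vi[1]}"


def python_tag_matches(filename: str,
                       version_info: Optional[Sequence[int]] = None) -> bool:
    """True if the wheel's Python tag equals the interpreter's tag."""
    return parse_wheel_filename(filename).python_tag == interpreter_tag(version_info)


if __name__ == "__main__":
    wheel = "pgrtk-0.3.4-cp38-cp38-linux_x86_64.whl"
    print(parse_wheel_filename(wheel))
    print("matches this interpreter:", python_tag_matches(wheel))
```

A mismatch here is only one possible reason for pip to reject a wheel; a matching Python tag does not guarantee that the ABI and platform tags are also accepted.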

Bioconda Release for v0.3.4 on Conda 4.14.0

If you have conda installed, you can create a conda environment to use pgr-tk v0.3.4 (Linux only):

conda create -n pgr-tk python=3.8
conda activate pgr-tk
conda install -c bioconda -c conda-forge python_abi libstdcxx-ng=12 libclang13 pgr-tk=0.3.4
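
After installing, a quick way to confirm that the package is visible to the active environment is to look up the module without importing it. This is a small sketch using only the standard library; the module name pgrtk is an assumption inferred from the wheel name pgrtk-0.3.4, and module_available is a hypothetical helper.

```python
import importlib.util


def module_available(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # "pgrtk" is the assumed import name; adjust it if your install differs.
    print("pgrtk importable:", module_available("pgrtk"))
```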

Comments
  • Compilation error

    While trying to update the version of pgr-tk distributed via Bioconda we ran into the following error:

    2022-10-30T12:11:32.0887989Z 12:11:32 BIOCONDA INFO (OUT)    Compiling cargo-xwin v0.10.2
    2022-10-30T12:11:32.2399451Z 12:11:32 BIOCONDA INFO (OUT) error[E0061]: this function takes 4 arguments but 3 arguments were supplied
    2022-10-30T12:11:32.2401411Z 12:11:32 BIOCONDA INFO (OUT)    --> /opt/conda/conda-bld/pgr-tk_1667130798268/_build_env/.cargo/registry/src/github.com-1ecc6299db9ec823/cargo-xwin-0.10.2/src/common.rs:238:22
    2022-10-30T12:11:32.2402037Z 12:11:32 BIOCONDA INFO (OUT)     |
    2022-10-30T12:11:32.2402506Z 12:11:32 BIOCONDA INFO (OUT) 238 |         let pruned = xwin::prune_pkg_list(&pkg_manifest, arches, variants)?;
    2022-10-30T12:11:32.2403808Z 12:11:32 BIOCONDA INFO (OUT)     |                      ^^^^^^^^^^^^^^^^^^^^--------------------------------- an argument of type `bool` is missing
    2022-10-30T12:11:32.2404290Z 12:11:32 BIOCONDA INFO (OUT)     |
    2022-10-30T12:11:32.2404640Z 12:11:32 BIOCONDA INFO (OUT) note: function defined here
    2022-10-30T12:11:32.2405510Z 12:11:32 BIOCONDA INFO (OUT)    --> /opt/conda/conda-bld/pgr-tk_1667130798268/_build_env/.cargo/registry/src/github.com-1ecc6299db9ec823/xwin-0.2.9/src/lib.rs:263:8
    2022-10-30T12:11:32.2406065Z 12:11:32 BIOCONDA INFO (OUT)     |
    2022-10-30T12:11:32.2406473Z 12:11:32 BIOCONDA INFO (OUT) 263 | pub fn prune_pkg_list(
    2022-10-30T12:11:32.2406851Z 12:11:32 BIOCONDA INFO (OUT)     |        ^^^^^^^^^^^^^^
    2022-10-30T12:11:32.2407221Z 12:11:32 BIOCONDA INFO (OUT) help: provide the argument
    2022-10-30T12:11:32.2407682Z 12:11:32 BIOCONDA INFO (OUT)     |
    2022-10-30T12:11:32.2408177Z 12:11:32 BIOCONDA INFO (OUT) 238 |         let pruned = xwin::prune_pkg_list(&pkg_manifest, arches, variants, /* bool */)?;
    2022-10-30T12:11:32.2408797Z 12:11:32 BIOCONDA INFO (OUT)     |                      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    2022-10-30T12:11:32.2409214Z 12:11:32 BIOCONDA INFO (OUT) 
    2022-10-30T12:11:32.3999345Z 12:11:32 BIOCONDA INFO (OUT) error[E0308]: mismatched types
    2022-10-30T12:11:32.4000944Z 12:11:32 BIOCONDA INFO (OUT)    --> /opt/conda/conda-bld/pgr-tk_1667130798268/_build_env/.cargo/registry/src/github.com-1ecc6299db9ec823/cargo-xwin-0.10.2/src/common.rs:280:47
    2022-10-30T12:11:32.4001552Z 12:11:32 BIOCONDA INFO (OUT)     |
    2022-10-30T12:11:32.4002057Z 12:11:32 BIOCONDA INFO (OUT) 280 |                 ProgressBar::with_draw_target(0, draw_target.into()).with_prefix(prefix).with_style(
    2022-10-30T12:11:32.4002914Z 12:11:32 BIOCONDA INFO (OUT)     |                 ----------------------------- ^ expected enum `Option`, found integer
    2022-10-30T12:11:32.4003375Z 12:11:32 BIOCONDA INFO (OUT)     |                 |
    2022-10-30T12:11:32.4003791Z 12:11:32 BIOCONDA INFO (OUT)     |                 arguments to this function are incorrect
    2022-10-30T12:11:32.4004185Z 12:11:32 BIOCONDA INFO (OUT)     |
    2022-10-30T12:11:32.4004877Z 12:11:32 BIOCONDA INFO (OUT)     = note: expected enum `Option<u64>`
    2022-10-30T12:11:32.4005278Z 12:11:32 BIOCONDA INFO (OUT)                found type `{integer}`
    2022-10-30T12:11:32.4005705Z 12:11:32 BIOCONDA INFO (OUT) note: associated function defined here
    2022-10-30T12:11:32.4006643Z 12:11:32 BIOCONDA INFO (OUT)    --> /opt/conda/conda-bld/pgr-tk_1667130798268/_build_env/.cargo/registry/src/github.com-1ecc6299db9ec823/indicatif-0.17.1/src/progress_bar.rs:52:12
    2022-10-30T12:11:32.4007219Z 12:11:32 BIOCONDA INFO (OUT)     |
    2022-10-30T12:11:32.4007923Z 12:11:32 BIOCONDA INFO (OUT) 52  |     pub fn with_draw_target(len: Option<u64>, draw_target: ProgressDrawTarget) -> ProgressBar {
    2022-10-30T12:11:32.4008426Z 12:11:32 BIOCONDA INFO (OUT)     |            ^^^^^^^^^^^^^^^^
    2022-10-30T12:11:32.4008820Z 12:11:32 BIOCONDA INFO (OUT) help: try wrapping the expression in `Some`
    2022-10-30T12:11:32.4009206Z 12:11:32 BIOCONDA INFO (OUT)     |
    2022-10-30T12:11:32.4009733Z 12:11:32 BIOCONDA INFO (OUT) 280 |                 ProgressBar::with_draw_target(Some(0), draw_target.into()).with_prefix(prefix).with_style(
    2022-10-30T12:11:32.4010273Z 12:11:32 BIOCONDA INFO (OUT)     |                                               +++++ +
    2022-10-30T12:11:32.4010625Z 12:11:32 BIOCONDA INFO (OUT) 
    2022-10-30T12:11:32.5079552Z 12:11:32 BIOCONDA INFO (OUT) error[E0308]: mismatched types
    2022-10-30T12:11:32.5081136Z 12:11:32 BIOCONDA INFO (OUT)    --> /opt/conda/conda-bld/pgr-tk_1667130798268/_build_env/.cargo/registry/src/github.com-1ecc6299db9ec823/cargo-xwin-0.10.2/src/common.rs:319:57
    2022-10-30T12:11:32.5081718Z 12:11:32 BIOCONDA INFO (OUT)     |
    2022-10-30T12:11:32.5082195Z 12:11:32 BIOCONDA INFO (OUT) 319 |         let manifest_pb = ProgressBar::with_draw_target(0, dt.into())
    2022-10-30T12:11:32.5083310Z 12:11:32 BIOCONDA INFO (OUT)     |                           ----------------------------- ^ expected enum `Option`, found integer
    2022-10-30T12:11:32.5083763Z 12:11:32 BIOCONDA INFO (OUT)     |                           |
    2022-10-30T12:11:32.5084188Z 12:11:32 BIOCONDA INFO (OUT)     |                           arguments to this function are incorrect
    2022-10-30T12:11:32.5084585Z 12:11:32 BIOCONDA INFO (OUT)     |
    2022-10-30T12:11:32.5084944Z 12:11:32 BIOCONDA INFO (OUT)     = note: expected enum `Option<u64>`
    2022-10-30T12:11:32.5085362Z 12:11:32 BIOCONDA INFO (OUT)                found type `{integer}`
    2022-10-30T12:11:32.5085790Z 12:11:32 BIOCONDA INFO (OUT) note: associated function defined here
    2022-10-30T12:11:32.5086695Z 12:11:32 BIOCONDA INFO (OUT)    --> /opt/conda/conda-bld/pgr-tk_1667130798268/_build_env/.cargo/registry/src/github.com-1ecc6299db9ec823/indicatif-0.17.1/src/progress_bar.rs:52:12
    2022-10-30T12:11:32.5087288Z 12:11:32 BIOCONDA INFO (OUT)     |
    2022-10-30T12:11:32.5087992Z 12:11:32 BIOCONDA INFO (OUT) 52  |     pub fn with_draw_target(len: Option<u64>, draw_target: ProgressDrawTarget) -> ProgressBar {
    2022-10-30T12:11:32.5088479Z 12:11:32 BIOCONDA INFO (OUT)     |            ^^^^^^^^^^^^^^^^
    2022-10-30T12:11:32.5088883Z 12:11:32 BIOCONDA INFO (OUT) help: try wrapping the expression in `Some`
    2022-10-30T12:11:32.5089269Z 12:11:32 BIOCONDA INFO (OUT)     |
    2022-10-30T12:11:32.5089735Z 12:11:32 BIOCONDA INFO (OUT) 319 |         let manifest_pb = ProgressBar::with_draw_target(Some(0), dt.into())
    2022-10-30T12:11:32.5090249Z 12:11:32 BIOCONDA INFO (OUT)     |                                                         +++++ +
    

    I imagine that there was an API change in one of the dependencies.

    opened by dpryan79 3
  • 0.4.0-dev

    Some major changes

    1. New frg storage backend support. This can replace the agc format for long-read analysis
    2. AGC read optimization (supports a prefetching mode and lets you specify the number of parallel readers when building the midx file)
    3. Replace libwfa with WFA2-lib
    opened by cschin 0
  • pgrtk-0.3.4-cp38-cp38-linux_x86_64.whl is not a supported wheel on this platform

    Hello,

    Can you please list which Linux distributions the wheels support? I have tried both Ubuntu 20.04 and CentOS 7, and both give the error "not a supported wheel on this platform" when I try to pip install.

    Thank you!

    opened by wharvey31 2
Releases(v0.3.6)