Extracting archives made easy for Rust 🦀

Overview

Decompress

A library that supports decompression of archives in multiple formats, inspired by ergonomics from Node's decompress.

  • Includes a default stack of decompressors supporting: zip, tar, tar.gz, tar.bz2, tar.xz, tar.zst (zstd compression), ar (Unix Archive)
  • Build your own decompressors and add them
  • Compose a custom stack (exclude decompressors, respond to different file extensions)
  • Use cargo features to avoid compiling formats you don't need

Dependency

[dependencies]
decompress = "0.1.0"

Usage

Default use:

decompress::decompress(archive, to, &ExtractOpts::default());

Strip the first component of all paths in the archive (for when you have a wrapper folder you don't need):

decompress::decompress(archive, to, &ExtractOpts{ strip: 1 });
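
To make the effect of `strip` concrete, here is a minimal std-only sketch of what dropping leading path components means. `strip_components` is a hypothetical helper written for illustration and is not part of the decompress API; how the library treats entries with fewer components than `strip` is an assumption here (they are skipped).

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: drops the first `n` components of an archive entry path.
/// Returns `None` when nothing is left after stripping (assumed behavior).
fn strip_components(entry: &Path, n: usize) -> Option<PathBuf> {
    let stripped: PathBuf = entry.components().skip(n).collect();
    if stripped.as_os_str().is_empty() {
        None
    } else {
        Some(stripped)
    }
}
```

With `n = 1`, an entry such as `wrapper/src/main.rs` would land at `src/main.rs` under the target folder.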

A micro-optimization: build the decompressor once and reuse it across calls:

let decompressor = decompress::Decompress::default();
// use decompressor
// decompressor.decompress(...)

Build your own stack:

use regex::Regex;
let decompressor = decompress::Decompress::build(vec![decompressors::zip::Zip::build(Some(
    Regex::new(r".*").unwrap(),
))]);
// use decompressor
// decompressor.decompress(...)

Copyright

Copyright (c) 2022 @jondot. See LICENSE for further details.

Comments
  • remove `regex` completely, or add a feature for opt-out of `regex`

    On the face of it, there's nothing special regex is doing when detecting file type based on extension. It can be easily replaced with checking the end of the string, or parsing out the path using std only.

    See this feedback: https://github.com/cargo-bins/cargo-binstall/issues/617#issuecomment-1357556443
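
As a rough illustration of the std-only alternative described in this issue, the sketch below matches a file name against a list of suffixes without `regex`. The function name `has_extension` and the case-insensitive comparison are assumptions made for the example, not existing crate behavior.

```rust
/// Hypothetical std-only matcher: true when `file_name` ends with one of
/// `extensions` (compared case-insensitively), e.g. ".tar.gz" or ".tgz".
fn has_extension(file_name: &str, extensions: &[&str]) -> bool {
    let lower = file_name.to_ascii_lowercase();
    extensions
        .iter()
        .any(|ext| lower.ends_with(ext.to_ascii_lowercase().as_str()))
}
```

Plain suffix checks cover multi-part extensions such as `.tar.gz`, which `Path::extension` alone would report only as `gz`.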

    enhancement 
    opened by jondot 1
  • Add `filter` and `map` to file paths

    Suggestion / Feature Request

    Just like in the npm version of decompress: https://github.com/kevva/decompress#filter

    The idea is to let users:

    • filter out items from decompression, in archives that support listing of entries (filter)
    • rename destination paths or file names (map)

    There are two primary challenges:

    1. Accepting such a function in a clean, easy way through Opts. It can be an Fn signature, or a trait called FilterMap that implements an identity filter and map by default and is passed as a Box<dyn ..>
    2. Understanding which compressor can support such operations and which cannot (don't fight with those which cannot, just don't implement for those)
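
One possible shape for the trait-based option in point 1, offered as a sketch only. The trait `FilterMap`, the `Opts` struct, and the `plan` function are hypothetical names for illustration; nothing here exists in the crate yet, and no real extraction is performed.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical trait from the proposal: the default methods form an identity
/// filter (keep everything) and an identity map (leave paths unchanged).
trait FilterMap {
    fn filter(&self, _entry: &Path) -> bool {
        true
    }
    fn map(&self, entry: &Path) -> PathBuf {
        entry.to_path_buf()
    }
}

/// Identity implementation, usable as the default in options.
struct Identity;
impl FilterMap for Identity {}

/// Example implementation: skip hidden files and place everything under "out/".
struct SkipHiddenIntoOut;
impl FilterMap for SkipHiddenIntoOut {
    fn filter(&self, entry: &Path) -> bool {
        !entry
            .file_name()
            .and_then(|n| n.to_str())
            .map_or(false, |n| n.starts_with('.'))
    }
    fn map(&self, entry: &Path) -> PathBuf {
        Path::new("out").join(entry)
    }
}

/// Hypothetical options struct carrying the boxed trait object.
struct Opts {
    filter_map: Box<dyn FilterMap>,
}

/// Applies filter, then map, to a list of entry paths.
fn plan(entries: &[&str], opts: &Opts) -> Vec<PathBuf> {
    entries
        .iter()
        .map(Path::new)
        .filter(|p| opts.filter_map.filter(p))
        .map(|p| opts.filter_map.map(p))
        .collect()
}
```

A decompressor that cannot list entries would simply never call `filter` or `map`, which matches point 2.
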
    enhancement good first issue 
    opened by jondot 0
  • Add a `bz2`, `xz`, `zstd` standalone decompressors

    Suggestion / Feature Request

    Add the following decompressors:

    • [x] bz2
    • [ ] xz
    • [ ] zstd

    These are standalone formats: each compresses only a single file and has no concept of file listing or "archiving" like tar.

    Similar to the recently added gz decompressor: https://github.com/rusty-ferris-club/decompress/commit/651effa9c22ab65e742c936089b04c54c5164d0d

    Important:

    • feature flags should match the decompressor and its dependencies
    • the decompressor should come after the tar.<..> decompressor in the decompressor array stack, so that it only runs when the file is not a tar.<something> archive.
    • in test suite - create a demo archive and show that decompression works
    • it's OK to copy the gz decompressor to start with
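
The ordering requirement can be seen in a small sketch. It assumes the stack is scanned in order and the first matching decompressor wins, which is what the note above implies; `Entry`, `pick`, and `default_stack` are hypothetical names and do not mirror the crate's real types.

```rust
/// Hypothetical stand-in for a decompressor: a name and the suffix it accepts.
struct Entry {
    name: &'static str,
    suffix: &'static str,
}

/// Returns the name of the first entry in `stack` whose suffix matches `file_name`.
fn pick<'a>(stack: &'a [Entry], file_name: &str) -> Option<&'a str> {
    stack
        .iter()
        .find(|e| file_name.ends_with(e.suffix))
        .map(|e| e.name)
}

/// Archive formats first, standalone formats after.
fn default_stack() -> Vec<Entry> {
    vec![
        Entry { name: "tarbz", suffix: ".tar.bz2" },
        Entry { name: "bz2", suffix: ".bz2" },
    ]
}
```

If the two entries were swapped, `data.tar.bz2` would match the standalone `bz2` entry first and only the outer compression layer would be removed.
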
    enhancement good first issue 
    opened by jondot 0
  • Nodejs native bindings with `neon`

    Suggestion / Feature Request

    Use neon to wrap the main decompress API for Node.js. This does not require any knowledge of decompress other than its public API, and the API lends itself to FFI very well.

    • Build an ergonomic Node API (play with Neon, provide a suggestion first)
    • CI build for multi-platform binaries for the binding (see how other projects do it)
    enhancement good first issue 
    opened by jondot 0
  • add a `can_decompress` checker on public API

    Suggestion / Feature Request

    When going over a massive list of files, it may be preferable to run a cheap check for whether a file can be decompressed before attempting it. While there's no difference in performance, the API might be useful to some users for making decisions in other parts of their program logic.
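
A minimal sketch of how such a checker could look, assuming each decompressor exposes a path-based test. The trait `Decompressor`, its `test` method, and the `Stack` type are hypothetical and written only to illustrate the request.

```rust
use std::path::Path;

/// Hypothetical trait: each decompressor can say whether it accepts a path.
trait Decompressor {
    fn test(&self, archive: &Path) -> bool;
}

/// Toy decompressor that accepts paths ending in a fixed suffix.
struct BySuffix(&'static str);
impl Decompressor for BySuffix {
    fn test(&self, archive: &Path) -> bool {
        archive.to_str().map_or(false, |s| s.ends_with(self.0))
    }
}

struct Stack {
    decompressors: Vec<Box<dyn Decompressor>>,
}

impl Stack {
    /// Builds a stack from a list of suffixes (illustration only).
    fn from_suffixes(suffixes: &[&'static str]) -> Self {
        let decompressors = suffixes
            .iter()
            .map(|s| Box::new(BySuffix(*s)) as Box<dyn Decompressor>)
            .collect();
        Stack { decompressors }
    }

    /// Cheap pre-check: no file is opened, only the path is inspected.
    fn can_decompress(&self, archive: &Path) -> bool {
        self.decompressors.iter().any(|d| d.test(archive))
    }
}
```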

    enhancement good first issue 
    opened by jondot 0
  • add an option to auto detect decompressor from magic bytes

    Suggestion / Feature Request

    Add an option to detect the decompressor from magic bytes, instead of extensions.

    This means every decompressor should now have an additional test_magic which applies magic number detection (for relevant archive types) and returns whether it's compatible for decompression.
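
For illustration, magic number detection over the first bytes of a file could look like the sketch below. The `Format` enum and `detect_magic` function are hypothetical names; the byte signatures are the standard leading bytes of each format. Plain tar is left out because its `ustar` marker sits at offset 257 rather than at the start of the file.

```rust
/// Formats recognizable from their leading bytes.
#[derive(Debug, PartialEq)]
enum Format {
    Gzip,
    Bzip2,
    Xz,
    Zstd,
    Zip,
    Ar,
}

/// Returns the format whose magic number prefixes `header`, if any.
fn detect_magic(header: &[u8]) -> Option<Format> {
    if header.starts_with(&[0x1F, 0x8B]) {
        Some(Format::Gzip)
    } else if header.starts_with(b"BZh") {
        Some(Format::Bzip2)
    } else if header.starts_with(&[0xFD, b'7', b'z', b'X', b'Z', 0x00]) {
        Some(Format::Xz)
    } else if header.starts_with(&[0x28, 0xB5, 0x2F, 0xFD]) {
        Some(Format::Zstd)
    } else if header.starts_with(b"PK\x03\x04") {
        // Local file header; an empty zip starts with "PK\x05\x06" instead.
        Some(Format::Zip)
    } else if header.starts_with(b"!<arch>\n") {
        Some(Format::Ar)
    } else {
        None
    }
}
```

A compressed tarball is reported by its outer layer (for example `Gzip` for a tar.gz), so telling tar.gz apart from a standalone gz would still need the extension or a look inside the stream.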

    After this is done we will have two strategies:

    • by extension
    • by magic bytes

    The public API should allow choosing a strategy explicitly, or ordering the strategies as "magic, fallback to extension" or "extension, fallback to magic". This may involve a small refactor of how decompressors are tested for validity.

    In addition, including magic bytes now opens up the question of "what to do if no decompressors are found?". For example, it's completely viable for no magic bytes to exist, which may mean this is an uncompressed file. If this is an uncompressed file, should we copy it to the target folder? Some users would prefer that over "no decompressor found". This kind of result type ("no decompressor", "copied", etc.) should be modeled and refactored as well.

    A clever trick is to add a "PassthroughDecompressor" to the stack, which copies files when no decompressor is found.
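
The strategy ordering and the modeled result type could be sketched as follows. Everything here is hypothetical: the `Strategy` and `Outcome` enums, the toy `by_extension` and `by_magic` lookups, and the `passthrough` flag standing in for a PassthroughDecompressor at the end of the stack.

```rust
/// Hypothetical result type covering the cases discussed above.
#[derive(Debug, PartialEq)]
enum Outcome {
    Decompressed(&'static str),
    Copied,
    NoDecompressor,
}

/// Hypothetical detection strategies, including the two fallback orders.
enum Strategy {
    Extension,
    Magic,
    MagicThenExtension,
    ExtensionThenMagic,
}

/// Toy extension lookup (two formats only, for brevity).
fn by_extension(name: &str) -> Option<&'static str> {
    if name.ends_with(".gz") {
        Some("gz")
    } else if name.ends_with(".zip") {
        Some("zip")
    } else {
        None
    }
}

/// Toy magic-byte lookup (two formats only, for brevity).
fn by_magic(header: &[u8]) -> Option<&'static str> {
    if header.starts_with(&[0x1F, 0x8B]) {
        Some("gz")
    } else if header.starts_with(b"PK\x03\x04") {
        Some("zip")
    } else {
        None
    }
}

/// Picks a decompressor by strategy; falls back to copying when `passthrough` is set.
fn resolve(strategy: &Strategy, name: &str, header: &[u8], passthrough: bool) -> Outcome {
    let found = match strategy {
        Strategy::Extension => by_extension(name),
        Strategy::Magic => by_magic(header),
        Strategy::MagicThenExtension => by_magic(header).or_else(|| by_extension(name)),
        Strategy::ExtensionThenMagic => by_extension(name).or_else(|| by_magic(header)),
    };
    match found {
        Some(d) => Outcome::Decompressed(d),
        None if passthrough => Outcome::Copied,
        None => Outcome::NoDecompressor,
    }
}
```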

    enhancement good first issue 
    opened by jondot 0
  • Refactoring: use `anyhow` for all internal error handling and propagation, but use a final `Error` for API

    Suggestion / Feature Request

    The various decompressors use a "central" error enum, which can convert from named and specific error types. Although this is the recommended and accepted best practice, it makes error handling overly verbose.

    anyhow chains and wraps errors, and encourages adding context, which is something we need here. The only drawback of using anyhow in libraries is that we cannot expose any anyhow type in the public API.

    This is why we suggest a refactoring: use anyhow for all internal error handling and propagation, but convert it to a final Error for the public API.

    Notes:

    • get familiar with anyhow first, and see how context helps with "attaching" error information
    • convert all internal Result types to anyhow::Result; this is supposed to be easy
    • locate every public API function and have it return a std::result::Result. This means converting from anyhow::Result to a standard one; the existing thiserror dependency can be used for that
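
The boundary pattern can be sketched with the standard library alone. In this sketch a boxed error stands in for `anyhow::Error` and the public `Error` enum is written by hand where `thiserror` would derive it; the function names and the toy "entry count" task are hypothetical and only show where the conversion happens.

```rust
use std::error::Error as StdError;
use std::fmt;

/// Internal result type: a boxed error stands in for `anyhow::Error` here.
type Internal<T> = Result<T, Box<dyn StdError + Send + Sync>>;

/// Public error type exposed by the API (hand-written in this sketch).
#[derive(Debug)]
pub enum Error {
    Decompress { archive: String, reason: String },
}

impl fmt::Display for Error {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Error::Decompress { archive, reason } => {
                write!(f, "could not decompress {archive}: {reason}")
            }
        }
    }
}

impl StdError for Error {}

/// Internal helper: propagates with `?` and attaches context to the message.
fn read_entry_count(header: &str) -> Internal<usize> {
    let count = header
        .trim()
        .parse::<usize>()
        .map_err(|e| format!("reading entry count from {header:?}: {e}"))?;
    Ok(count)
}

/// Public API boundary: the internal error is converted into the public `Error`.
pub fn entry_count(archive: &str, header: &str) -> Result<usize, Error> {
    read_entry_count(header).map_err(|e| Error::Decompress {
        archive: archive.to_string(),
        reason: e.to_string(),
    })
}
```

Internal code stays terse, and callers only ever see the crate's own `Error` type.
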
    enhancement good first issue 
    opened by jondot 0
Owner
Rusty Ferris Club
A builders club for people who love building stuff in Rust. We're also helping and teaching each other.