Command-line tools for coprolite research (paleontology and archaeology): estimate the producer's body mass from coprolite diameter using regression models

OVERVIEW OF COPROSIZE

coprosize employs power, exponential and cubic regression models to estimate the producer's body mass from coprolite diameter. Models can be chosen according to the supposed producer's taxon (at this stage of program development, only tetrapod models are implemented) and its diet type (carnivorous, herbivorous, omnivorous, unspecified). The implemented regression formulae are provided in Supplement 1. The regression models are constructed from the data given in Supplement 2 (scat diameters and body masses).
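
To make the three model families concrete, here is a minimal Rust sketch of their general shapes. The function names and parameters are illustrative assumptions, the cubic model is assumed to be a third-degree polynomial, and the actual fitted coefficients are the ones listed in Supplement 1, not anything shown here.

```rust
/// Power model: mass = a * diameter^b.
/// `a` and `b` are placeholder coefficients, not values from Supplement 1.
fn power_model(diameter: f64, a: f64, b: f64) -> f64 {
    a * diameter.powf(b)
}

/// Exponential model: mass = a * e^(b * diameter).
/// `a` and `b` are placeholder coefficients.
fn exponential_model(diameter: f64, a: f64, b: f64) -> f64 {
    a * (b * diameter).exp()
}

/// Cubic model: mass = c[0] + c[1]*d + c[2]*d^2 + c[3]*d^3,
/// evaluated with Horner's scheme. `c` holds placeholder coefficients.
fn cubic_model(diameter: f64, c: [f64; 4]) -> f64 {
    c[0] + diameter * (c[1] + diameter * (c[2] + diameter * c[3]))
}
```

Each function takes the coprolite diameter and returns the estimated body mass in whatever units the coefficients were fitted for.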

As it is intended for scientific use, coprosize is written in Rust. The reasons for this choice are (1) the strong code correctness guarantees offered by Rust, (2) to ensure that each program version will be accessible in the registry 'in perpetuity' and (3) that each program version will remain easy to install and cross-platform 'in perpetuity', thanks to Rust's strict policy of backwards compatibility.

[keywords (en-AU): animal scats, archaeology, biology, body mass, body size, coprolites, dung, fossil faeces, fossils, geology, ichnology, palaeontology, scientific computing, weight; (en-US): fossil feces, paleontology]

USAGE

[Static link to a changeable image of the most recent version of coprosize! This may include pre-releases!]

help-image

EXAMPLES

[Static links to changeable images of the most recent version of coprosize! This may include pre-releases!]

example-image-1

example-image-2

CITATION AND REUSE

Please always refer to a specific program version: the implemented formulae are subject to change when new data become available (or are newly studied by the author) or when bugs of any kind are detected. Although coprosize is designed with the needs of a user in mind, you are welcome to use my models in your study without actually installing the program, as long as you cite it as the original source:

Bajdek, P., 2022. coprosize (version 1.0.0). [computer software] https://github.com/piotrbajdek/coprosize

You are also welcome to modify and fork coprosize under the terms of the MIT license. It is moreover possible to use coprosize as a library dependency in other bioinformatics projects (see its public functions). The usage is best explained by example:

Add to your Cargo.toml file:

[dependencies]
coprosize = "1.0.0"

Put in your src/main.rs file:

extern crate coprosize;

fn main() {
    let diameter = "9"; // diameter as &str
    coprosize::herbivorous_rodentia(diameter);
    println!();
    coprosize::omnivorous_rodentia(diameter);
    println!();
    println!("Source of the above models:");
    println!();
    coprosize::citation();
}

The above program reuses the internal library of coprosize. Note that 'diameter' must be given as a string slice (digits, optionally including a dot); the library accepts neither an integer nor a floating-point number.
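
If your own program holds the diameter as a number, a small conversion step is needed before calling the library. The sketch below shows one possible way to do that; both helper names are hypothetical and not part of coprosize, and whether the library accepts edge cases such as a leading or trailing dot is not checked here.

```rust
/// Hypothetical helper (not part of coprosize): checks that `s` contains only
/// ASCII digits and at most one dot, with at least one digit.
fn is_plain_decimal(s: &str) -> bool {
    let dots = s.chars().filter(|&c| c == '.').count();
    let digits = s.chars().filter(|c| c.is_ascii_digit()).count();
    digits > 0 && dots <= 1 && digits + dots == s.chars().count()
}

/// Hypothetical helper: converts a measured value into the string form
/// described above. Returns None when the text is not a plain decimal,
/// for example for negative, NaN or infinite values.
fn diameter_arg(value: f32) -> Option<String> {
    let text = value.to_string();
    if is_plain_decimal(&text) {
        Some(text)
    } else {
        None
    }
}
```

The resulting String can then be borrowed as a string slice, for example by passing `&text` where the example above passes `diameter`.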

INSTALLATION ON LINUX

coprosize should run smoothly on Windows and macOS, where it can be installed using cargo. However, it is developed and tested on Fedora Linux.

METHOD 1

1. Install from crates.io using cargo:

cargo install coprosize

By default, the binary is installed to ~/.cargo/bin/, a hidden folder in your home directory.

2a. For convenience, you will probably want to copy coprosize to /usr/bin/ as in Method 2 (3a, 3b).

2b. Alternatively, add the ~/.cargo/bin directory to your PATH variable (see the documentation of your shell).

METHOD 2

1. Download the binary 'coprosize' for Linux x86_64 from GitHub:

https://github.com/piotrbajdek/coprosize/releases/tag/v1.0.0

2. Make the file executable:

sudo chmod +x ./coprosize

3a. Install coprosize by copying the binary to /usr/bin/:

sudo cp coprosize /usr/bin/

3b. On Fedora Silverblue / Kinoite:

sudo cp coprosize /var/usrlocal/bin/

METHOD 3

Download the coprosize source from GitHub:

https://github.com/piotrbajdek/coprosize/releases/tag/v1.0.0

Then, build and install the program:

cargo build --release && sudo cp target/release/coprosize /usr/bin/

COPROSIZE CRATE ON CRATES.IO

The Rust community’s crate registry

https://crates.io/crates/coprosize

Comments
  • Improved codebase by deduplicating logic among other things

    I went through the code base and saw there was a lot of code duplication, particularly with argument handling and colors, so I went ahead and fixed it. A lot has changed, so I'll give a brief rundown of what's new.

    Colors

    Every function in this project had something along the lines of this copy pasted at the top

       let clear = "\x1b[0m"; // reset the font color
       let red = "\x1b[31m";
       let bright_yellow = "\x1b[93m";
       let cyan = "\x1b[36m";
    

    which is a massive amount of code duplication given the number of functions in this project. I opted to create a colors module that would handle all text coloring with the colorize() function, including resetting it to the normal color. It would likely make more sense to take advantage of an existing crate such as termcolor or colored for terminal colors instead, but since this crate was made using 0 dependencies I opted to not use them.
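
For illustration, a dependency-free colors module with a colorize() function could look roughly like the sketch below. This is not the code from the patch: the Color enum, its variants and the exact signature of colorize() are assumptions, and only the escape sequences are taken from the snippet above.

```rust
mod colors {
    /// Escape sequence that resets the font color.
    pub const RESET: &str = "\x1b[0m";

    /// Hypothetical set of colors, matching the escape codes quoted above.
    #[derive(Clone, Copy)]
    pub enum Color {
        Red,
        BrightYellow,
        Cyan,
    }

    impl Color {
        /// ANSI escape sequence that switches the terminal to this color.
        fn code(self) -> &'static str {
            match self {
                Color::Red => "\x1b[31m",
                Color::BrightYellow => "\x1b[93m",
                Color::Cyan => "\x1b[36m",
            }
        }
    }

    /// Wraps `text` in the color's escape code and resets the color afterwards.
    pub fn colorize(text: &str, color: Color) -> String {
        format!("{}{}{}", color.code(), text, RESET)
    }
}
```

With a helper of this kind, each function only needs a call such as colors::colorize("text", colors::Color::Red) instead of its own copy of the escape codes.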

    Command line arguments part 1

    I next started handling command line arguments. This is also an area where using an existing crate designed for this like clap or structopt would have helped a lot, but as mentioned before I opted not to use them in this case. I began with the "Arguments anywhere" part, where I saw the "changes" section with a giant unreadable line of what has been implemented. I extracted the contents into an array and used a for loop to print and format them all easily. If you do add more in the future, I'd recommend making a second array such as all_implemented_v2 and copying the for loop to print them all.

    I then saw the hand-formatted "help" message and decided to do something similar to what I did with the "changes" section. In retrospect the hand-formatted version wasn't that bad, but this new version makes it easier to see and add things to the lists without worrying about the formatting, as it is taken care of for you.

    Command line arguments part 2

    I then moved to the actual meat of the program: parsing the diet, taxon, diameter, and potential "subgroups" arguments and calling the right functions. I created 3 new types for this, a Diet and Taxon enum and an Args struct to parse and hold the arguments we care about. The Args struct handles diet and taxon being in either order, as well as parsing whether we have a subgroups flag and the diameter. And since we have these enums, we can simply match on the (Diet, Taxon) instead of having all these big if input1 == ... && input 2 == ... || ..... for every implemented algorithm. And since we are parsing diameter and the subgroups flag beforehand, we can remove all the copy-pasted logic doing that as well.
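
The idea of parsing into enums and matching on a (Diet, Taxon) pair can be sketched as follows. This is an illustration under stated assumptions, not the patch itself: the accepted words ("herbivorous", "rodentia" and so on), the set of Taxon variants and the model_name() helper are hypothetical, the subgroups flag is left out, and the match returns the name of a library function instead of calling it so that the sketch stays self-contained.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Diet { Carnivorous, Herbivorous, Omnivorous, Unspecified }

#[derive(Debug, Clone, Copy, PartialEq)]
enum Taxon { Tetrapoda, Rodentia }

/// Hypothetical spellings; the real program may accept different words.
fn parse_diet(word: &str) -> Option<Diet> {
    match word {
        "carnivorous" => Some(Diet::Carnivorous),
        "herbivorous" => Some(Diet::Herbivorous),
        "omnivorous" => Some(Diet::Omnivorous),
        "unspecified" => Some(Diet::Unspecified),
        _ => None,
    }
}

fn parse_taxon(word: &str) -> Option<Taxon> {
    match word {
        "tetrapoda" => Some(Taxon::Tetrapoda),
        "rodentia" => Some(Taxon::Rodentia),
        _ => None,
    }
}

#[derive(Debug, PartialEq)]
struct Args { diet: Diet, taxon: Taxon, diameter: f32 }

impl Args {
    /// Accepts diet, taxon and diameter in any order; returns None if a
    /// word is not recognised or one of the three values is missing.
    fn parse(words: &[&str]) -> Option<Args> {
        let (mut diet, mut taxon, mut diameter) = (None, None, None);
        for word in words {
            if let Some(d) = parse_diet(word) {
                diet = Some(d);
            } else if let Some(t) = parse_taxon(word) {
                taxon = Some(t);
            } else if let Ok(x) = word.parse::<f32>() {
                diameter = Some(x);
            } else {
                return None;
            }
        }
        Some(Args { diet: diet?, taxon: taxon?, diameter: diameter? })
    }
}

/// One match arm per implemented model replaces the long if/else chains.
fn model_name(diet: Diet, taxon: Taxon) -> Option<&'static str> {
    match (diet, taxon) {
        (Diet::Herbivorous, Taxon::Rodentia) => Some("herbivorous_rodentia"),
        (Diet::Omnivorous, Taxon::Rodentia) => Some("omnivorous_rodentia"),
        _ => None, // combination not covered by this sketch
    }
}
```

Adding a new model then means adding one enum variant (if needed) and one match arm, rather than another chain of string comparisons.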

    Errors

    With diameter being parsed to a f32 from the get go, we can change all the functions in lib.rs to accept it as a f32 instead of as a &str for no reason, which means all of those functions can't panic any more. Speaking of panics, I have changed the code to print nice error messages instead of an ugly panic message when basic expected errors occur, such as being unable to parse diameter or being given an unfamiliar diet/taxon. The messages can be changed to your liking by editing the Display impl for the Error type I made in main.rs.
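
A rough sketch of this error-handling pattern is given below. The variant names, the message wording and the parse_diameter() helper are assumptions made for illustration; the actual Error type in main.rs may be organised differently.

```rust
use std::fmt;

/// Hypothetical error type covering the two "expected" failures named above.
#[derive(Debug, PartialEq)]
enum Error {
    InvalidDiameter(String),
    UnknownDietOrTaxon(String),
}

impl fmt::Display for Error {
    /// Edit the strings here to change the user-facing messages.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Error::InvalidDiameter(input) => {
                write!(f, "could not parse diameter '{}' as a number", input)
            }
            Error::UnknownDietOrTaxon(input) => {
                write!(f, "unknown diet or taxon '{}'", input)
            }
        }
    }
}

impl std::error::Error for Error {}

/// Parses the diameter once, up front, so that model functions can take an
/// f32 and have no reason to panic on malformed input.
fn parse_diameter(input: &str) -> Result<f32, Error> {
    input
        .trim()
        .parse::<f32>()
        .map_err(|_| Error::InvalidDiameter(input.to_string()))
}
```

A caller can print the Display text of the error and exit with a non-zero status instead of letting the program panic.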

    Other

    There are some other small changes I made along the way, such as adding a superscript_number(f32) -> String function instead of doing the big .replace(...) chain you had in your original code. This version of the code should function identically for all inputs that were valid in the original code as well. I also ran cargo fmt on the code base to standardize the formatting, which unfortunately may have messed up some of the hand-formatted alignment. This also changed the indentation to 4 spaces instead of 3. If you prefer 3 spaces, I can change the rustfmt.toml for this project to use tab_spaces = 3.
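
One way a superscript_number(f32) -> String function could replace a chain of .replace(...) calls is a single character-by-character mapping, as sketched below. The choice of output characters, in particular the middle dot for the decimal point and the superscript minus, is an assumption and may differ from what the patch and the original code produce.

```rust
/// Sketch: formats `n` with its default Display output and maps each
/// character to a Unicode superscript equivalent.
fn superscript_number(n: f32) -> String {
    n.to_string()
        .chars()
        .map(|c| match c {
            '0' => '⁰',
            '1' => '¹',
            '2' => '²',
            '3' => '³',
            '4' => '⁴',
            '5' => '⁵',
            '6' => '⁶',
            '7' => '⁷',
            '8' => '⁸',
            '9' => '⁹',
            '-' => '⁻',
            '.' => '·', // assumption: middle dot stands in for a raised point
            other => other, // leave anything else (e.g. "NaN") untouched
        })
        .collect()
}
```

A single pass over the characters keeps all the mappings in one place, so adding or changing a symbol is a one-line edit.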

    There are still some improvements that could be made to this code base, but this patch brings it to a much better place than it was before and should be Good Enough™ for now.

    opened by MarkDDR 2
Owner
Piotr Bajdek