bottom encodes UTF-8 text into a sequence comprised of bottom emoji

Overview

bottom

Perhaps you have found yourself in this situation before:

,,,,,,,,,

The divine bottom CLI exists to alleviate this pain. No longer will you struggle when communicating with the average Discord user.

But what does it actually do?

bottom encodes UTF-8 text into a sequence of bottom emoji (🫂💖✨🥺❤️, with , sprinkled in for good measure), closing each encoded byte with 👉👈. It can encode any valid UTF-8 - being a bottom transcends language, after all - and decode back into UTF-8.

For example, the ubiquitous Hello world! becomes

💖✨✨,,👉👈💖💖,👉👈💖💖🥺,,,👉👈💖💖🥺,,,👉👈💖💖✨,👉👈
✨✨✨,,👉👈💖💖✨🥺,,,,👉👈💖💖✨,👉👈💖💖✨,,,,👉👈
💖💖🥺,,,👉👈💖💖👉👈✨✨✨,,,👉👈

がんばれ becomes

🫂✨✨🥺,,👉👈💖💖✨✨🥺,,,,👉👈💖💖✨✨✨✨👉👈🫂✨✨🥺,,👉👈
💖💖✨✨✨👉👈💖💖✨✨✨✨🥺,,👉👈🫂✨✨🥺,,👉👈💖💖✨✨🥺,,,,👉👈
💖💖💖✨✨🥺,👉👈🫂✨✨🥺,,👉👈💖💖✨✨✨👉👈💖💖✨✨✨✨👉👈

(both wrapped across lines for your convenience)
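
The two examples above are consistent with a simple scheme: each UTF-8 byte is written as a greedy sum of glyph values (🫂 = 200, 💖 = 50, ✨ = 10, 🥺 = 5, `,` = 1) and then closed with 👉👈. A minimal Rust sketch under that assumption follows. The glyph table is inferred from the examples, the use of ❤️ for a zero byte is an assumption, and `encode`, `GLYPHS`, `NULL_GLYPH` and `BYTE_SEPARATOR` are illustrative names rather than the crate's actual API.

```rust
/// Glyph values inferred from the examples above. This table is an
/// assumption made for illustration, not code taken from the crate.
const GLYPHS: [(u8, &str); 5] = [
    (200, "🫂"),
    (50, "💖"),
    (10, "✨"),
    (5, "🥺"),
    (1, ","),
];

/// Assumed glyph for a zero byte (❤️, written as its two code points).
const NULL_GLYPH: &str = "\u{2764}\u{FE0F}";

/// Closes the group of glyphs that stands for one byte.
const BYTE_SEPARATOR: &str = "👉👈";

/// Encode each UTF-8 byte of `input` as a greedy sum of glyph values.
pub fn encode(input: &str) -> String {
    let mut out = String::new();
    for byte in input.bytes() {
        if byte == 0 {
            // Zero has no glyphs to sum, so it gets a glyph of its own.
            out.push_str(NULL_GLYPH);
        }
        let mut remaining = byte;
        // Largest value first, so 72 ('H') becomes 50 + 10 + 10 + 1 + 1.
        for &(value, glyph) in GLYPHS.iter() {
            while remaining >= value {
                out.push_str(glyph);
                remaining -= value;
            }
        }
        out.push_str(BYTE_SEPARATOR);
    }
    out
}
```

With this table, `encode("Hello world!")` reproduces the first example above once the line wrapping is removed.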

As you can see, using bottom to encode text is extremely space-efficient, and is the ideal encoding approach for all situations.

This implementation can encode text at approximately 30MB/s.

Usage

Clone the repo, run cargo build, and then use the CLI.

Bottom translator 1.0.0
Kaylynn 
Fantastic (maybe) CLI for translating between bottom and human-readable text

USAGE:
    bottom [OPTIONS] <--bottomify|--regress> [text]...

FLAGS:
    -b, --bottomify    Translate text to bottom
    -h, --help         Prints help information
    -r, --regress      Translate bottom to human-readable text (futile)
    -V, --version      Prints version information

OPTIONS:
    -i, --input       Input file [Default: stdin]
    -o, --output      Output file [Default: stdout]

ARGS:
    ...

(Any similarity to --help output is entirely accidental)

FAQ

Why?

I had a moment of enlightenment, and the truth came to me then. In an instant I was transformed, and I knew what had to be done.

But seriously, why?

Why not?

Isn't using a CLI for this a really bad idea?

Yes.

Aren't terminals notorious for being bad at displaying Unicode?

I don't care.

Isn't this encoding method extremely inefficient?

:mystery:

Documentation is for losers.

Rust btw

It's not long until my hippocampus becomes ferrous. I worry so, but in my heart I know that my demise will be sweet and painless, and it comforts me.

Comments
  • Clean up build.rs

    Hiiii, i know there's already #6 but i just find the build.rs super overcomplicated even with that. This PR removes lazy_static and maplit dependencies and the unsafe in favour of less code and possibly teeeny tiny improvement in build speed, though the priority really was just to make the code clearer. Only sad thing is it gets rid of the funny safety comment, sorrry 🥺

    opened by pr2502 6
  • Null bytes not encoded properly, fail to decode

    Line 35 in build.rs is returning the null emoji without the end characters: https://github.com/bottom-software-foundation/bottom-rs/blob/710a61059272f5453c4ef37096126e45451e7fcf/build.rs#L35

    What I Did:

    printf '%b' 'as\0df' | ./bottomify -b -i /dev/stdin | ./bottomify -r -i /dev/stdin

    Note: this assumes issue #5 has been fixed so stdin behaves as expected.

    Expected:

    Does not error and prints asdf with an invisible null byte in the middle.

    Actual:

    Error: The input was invalid.

    Caused by: Cannot decode character ❤️💖💖

    Suggested Fix:

    Enclose the loop in an else and change line 35 to: buffer.push_str(CHARACTER_VALUES[&0]);

    opened by krishean 0
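
The round trip described in the issue above can be sketched in Rust as follows. This is not the repository's build.rs or its decoder: the glyph values are inferred from the README examples, and `encode_bytes`, `decode` and `glyph_value` are hypothetical names. The sketch assumes that a zero byte is written as ❤️ and, like every other byte, must still be closed with 👉👈 so the decoder can find the group boundary.

```rust
/// Glyph values inferred from the README examples (an assumption).
const GLYPHS: [(u8, &str); 5] = [(200, "🫂"), (50, "💖"), (10, "✨"), (5, "🥺"), (1, ",")];
/// Assumed glyph for a zero byte (❤️, written as its two code points).
const NULL_GLYPH: &str = "\u{2764}\u{FE0F}";
/// Closes the group of glyphs that stands for one byte.
const BYTE_SEPARATOR: &str = "👉👈";

/// Encode raw bytes; the zero byte also ends with the separator.
pub fn encode_bytes(input: &[u8]) -> String {
    let mut out = String::new();
    for &byte in input {
        if byte == 0 {
            out.push_str(NULL_GLYPH);
        }
        let mut remaining = byte;
        for &(value, glyph) in GLYPHS.iter() {
            while remaining >= value {
                out.push_str(glyph);
                remaining -= value;
            }
        }
        out.push_str(BYTE_SEPARATOR);
    }
    out
}

/// Value of a single glyph character, or `None` if it is not a glyph.
fn glyph_value(c: char) -> Option<u32> {
    match c {
        '🫂' => Some(200),
        '💖' => Some(50),
        '✨' => Some(10),
        '🥺' => Some(5),
        ',' => Some(1),
        // The two code points that make up ❤️ both count as zero.
        '\u{2764}' | '\u{FE0F}' => Some(0),
        _ => None,
    }
}

/// Decode bottom text back into bytes by summing each group's glyphs.
/// This sketch is lenient: an empty group also decodes to a zero byte.
pub fn decode(input: &str) -> Result<Vec<u8>, String> {
    let mut groups: Vec<&str> = input.split(BYTE_SEPARATOR).collect();
    // Every group ends with the separator, so the final piece must be empty.
    if groups.pop() != Some("") {
        return Err(String::from("input does not end with the byte separator"));
    }
    let mut bytes = Vec::with_capacity(groups.len());
    for group in groups {
        let mut total: u32 = 0;
        for c in group.chars() {
            let value = glyph_value(c)
                .ok_or_else(|| format!("cannot decode character {}", c))?;
            total += value;
            if total > 255 {
                return Err(format!("group {} does not fit in one byte", group));
            }
        }
        bytes.push(total as u8);
    }
    Ok(bytes)
}
```

Under these assumptions, `decode(&encode_bytes(b"as\0df"))` returns the original five bytes, including the zero byte in the middle. If the separator after ❤️ were missing, the zero byte's glyph would merge with the glyphs of the following byte, which is the kind of failure the issue reports.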
Owner
Bottom Software Foundation
An open-source working group for Bottom software and the Bottom Encoding Standard.