Brings all relevant command-line tesseract functionality to Rust

rusty-tesseract

A Rust wrapper for Google Tesseract

Description

  • Brings all relevant command-line tesseract functionality to Rust
  • Based on the Python wrapper for tesseract (i.e. https://github.com/madmaze/pytesseract)
  • Enables testing a pre-trained tesseract model and outputting the results in different formats such as strings, bounding boxes, dicts, or dataframes.

Dependencies

Tesseract: https://github.com/tesseract-ocr/tesseract

Usage

1. Read Image

Create an Image object by specifying a path or, alternatively, an image array in (height, width, channel) format (similar to a NumPy array as used with OpenCV in Python). Note: leave the Array3 parameter as shown if you don't intend to use it.

let mut img = Image::new(
    String::from("img/string.png"),
    Array3::<u8>::zeros((100, 100, 3))
);

// alternatively instantiate directly:

let mut img = Image {
    path: String::from("img/string.png"),
    ndarray: Array3::<u8>::zeros((100, 100, 3))  // example: creates a 100x100 pixel image with 3 colour channels (RGB)
};

2. Set tesseract parameters

Set tesseract parameters using the Args struct.

let default_args = Args::new();

// the default parameters are
/* pub fn new() -> Args {
        Args {
            config: HashMap::new(),
            lang: "eng",
            out_filename: "out",
            dpi: 150,
            boxfile: false
            }
    }
*/

// fill your own argument struct if needed
let mut my_args = Args {
    out_filename: "out",        // name of output_file
    lang: "eng",                // model language (tesseract default = 'eng')
    config: HashMap::new(),     // create empty hashmap to fill with command line parameters such as --psm or --oem (see tesseract --help-extra)
    dpi: 150,                   // specify DPI for input image
    boxfile: false              // specify whether the output should be a bounding box or string output
};
my_args.config.insert("psm", "6");  // define page segmentation mode 6 (i.e. "Assume a single uniform block of text")
my_args.config.insert("oem", "3");  // define OCR engine mode 3 (i.e. "Default, based on what is available")
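
To make the role of each field more concrete, here is a minimal sketch of how such an argument struct could be turned into a tesseract command line. It is an assumption for illustration only, not the crate's actual implementation: SketchArgs and build_command_line are hypothetical names, a BTreeMap replaces the HashMap so the flag order is deterministic, and appending the makebox config for box output is likewise assumed.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for the wrapper's `Args` struct (same field names as above).
struct SketchArgs {
    out_filename: &'static str,
    lang: &'static str,
    config: BTreeMap<&'static str, &'static str>,
    dpi: i32,
    boxfile: bool,
}

/// Assembles: tesseract <image> <outputbase> -l <lang> --dpi <dpi> [--key value ...] [makebox]
fn build_command_line(image_path: &str, args: &SketchArgs) -> Vec<String> {
    let mut cmd = vec![
        "tesseract".to_string(),
        image_path.to_string(),
        args.out_filename.to_string(),
        "-l".to_string(),
        args.lang.to_string(),
        "--dpi".to_string(),
        args.dpi.to_string(),
    ];
    // Each config entry becomes a `--key value` pair, e.g. ("psm", "6") -> `--psm 6`.
    for (key, value) in &args.config {
        cmd.push(format!("--{}", key));
        cmd.push(value.to_string());
    }
    // Assumption: box output is requested through tesseract's `makebox` config file.
    if args.boxfile {
        cmd.push("makebox".to_string());
    }
    cmd
}

fn main() {
    let mut config = BTreeMap::new();
    config.insert("psm", "6");
    config.insert("oem", "3");
    let args = SketchArgs {
        out_filename: "out",
        lang: "eng",
        config,
        dpi: 150,
        boxfile: false,
    };
    println!("{}", build_command_line("img/string.png", &args).join(" "));
}
```

Seen this way, the config map is simply a place to collect any extra flags listed by tesseract --help-extra without the struct needing a dedicated field for each one.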

3. Get the tesseract model output

Choose either string, bounding box or data output:

// string output
let output = rusty_tesseract::image_to_string(&img, my_args);
println!("The String output is: {:?}", output.Output_STRING);

// define bounding box parameters
let mut image_to_boxes_args = Args {
    out_filename: "font_name.font.exp0",
    lang: "eng",
    config: HashMap::new(),
    dpi: 150,
    boxfile: true
};
image_to_boxes_args.config.insert("psm", "6");
image_to_boxes_args.config.insert("oem", "3");

// boxes printed in OUTPUT_DICT or OUTPUT_DATAFRAME format store the key as a string (i.e. the character) and 
// store the value as a list of strings (if the same character occurs more than once)
let boxes = rusty_tesseract::image_to_boxes(&img, image_to_boxes_args);
println!("The Boxfile output is: {:?}", boxes.Output_DATAFRAME);

// image_to_data outputs both the "image_to_string()" and "image_to_boxes()" information and also creates a TSV table with confidences
let data = rusty_tesseract::image_to_data(&img, default_args);
println!("The data output is: {:?}", data.Output_DICT);
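
The dict layout described in the comments above (character as key, list of strings as value) can be illustrated with a small sketch that groups the lines of a tesseract box file by character. The function name boxes_to_dict and the sample coordinates are hypothetical, and the crate's own parsing may differ. The only assumption about the input is the standard box file line format: symbol, left, bottom, right, top, page.

```rust
use std::collections::HashMap;

/// Groups box-file lines by their leading character.
/// Each value holds the remaining fields ("left bottom right top page") of every occurrence.
fn boxes_to_dict(box_output: &str) -> HashMap<String, Vec<String>> {
    let mut dict: HashMap<String, Vec<String>> = HashMap::new();
    for line in box_output.lines() {
        // The first space separates the recognised symbol from its coordinates.
        if let Some((symbol, coords)) = line.trim().split_once(' ') {
            dict.entry(symbol.to_string())
                .or_default()
                .push(coords.trim().to_string());
        }
    }
    dict
}

fn main() {
    // Made-up sample data in box file format, not real OCR output.
    let sample = "L 10 20 30 40 0\nL 50 20 70 40 0\nO 80 20 95 40 0\n";
    let dict = boxes_to_dict(sample);
    println!("{:?}", dict.get("L"));
}
```

A character that occurs more than once, such as L in the sample, ends up with one list entry per occurrence.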

Get tesseract version

let tesseract_version = rusty_tesseract::get_tesseract_version();
println!("The tesseract version is: {:?}", tesseract_version);
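
For orientation, the tesseract command-line tool reports its version on the first line of tesseract --version, in the form "tesseract <version>". The sketch below extracts that value from such output. The helper parse_version_line is hypothetical, and the return type and exact behaviour of get_tesseract_version in the crate are not shown here and may differ.

```rust
/// Returns the version from the first line of `tesseract --version` output,
/// or None if the line does not start with "tesseract ".
fn parse_version_line(output: &str) -> Option<&str> {
    let first_line = output.lines().next()?;
    first_line.trim().strip_prefix("tesseract ")
}

fn main() {
    // Sample text shaped like the tool's version banner (values are illustrative).
    let sample = "tesseract 4.1.1\n leptonica-1.79.0\n";
    println!("{:?}", parse_version_line(sample));
}
```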

Contributing

  1. Fork the repository
  2. Create a new feature branch (git checkout -b my-feature-branch-name)
  3. Commit your new changes (git commit -m 'commit message')
  4. Push changes to the branch (git push origin my-feature-branch-name)
  5. Create a Pull Request