A crate using DeepSpeech bindings to convert mic audio from speech to text

Overview

DS-TRANSCRIBER

Need an Offline Speech To Text converter?

Records your mic, and returns a String containing what was said.

Features

  • Begins transcription after a long enough pause has been detected
  • Change the length of the pause required to begin transcription
  • Shows you the audio levels of what is being recorded so that you can...
  • Change the audio level of what you deem as silence in your environment
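
The pause-based trigger described in the list above can be sketched roughly as follows. This is a minimal illustration and not the crate's actual implementation: `PauseDetector`, its fields, and the rule that speech must be heard before a pause counts are all assumptions made for the example. Only the ideas of a silence level and a pause length in seconds come from this README.

```rust
/// Hypothetical helper (not part of ds-transcriber's API): tracks how long
/// the input has stayed below a silence threshold.
pub struct PauseDetector {
    /// Samples whose absolute amplitude is below this are treated as silence.
    silence_level: u16,
    /// Number of consecutive quiet samples that counts as a pause.
    pause_samples: usize,
    /// Length of the current run of quiet samples.
    quiet_run: usize,
    /// Assumption: a pause only matters after something louder was heard.
    heard_speech: bool,
}

impl PauseDetector {
    pub fn new(silence_level: u16, pause_length_secs: f32, sample_rate_hz: u32) -> Self {
        // Convert the pause length from seconds to a sample count.
        let pause_samples = (pause_length_secs * sample_rate_hz as f32).round() as usize;
        Self {
            silence_level,
            pause_samples,
            quiet_run: 0,
            heard_speech: false,
        }
    }

    /// Feeds one chunk of 16-bit samples. Returns true once speech has been
    /// heard and the quiet run that follows it is at least as long as the
    /// configured pause length.
    pub fn push(&mut self, chunk: &[i16]) -> bool {
        for &sample in chunk {
            if sample.unsigned_abs() < self.silence_level {
                self.quiet_run += 1;
            } else {
                // A loud sample ends the quiet run and marks speech as heard.
                self.quiet_run = 0;
                self.heard_speech = true;
            }
        }
        self.heard_speech && self.quiet_run >= self.pause_samples
    }
}
```

With an assumed 16 kHz input, `PauseDetector::new(200, 2.0, 16_000)` would report a pause once 32,000 consecutive samples stay below an amplitude of 200 after some louder input. Raising the silence level makes the detector more tolerant of background noise, while a shorter pause length makes transcription start sooner.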

Ds-Transcriber is backed by the awesome cpal for streaming and nnnoiseless for audio cleanup.

Setting Up

You need to obtain the DeepSpeech model (tested with 0.9.x) and the native-client for your system, then add the folder that contains them to your LD_LIBRARY_PATH and LIBRARY_PATH environment variables. See the quick start guide over at deepspeech-rs.
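
If linking or loading fails, it can help to confirm that both variables really list the native-client folder. The following is a small preflight sketch using only the Rust standard library. The function names `path_list_contains` and `missing_library_vars` are hypothetical helpers for illustration and are not provided by ds-transcriber or deepspeech-rs.

```rust
use std::env;
use std::ffi::OsStr;
use std::path::Path;

/// Returns true if `dir` appears as an entry in a PATH-style list value.
fn path_list_contains(list: &OsStr, dir: &Path) -> bool {
    env::split_paths(list).any(|entry| entry.as_path() == dir)
}

/// Returns the names of the variables mentioned above that are unset or do
/// not list `dir`.
fn missing_library_vars(dir: &Path) -> Vec<&'static str> {
    ["LD_LIBRARY_PATH", "LIBRARY_PATH"]
        .iter()
        .copied()
        .filter(|name| {
            env::var_os(name).map_or(true, |value| !path_list_contains(&value, dir))
        })
        .collect()
}
```

Calling `missing_library_vars(Path::new(&model_dir))` before loading the model, where `model_dir` is whatever folder you chose, returns an empty vector when both variables are set up. The comparison is purely textual on path components, so symlinks and relative paths are not resolved.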

Usage

Add the crate to your Cargo.toml:

[dependencies]
ds-transcriber = "0.1.3"

Instantiate your model:

    use std::env::args;

    // the directory where your model and native-client are located,
    // passed as the first command-line argument
    let model_dir_str = args().nth(1).expect("Please specify model dir");
    let mut ds_model = DeepSpeechModel::instantiate_from(model_dir_str);
    let model = ds_model.model();

Create a mutable configuration with your model:

    let mut config = ds_transcriber::transcriber::StreamSettings {
        // value used for pause detection: a pause is detected when the amplitude is below this
        silence_level: 200,
        // takes a reference of the model we instantiated earlier
        model,
        // show the amplitude values on stdout (helps you to find your silence level)
        show_amplitudes: true,
        // seconds of silence indicating end of speech (begin transcription when pause_length is greater than....)
        pause_length: 2.0,
    };

After getting config ready, all you need to do is pass it to transcribe:

    let i_said = ds_transcriber::transcriber::transcribe(&mut config).unwrap();
    println!("I said: {}", i_said);

Repeat the last step to get another transcription with the same configuration.

Contributions

Heck yeah! Pull requests are the greatest thing since sliced bread.

License

MIT

Free Software, Gotta love it!


Releases(v1.0.0)
  • v1.0.0(Mar 5, 2022)

    • Derive Default, Clone, Copy for StreamSettings
    • Add Debug implementations for all public types
    • Error handling - transcribe() now returns Result<T, E> instead of Option<T>
    • transcribe() now accepts StreamSettings and &mut Model
    • Replace DeepSpeechModel::instantiate_from() method with instance_model() function
    • Directly specify paths of model file and (optional) scorer instead of searching a specified directory
    • Encapsulate internal modules intended for internal use (only)
    • Denoising is now activated through a cargo feature with the name denoise
    • Bug fix - infinite recursion on non-mono input devices (commit bc85b1a22d506e26f8f3b49268f5717cfdae5a14)
    • Adds logging
    • Adds a basic transcription example
    • Bumps Rust edition to 2021 (commit 272c612b84db795957a857a40c3ce750668d7c0d)
    • nnnoiseless (denoise) has been reexported (commit fa299132b528078124d6a91bc6d71989e439a2b9)
    • deepspeech-rs has been reexported (commit fa299132b528078124d6a91bc6d71989e439a2b9)

    What's Changed

    • Fix: Infinite loop on channel configuration by @kawaki-san in https://github.com/kawaki-san/ds-transcriber/pull/5
    • v1.0.0-beta by @kawaki-san in https://github.com/kawaki-san/ds-transcriber/pull/6
    • v1.0.0 by @kawaki-san in https://github.com/kawaki-san/ds-transcriber/pull/7

    Full Changelog: https://github.com/kawaki-san/ds-transcriber/compare/v0.1.3...v1.0.0

  • v1.0.0-beta(Mar 3, 2022)

    • Public API changes:

      • Derive Default, Clone, Copy for StreamSettings
      • Error handling - transcribe() now returns Result<T, E> instead of Option<T>
      • transcribe() now accepts StreamSettings and &mut Model
      • Replace DeepSpeechModel::instantiate_from() method with instance_model() function
      • Directly specify paths of model file and (optional) scorer instead of searching a specified directory
      • Change module visibility for functionality that's internal only
      • Denoising is now activated through a cargo feature with the name denoise
    • Bug fix - infinite recursion on non-mono input devices (commit bc85b1a22d506e26f8f3b49268f5717cfdae5a14)
    • Adds logging
    • Adds a basic transcription example
    • Bumps Rust edition to 2021 (commit 272c612b84db795957a857a40c3ce750668d7c0d)
    • nnnoiseless (denoise) has been reexported (commit fa299132b528078124d6a91bc6d71989e439a2b9)
    • deepspeech-rs has been reexported (commit fa299132b528078124d6a91bc6d71989e439a2b9)
    • Adds documentation
  • v0.1.3(May 31, 2021)

  • v0.1.2(May 30, 2021)
