A crate using DeepSpeech bindings to convert mic audio from speech to text

Last update: Oct 8, 2022

DS-TRANSCRIBER

Need an Offline Speech To Text converter?

Records your mic, and returns a String containing what was said.

Features

Begins transcription after a long enough pause has been detected
Change the length of the pause required to begin transcription
Shows you the audio levels of what is being recorded so that you can...
Change the audio level of what you deem as silence in your environment
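
The pause-based trigger described by these features can be sketched roughly as follows. This is not ds-transcriber's actual implementation: `PauseDetector` and its fields are hypothetical names chosen for illustration, and the sketch assumes mono 16-bit samples at a known sample rate.

```rust
/// Hypothetical helper (not part of ds-transcriber): tracks how long the
/// input has stayed below a silence threshold.
struct PauseDetector {
    /// Peak amplitude below which a chunk counts as silence.
    silence_level: i32,
    /// Seconds of continuous silence required before transcription starts.
    pause_length_secs: f32,
    /// Samples per second of the (assumed mono) input.
    sample_rate: u32,
    /// Number of consecutive silent samples seen so far.
    silent_samples: u64,
}

impl PauseDetector {
    fn new(silence_level: i32, pause_length_secs: f32, sample_rate: u32) -> Self {
        Self { silence_level, pause_length_secs, sample_rate, silent_samples: 0 }
    }

    /// Feeds one chunk of samples and returns true once the accumulated
    /// silence is at least `pause_length_secs` long.
    fn feed(&mut self, chunk: &[i16]) -> bool {
        // Widen to i32 so that i16::MIN does not overflow in abs().
        let peak = chunk.iter().map(|&s| (s as i32).abs()).max().unwrap_or(0);
        if peak < self.silence_level {
            self.silent_samples += chunk.len() as u64;
        } else {
            // Speech detected: restart the silence timer.
            self.silent_samples = 0;
        }
        let silent_secs = self.silent_samples as f32 / self.sample_rate as f32;
        silent_secs >= self.pause_length_secs
    }
}
```

In this sketch, a caller would feed each captured chunk to `feed` and start transcription the first time it returns true. Raising `silence_level` makes the detector more tolerant of background noise, and `pause_length_secs` controls how long the speaker may hesitate before the recording is considered finished.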

Ds-Transcriber is backed by the awesome cpal for streaming and nnnoiseless for audio cleanup.

Setting Up

Obtain the DeepSpeech model (tested with 0.9.x) and the native-client for your system, then add the folder that contains them to your LD_LIBRARY_PATH and LIBRARY_PATH environment variables. See the quick start guide over at deepspeech-rs.
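
If linking fails, it can help to confirm that the folder really is listed in both variables. The following is a minimal sketch using only the Rust standard library; `path_list_contains` and `missing_library_vars` are hypothetical helper names, not part of ds-transcriber or deepspeech-rs.

```rust
use std::env;
use std::ffi::OsStr;
use std::path::Path;

/// Returns true if `dir` appears as an entry in a PATH-style list.
fn path_list_contains(list: &OsStr, dir: &Path) -> bool {
    env::split_paths(list).any(|entry| entry.as_path() == dir)
}

/// Returns the names of the variables from the setup notes that are unset
/// or do not list `dir`.
fn missing_library_vars(dir: &Path) -> Vec<&'static str> {
    ["LD_LIBRARY_PATH", "LIBRARY_PATH"]
        .iter()
        .copied()
        .filter(|name| {
            env::var_os(name).map_or(true, |value| !path_list_contains(&value, dir))
        })
        .collect()
}
```

Calling `missing_library_vars` with the path of your native-client folder returns an empty vector when both variables include it. Entries are compared as written, so a symlinked or relative entry would not match an absolute path.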

Usage

Add the crate to your Cargo.toml:

    [dependencies]
    ds-transcriber = "0.1.3"

Instantiate your model:

    use std::env::args;

    // the directory that holds your model and native-client, passed as the first CLI argument
    let model_dir_str = args().nth(1).expect("Please specify model dir");
    let mut ds_model = DeepSpeechModel::instantiate_from(model_dir_str);
    let model = ds_model.model();

Create a mutable configuration with your model:

    let mut config = ds_transcriber::transcriber::StreamSettings {
        // threshold used for pause detection: audio counts as silence while its amplitude is below this value
        silence_level: 200,
        // takes a reference to the model we instantiated earlier
        model,
        // print the amplitude values to stdout (helps you find your silence level)
        show_amplitudes: true,
        // seconds of silence that mark the end of speech; transcription begins once a pause lasts longer than this
        pause_length: 2.0,
    };

After getting config ready, all you need to do is pass it to transcribe:

    let i_said = ds_transcriber::transcriber::transcribe(&mut config).unwrap();
    println!("I said: {}", i_said);

Repeat the last step to get another transcription with the same configuration.

Contributions

Heck yeah! Pull requests are the greatest thing since sliced bread.

License

MIT

Free Software, Gotta love it!

fastest text uwuifier in the west

uwuify fastest text uwuifier in the west transforms Hey... I think I really love you. Do you want a headpat? into hey... i think i w-weawwy wuv you.

1.2k Dec 29, 2022

Sorta Text Format in UTF-8

STFU-8: Sorta Text Format in UTF-8 STFU-8 is a hacky text encoding/decoding protocol for data that might be not quite UTF-8 but is still mostly UTF-8.

18 Sep 4, 2022

The fastest way to identify any mysterious text or analyze strings from a file, just ask `lemmeknow` !

The fastest way to identify anything lemmeknow ⚡ Identify any mysterious text or analyze strings from a file, just ask lemmeknow. lemmeknow can be use

594 Dec 30, 2022

Source text parsing, lexing, and AST related functionality for Deno

Source text parsing, lexing, and AST related functionality for Deno.

90 Jan 1, 2023

better tools for text parsing

nom-text Goal: a library that extends nom to provide better tools for text formats (programming languages, configuration files). current needs Recogni

5 Oct 18, 2022

Font independent text analysis support for shaping and layout.

lipi Lipi (Sanskrit for 'writing, letters, alphabet') is a pure Rust crate that provides font independent text analysis support for shaping and layout

12 Sep 22, 2022

lingua-rs Python binding. An accurate natural language detection library, suitable for long and short text alike.

lingua-py lingua-rs Python binding. An accurate natural language detection library, suitable for long and short text alike. Installation pip install l

7 Dec 30, 2022

Markdown-like text parser.

1 Dec 7, 2021

A Rust wrapper for the Text synthesization service TextSynth API

2 Mar 24, 2022

Releases (v1.0.0)

v1.0.0 (Mar 5, 2022)
Derive Default, Clone, Copy for StreamSettings

Add Debug implementations for all public types

Error unwinding - transcribe() now returns Result<T,E> instead of Option<T>

transcribe() now accepts StreamSettings and &mut Model

Replace DeepSpeechModel::instantiate_from() method with instance_model() function

Directly specify paths of model file and (optional) scorer instead of searching a specified directory

Encapsulate internal modules intended for internal use (only)

Denoising is now activated through a cargo feature with the name denoise

Bug fix - infinite recursion on non-mono input devices bc85b1a22d506e26f8f3b49268f5717cfdae5a14

Adds logging

Adds a basic transcription example

Bumps Rust edition to 2021 272c612b84db795957a857a40c3ce750668d7c0d

nnnoiseless (denoise) has been reexported fa299132b528078124d6a91bc6d71989e439a2b9

deepspeech-rs has been reexported fa299132b528078124d6a91bc6d71989e439a2b9

What's Changed

Fix: Infinite loop on channel configuration by @kawaki-san in https://github.com/kawaki-san/ds-transcriber/pull/5

v1.0.0-beta by @kawaki-san in https://github.com/kawaki-san/ds-transcriber/pull/6

v1.0.0 by @kawaki-san in https://github.com/kawaki-san/ds-transcriber/pull/7

Full Changelog: https://github.com/kawaki-san/ds-transcriber/compare/v0.1.3...v1.0.0

v1.0.0-beta (Mar 3, 2022)
Public API changes:

Derive Default, Clone, Copy for StreamSettings

Error unwinding - transcribe() now returns Result<T,E> instead of Option<T>

transcribe() now accepts StreamSettings and &mut Model

Replace DeepSpeechModel::instantiate_from() method with instance_model() function

Directly specify paths of model file and (optional) scorer instead of searching a specified directory

Change module visibility for functionality that's internal only

Denoising is now activated through a cargo feature with the name denoise

Bug fix - infinite recursion on non-mono input devices bc85b1a22d506e26f8f3b49268f5717cfdae5a14

Adds logging

Adds a basic transcription example

Bumps Rust edition to 2021 272c612b84db795957a857a40c3ce750668d7c0d

nnnoiseless (denoise) has been reexported fa299132b528078124d6a91bc6d71989e439a2b9

deepspeech-rs has been reexported fa299132b528078124d6a91bc6d71989e439a2b9

Adds documentation


v0.1.3 (May 31, 2021)

Allows for Model and StreamSettings reuse

v0.1.2 (May 30, 2021)

Reset the pause length if speech is detected before pause length is exceeded