Format whisper transcripts to .srt

Overview

whispersub

A dead simple utility to format the output of OpenAI's whisper model (or whisper.cpp) into an .srt file.
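
To illustrate the kind of conversion described above, here is a minimal sketch of turning timestamped transcript lines into SRT cues. It is not whispersub's actual implementation. It assumes the input uses whisper.cpp-style lines of the form `[HH:MM:SS.mmm --> HH:MM:SS.mmm]   text`, and the function name `to_srt` is hypothetical. Lines that do not match that pattern are skipped.

```shell
# Hypothetical helper, NOT part of whispersub: reads whisper.cpp-style
# timestamped lines on stdin and writes numbered SRT cues to stdout.
# Assumes each cue line looks like: [00:00:00.000 --> 00:00:04.000]   text
to_srt () {
    awk '
    # Only lines starting with a bracketed time range are treated as cues.
    /^\[[0-9:.]+ --> [0-9:.]+\]/ {
        close_idx = index($0, "]")
        times = substr($0, 2, close_idx - 2)   # text between the brackets
        text = substr($0, close_idx + 1)       # everything after the bracket
        gsub(/\./, ",", times)                 # SRT puts a comma before milliseconds
        sub(/^[ \t]+/, "", text)               # trim leading whitespace
        sub(/[ \t]+$/, "", text)               # trim trailing whitespace
        if (text == "") next                   # skip cues with no text
        printf "%d\n%s\n%s\n\n", ++n, times, text
    }'
}
```

Each SRT cue is a running index, a time range with comma-separated milliseconds, the cue text, and a blank line. The sketch reads from standard input, so it can sit at the end of a pipe and have its output redirected to a file.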

Usage

whispersub input.txt -o output.srt

You can also pipe the output of whisper.cpp into whispersub:

whisper-cpp --file audio.wav --language en --model ggml-medium.en.bin | whispersub

Or use a little helper function to extract the audio from a video, pipe it to whisper.cpp, and then to whispersub:

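makesub () {
    local filename model
    # Strip the directory and the extension: "dir/video.mp4" -> "video"
    filename=$(basename -- "$1")
    filename="${filename%.*}"
    model="${HOME}/.local/share/whisper/ggml-medium.en.bin"
    # Extract 16 kHz 16-bit PCM audio as WAV on stdout, transcribe it, then format as .srt
    ffmpeg -i "$1" -vn -acodec pcm_s16le -ar 16000 -ac 2 -f wav - |
    nice -n 20 whisper-cpp --threads "$(nproc)" --file - --language en --model "$model" |
    whispersub -o "${filename}.en.srt"
}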

makesub video.mp4
Owner
Mike Dallas