An efficient and powerful Rust library for word wrapping text.

Overview

Textwrap

Textwrap is a library for wrapping and indenting text. It is most often used by command-line programs to format dynamic output nicely so it looks good in a terminal. However, you can use the library to wrap arbitrary things by implementing the Fragment trait — an example would be wrapping text for PDF files.

Usage

To use the textwrap crate, add this to your Cargo.toml file:

[dependencies]
textwrap = "0.13"

By default, this enables word wrapping with support for Unicode strings. Extra features can be enabled with Cargo features, and the Unicode support can be disabled if needed. This allows you to slim down the library so that you only pay for the features you actually use. Please see the Cargo Features section in the crate documentation for a full list of the available features.

Documentation

API documentation

Getting Started

Word wrapping is easy using the fill function:

fn main() {
    let text = "textwrap: an efficient and powerful library for wrapping text.";
    println!("{}", textwrap::fill(text, 28));
}

The output is wrapped within 28 columns:

textwrap: an efficient
and powerful library for
wrapping text.

Sharp-eyed readers will notice that the first line is 22 columns wide. So why is the word “and” put in the second line when there is space for it in the first line?

The explanation is that textwrap does not just wrap text one line at a time. Instead, it uses an optimal-fit algorithm which looks ahead and chooses line breaks which minimize the gaps left at ends of lines.

Without look-ahead, the first line would be longer and the text would look like this:

textwrap: an efficient and
powerful library for
wrapping text.

The second line is now shorter and the text is more ragged. The kind of wrapping can be configured via Options::wrap_algorithm.
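
To make the difference concrete, here is a minimal, self-contained sketch of both strategies. It is not textwrap's implementation: the function names are hypothetical, widths are counted in bytes (so ASCII input only), and the optimal-fit cost is simply the squared gap at the end of every line except the last, which is a simplification of the penalties the library uses.

```rust
/// Greedy first-fit: keep adding words to the current line while they fit.
fn wrap_first_fit(words: &[&str], width: usize) -> Vec<String> {
    let mut lines: Vec<String> = Vec::new();
    let mut current = String::new();
    for word in words {
        // Start a new line if appending " word" would exceed the width.
        if !current.is_empty() && current.len() + 1 + word.len() > width {
            lines.push(current);
            current = String::new();
        }
        if !current.is_empty() {
            current.push(' ');
        }
        current.push_str(word);
    }
    if !current.is_empty() {
        lines.push(current);
    }
    lines
}

/// Optimal fit by dynamic programming over possible line starts.
/// Assumed cost model: squared gap per line, with the last line free.
fn wrap_optimal_fit(words: &[&str], width: usize) -> Vec<String> {
    let n = words.len();
    // best[i]: minimal total cost of wrapping words[i..].
    // next[i]: index of the first word on the line after the one starting at i.
    let mut best = vec![u64::MAX; n + 1];
    let mut next = vec![n; n + 1];
    best[n] = 0;
    for i in (0..n).rev() {
        let mut line_width = 0;
        for j in i..n {
            line_width += words[j].len() + if j > i { 1 } else { 0 };
            // Only a lone word is allowed to overflow the line.
            if line_width > width && j > i {
                break;
            }
            let gap = width.saturating_sub(line_width) as u64;
            let line_cost = if j + 1 == n { 0 } else { gap.saturating_mul(gap) };
            let total = line_cost.saturating_add(best[j + 1]);
            if total < best[i] {
                best[i] = total;
                next[i] = j + 1;
            }
        }
    }
    // Follow the chosen breaks, starting from the first word.
    let mut lines = Vec::new();
    let mut i = 0;
    while i < n {
        lines.push(words[i..next[i]].join(" "));
        i = next[i];
    }
    lines
}
```

With the sentence from the example above split on spaces and a width of 28, the greedy function yields the ragged three lines, while the dynamic-programming version moves "and" to the second line because that lowers the total squared gap.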

If you enable the hyphenation Cargo feature, you get support for automatic hyphenation for about 70 languages via high-quality TeX hyphenation patterns.

Your program must load the hyphenation pattern and configure Options::splitter to use it:

use hyphenation::{Language, Load, Standard};
use textwrap::{fill, Options};

fn main() {
    // Load the embedded US-English hyphenation patterns.
    let hyphenator = Standard::from_embedded(Language::EnglishUS).unwrap();
    let options = Options::new(28).splitter(hyphenator);
    let text = "textwrap: an efficient and powerful library for wrapping text.";
    println!("{}", fill(text, &options));
}

The output now looks like this:

textwrap: an efficient and
powerful library for wrap-
ping text.

The US-English hyphenation patterns are embedded when you enable the hyphenation feature. They are licensed under a permissive license and take up about 88 KB in your binary. If you need hyphenation for other languages, you need to download a precompiled .bincode file and load it yourself. Please see the hyphenation documentation for details.

Wrapping Strings at Compile Time

If your strings are known at compile time, please take a look at the procedural macros from the textwrap-macros crate.

Examples

The library comes with a collection of small example programs that show various features. You’re invited to clone the repository and try them out for yourself!

Of special note is the interactive example. This is a demo program which demonstrates most of the available features: you can enter text and adjust the width at which it is wrapped interactively. You can also adjust the Options used to see the effect of different WordSplitters and wrap algorithms.

Run the demo with

$ cargo run --example interactive

The demo needs a Linux terminal to function.

Release History

Please see the CHANGELOG file for details on the changes made in each release.

License

Textwrap can be distributed according to the MIT license. Contributions will be accepted under the same license.

Comments
  • Tab width support

    This should add basic support for specifying a tab width within the Options struct, as discussed in #419. As it turns out, I had to add a tab_width field within both Options and Word, though that was not what was planned.

    This PR should include all extra documentation needed to support this feature, and also includes a few more tests with the areas of this library that I understand well enough to add tests for (and passes all tests as well).

    There are some functions in the library that I couldn't find a way to add support for tab widths within (specifically, unfill, since you have no way to pass in options, and refill, since you can't pass in the options to unfill so you can't be certain it was actually unfilled correctly). wrap, wrap_columns, and fill_inplace all have tests, though, and should work just as expected.

    I am very open to criticism and suggestions for how differently this should be implemented if anyone feels that this is not the best way. Also please ask any questions you have, I would understand if my reasoning for parts of this is somewhat confusing.

    opened by itsjunetime 22
  • Implement wrapping functions as iterators

    This is the first mergeable PR using the ideas from #244.

    Motivation:

    • no_std. The current implementation of wrap_first_fit does not require allocation at any stage, and so can be used in no_std contexts if we support that in the future.
    • Avoiding allocations for the return value, for efficiency/performance reasons.
    • Wrapping without creating a slice. wrap_first_fit and wrap_optimal_fit now do not require the fragments to be stored contiguously, and instead simply require any iterator over fragments. This is done without losing any performance (in theory at least). This allows piping the result of splitting functions directly into wrapping functions without collecting into a vector first.
    • Owned data. It is more flexible to return owned fragments from these functions rather than subslices, as with slices it forces the data to be immutable.

    Implementation notes:

    • I had to implement Fragment for &F so that passing a slice into wrap_first_fit won't break (as iterating over a slice produces references).
    • The WrapFirstFit and WrapOptimalFit iterators iterate over (F, bool) tuples where the bool indicates whether it's the last fragment of its line; from experience of porting this library's code and tests to the new system it is about equally easy to use.
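
As a rough illustration of the iterator shape described in these notes, the following sketch wraps lazily and yields (fragment, is_last_on_line) tuples. It is an assumption-laden stand-in, not the PR's code: fragments are plain string-like items rather than implementors of the Fragment trait, each character counts as one column, and the type and function names only mirror those mentioned above.

```rust
use std::iter::Peekable;

/// Hypothetical lazy first-fit wrapper over any iterator of string-like words.
struct WrapFirstFit<I: Iterator> {
    words: Peekable<I>,
    width: usize,
    used: usize, // columns used so far on the current line
}

/// Accepts anything iterable, so no intermediate Vec or slice is needed.
fn wrap_first_fit<I>(words: I, width: usize) -> WrapFirstFit<I::IntoIter>
where
    I: IntoIterator,
    I::Item: AsRef<str>,
{
    WrapFirstFit {
        words: words.into_iter().peekable(),
        width,
        used: 0,
    }
}

impl<I> Iterator for WrapFirstFit<I>
where
    I: Iterator,
    I::Item: AsRef<str>,
{
    type Item = (I::Item, bool);

    fn next(&mut self) -> Option<Self::Item> {
        let word = self.words.next()?;
        let len = word.as_ref().chars().count();
        // A word that does not start the line is preceded by one space.
        self.used += if self.used == 0 { len } else { len + 1 };
        // The word ends its line if nothing follows or the next word does not fit.
        let is_last = match self.words.peek() {
            Some(next) => self.used + 1 + next.as_ref().chars().count() > self.width,
            None => true,
        };
        if is_last {
            self.used = 0;
        }
        Some((word, is_last))
    }
}
```

Because the input is any iterator, the output of a splitting step such as `text.split(' ')` can be piped straight in, and the caller decides whether to collect the tuples or print them as they arrive.
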
    opened by Kestrer 21
  • No explicit control over CR LF line ending

    Helix editor recently added a feature to reflow a piece of text; helix uses textwrap for this

    There's an issue, where reflow does not work correctly if CRLF sequence (\r\n) is used as a line terminator https://github.com/helix-editor/helix/issues/2645

    helix does not support CR ending (pre macosx), but I guess, the issue would manifest with it as well.

    As far as I can see, textwrap does not account for different line ending, there's no configuration option with which we could tune textwrap behavior.

    Did I miss anything? Is there a way to support CRLF in textwrap?

    If not, I'd be willing to contribute a fix. My initial idea is to introduce another option into the Options struct, named line_ending (or line_terminator as a fancier alternative). To keep things tight, I'd express it as enum (we can always create something like Custom should we want to support an arbitrary byte sequence later, but I highly doubt that). How does that sound?
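
A minimal sketch of what such an enum-based option might look like is shown below. All names here (the LineEnding type, its variants, detect and join_lines) are hypothetical illustrations of the idea in this issue and are not taken from the textwrap API.

```rust
/// Hypothetical option value; the variant names are placeholders.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum LineEnding {
    Lf,
    CrLf,
}

impl LineEnding {
    /// The byte sequence written between wrapped lines.
    fn as_str(self) -> &'static str {
        match self {
            LineEnding::Lf => "\n",
            LineEnding::CrLf => "\r\n",
        }
    }
}

/// Guess the line ending from the first line break found in the text.
fn detect(text: &str) -> LineEnding {
    match text.find('\n') {
        Some(i) if i > 0 && text.as_bytes()[i - 1] == b'\r' => LineEnding::CrLf,
        _ => LineEnding::Lf,
    }
}

/// Join already wrapped lines using the requested terminator.
fn join_lines(lines: &[&str], ending: LineEnding) -> String {
    lines.join(ending.as_str())
}
```

An editor could call `detect` on the selected text and pass the result along, so that reflowed output keeps the terminator the file already uses.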

    opened by koiuo 20
  • Integer size and overflow when calculating penalty

    When testing the optimal-fit algorithm, I encountered a panic during the cost calculations:

    thread 'main' panicked at 'attempt to multiply with overflow', .../textwrap-0.13.0/src/core.rs:710:21
    

    I fixed this by multiplying my mm widths with the factor 100 instead of 1000, which should still be a sufficient precision. But looking at the cost calculations, I have two follow-up questions:

    • Why are the costs stored as an i32 instead of a usize? They are generated by adding positive usize values, positive integer constants and other costs, so usize should work too.
    • How should overflows be handled when calculating the penalty? I changed the i32 type to usize, causing an integer overflow at the same location for the max_width test case that sets the line width to usize::MAX. This is currently working, probably because the huge value target_width - line_width is cropped when casting to i32 before updating the cost. Shouldn’t the cost calculation use saturating additions and multiplications if large line widths are supported?
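
The saturating approach raised in the second question can be sketched as follows. The squared-gap penalty and the function names are assumptions made for illustration; they are not the cost function in core.rs.

```rust
/// Checked variant: returns None instead of panicking or wrapping around.
fn gap_penalty_checked(target_width: usize, line_width: usize) -> Option<usize> {
    let gap = target_width.checked_sub(line_width)?;
    gap.checked_mul(gap)
}

/// Saturating variant: clamps to usize::MAX when the true value is too large.
fn gap_penalty_saturating(target_width: usize, line_width: usize) -> usize {
    let gap = target_width.saturating_sub(line_width);
    gap.saturating_mul(gap)
}

/// Accumulate per-line penalties without overflowing.
fn total_cost(penalties: &[usize]) -> usize {
    penalties
        .iter()
        .fold(0usize, |acc, &p| acc.saturating_add(p))
}
```

With saturation, a width of usize::MAX produces a very large but well-defined cost, so comparisons between candidate line breaks still work, although distinct huge costs can no longer be told apart.
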
    bug 
    opened by robinkrahl 17
  • Modularize library

    First of all, apologies for the enormous PR - I initially tried to do incremental changes but it became clear that a rewrite from the ground up was required, so here we are.

    This rewrite had several motivations that explain many of the design decisions:

    • The ability to support multiple splitting algorithms in the future, such as ICU.
    • The ability to support non-terminal contexts that have different width calculations than unicode-width.
    • Currently, the library assumes that terminals will display characters using their Unicode width, however this is an incorrect assumption: iTerm2 for example displays 👨‍👨‍👧‍👦 (a family emoji) in two columns instead of eight as unicode-width would report. This point ties into the last one: allowing swapping out unicode-width for other functions.
    • The possibility of supporting no_std in the future; none of the main algorithms currently allocate or use any std features.
    • Favouring modularity over configuration - having the core API be entirely modular allows it to be implemented in a far simpler way and allows it to be more extensible. However it can be fairly verbose, I can add a configuration layer on top of it if you wish.

    To understand the precise changes that were made, it's probably best just to open it in Rustdoc.

    This is far from complete. I still have to re-add hyphenation support, tests, benchmarks, examples, shorthand functions et cetera, but the base functionality is there. By the way, I'm not expecting this API to be added as-is; this PR just represents one extreme of how modular this library could be. We could compromise between this and the current one.

    opened by Kestrer 17
  • Allow the use of the library outside of the context of the terminal

    Hello, this is a feature request.

    It could be really useful to be able to use this library outside of the context of the CLI. I'm thinking about Piston text, that provides rendered characters with their widths depending on font-size, TrueType Font, etc... I just browsed the source code, and I believe supporting it would only be a matter of allowing to insert custom sizes for characters, but I am not sure as I obviously do not grasp all of the big picture. It would also require to handle floating point numbers in some way, as maybe the size could not be an integer. If you believe it's a good idea and is easy enough, I could do it myself.

    Thanks in advance!

    opened by Moxinilian 16
  • Does not work for languages without word separators

    For example, fill("កើតមកមានសេរីភាព", 6) outputs កើតមកម, ានសេរភ and ាព when it should output កើតមក, មាន and សេរីភាព. See also w3's Approaches to line breaking document which has the correct ways to line break words; implementing support for this would require storing a dictionary and matching words in it.

    help wanted 
    opened by Kestrer 15
  • Reintroducing the type parameter on `Options` (previously known as `Wrapper`).

    This PR reverts the internal changes of #206, but keeps the public API compatible to what it introduced. Essentially by adding a default for the type parameter.

    However, in addition to the dynamic dispatch by default, one may now also explicitly use static dispatch by specifying a concrete type parameter. This allows constructing an Options instance in a const/static context (or reintroduces it, depending on the point of view, as it was possible before #206). This is further facilitated by adding the const fns new_const and with_splitter.

    Open Questions

    • The name of Options::new_const could be changed (maybe to new_static). Or, for instance, the Options::new_const could replace the current Options::new entirely, reverting to a static dispatch by default. Then instead, a Options::new_boxed (or new_dyn or something) could be added which then creates a boxed Options with dynamic dispatch. Yet another alternative, would be to just remove Options::new_const, since the fundamental use-case (const fn) is also covered by Options::with_splitter.
    • Whether to implement WordSplitter only for Box<dyn WordSplitter> or for all Box<S>
    • Whether to implement splitter setter method only on the dynamic dispatch variant (aka Options<Box<dyn WordSplitter>>). Then, it might also take an arbitrary impl WordSplitter by value and put it into a Box.
    • Now, it is also possible to make Options Clone again. Well, at least where the type parameter is Clone, which it isn't for the default Box<dyn WordSplitter>.

    comments & other suggestions are welcome

    opened by Cryptjar 14
  • hyphenation is no longer maintained

    It appears that hyphenation is no longer being maintained, and as a result, it is blocking ripgrep from being packaged in Debian. I'm opening this issue to see if you're willing to brainstorm ideas on a path forward here.

    Unless hyphenation becomes actively maintained, I think the only path forward is to find a way to remove it from textwrap's dependency tree. Here are some ideas:

    • textwrap could vendor hyphenation's functionality and provide it itself. This means the maintenance burden gets shifted to the maintainers of this crate.
    • Someone forks and maintains hyphenation by publishing a different crate that provides the same functionality, and then textwrap switches to that fork. This means the maintenance burden gets shifted to whoever makes the fork.
    • I fork hyphenation and maintain it myself. I'd prefer not to do this, because if I do, I'm going to want to dig into it and solve the problem from first principles. (I don't know whether the existing implementation is the same one that I would arrive at.) However, it might be possible that textwrap only needs a small piece of functionality from hyphenation. If so, perhaps I could just provide that, but I haven't investigated yet.

    What are your thoughts?

    opened by BurntSushi 13
  • Separate soft line break finding and wrapping

    Currently this library does the soft line break finding and wrapping in one step to go from &str to String. However this is very inefficient for static blocks of text, which will require recalculating the soft line break positions every single time the viewport is resized. My suggestion is to have the wrapping functions take in an impl Iterator<Item = &str> and use that instead. Then instead of having the complex WordSplitter trait, we can have simple functions that take an &str and return an Iterator<Item = &str>.

    If you're happy with the idea then I will work on this, I just need to get some feedback first.

    opened by Kestrer 12
  • Add an iterator version of `wrap`

    • Add #[derive(Clone)] for word splitter types defined in this crate (hyphenation::Corpus already has a Clone impl);
    • Add a Clone requirement (which should be considered a breaking change but I think there are no custom word splitters to worry about);
    • Adapt docs.
    opened by hcpl 11
  • Add `tab_width` support

    This merge request is an updated version of #429. It allows the width of a tab to be configured for a variety of this crate's functions, instead of the width being fixed at zero, as it currently is.

    opened by mtoohey31 2
  • Unable to compile to WASM with version 0.15.1

    Hello! I've just upgraded to 0.15.1 and can no longer compile this to WASM.

    This version pulls in terminal_size 0.2.1, which pulls in rustix, which pulls in linux-raw-sys, which does not compile on wasm.

    I see the dep is marked as optional in textwrap's Cargo.toml, but I couldn't work out how to stop it from being included. I have this in my Cargo.toml:

    textwrap = { version = "0.15.0", default_features = false, features = [] }
    

    How can I resolve this?

    Thanks, Louis

    opened by lpil 14
  • Add example which wraps text on `stdin`

    As mentioned in https://github.com/sharkdp/bat/issues/305#issuecomment-1176321478, it could be useful to have a command which reads lines from stdin and sends wrapped output to stdout.

    opened by mgeisler 0
  • Incorrect width measurement for tabs?

    I may be speaking out of my depth here, so let me know if what I'm saying doesn't make sense.

    It seems that this library always measures tabs as having a width of 0 (when the unicode_width feature is enabled), which causes some text to be wrapped incorrectly when tabs are displayed with a width of one or more spaces. This happens because character widths are calculated in core.rs by fn ch_width(ch: char) -> usize, which calls into unicode_width, which in turn returns None for any character below 0x20. This includes tabs (0x09), and that None is then unwrapped to 0.

    Because of this, if tabs are being displayed to the user with the width of 2, 4, or 8 spaces (as they are for most people), textwrap will calculate the width of a line to be lower than it actually will be when presented to the user. This will mean that lines will not be wrapped correctly, and will still overflow when they are not supposed to.
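
The effect can be reproduced with a small stand-in for the width calculation. This is only a sketch under simplifying assumptions: control characters take no columns, every other character takes exactly one (no wide-character handling), and a tab is given a fixed, configurable width rather than advancing to the next tab stop. The function names are hypothetical.

```rust
/// Simplified character width with a configurable tab width.
fn ch_width_with_tabs(ch: char, tab_width: usize) -> usize {
    match ch {
        '\t' => tab_width,
        c if c.is_control() => 0,
        _ => 1,
    }
}

/// Sum of the character widths of a line.
fn display_width_with_tabs(text: &str, tab_width: usize) -> usize {
    text.chars()
        .map(|ch| ch_width_with_tabs(ch, tab_width))
        .sum()
}
```

With a tab width of 0, the line "\tfn main()" measures 9 columns, although a terminal showing tabs as four spaces needs 13, which is exactly the kind of underestimate that lets wrapped lines overflow.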

    Is this intended behavior (e.g. don't want to decide, for all users, what the width of a tab should be, so just don't deal with that)? If it's not, how do we let users configure how wide a tab should be for their uses?

    Let me know what your thoughts are, I'd love to get this resolved.

    opened by itsjunetime 3
  • Make word separators and splitters more flexible

    This PR makes the word separators and splitters more flexible to allow users to use their own words types. It also adds an example that shows how to use textwrap with custom word types, namely with styled strings.

    As this is only a draft, I did not add much documentation.

    The basic changes are:

    • For word separation, introduce a new method that just returns the range of the detected words. Users can then apply their own logic for creating words from these ranges.
    • For word splitting, introduce a new Fragments iterator struct that yields the fragments for a word. It turned out to be easier to introduce a Splittable trait than to use a closure to perform the splitting, but this could also be changed. I used a new struct for the iterator because that makes it much easier to keep track of the life times.
    opened by robinkrahl 5
  • Generic word separators and splitters

    textwrap::wrap_algorithm already works with the Fragment trait. What do you think about making the textwrap::word_separators and textwrap::word_splitters modules generic?

    As far as I see, it would not even be necessary to introduce a new trait. word_separators::WordSeparator::find_words could accept S: AsRef<str> instead of &str and a function that creates a fragment from S and two indices. word_splitters::split_words could accept W: AsRef<str> and a function that splits W at a given point with or without penalty and hyphen. (There might be some lifetime issues, but I think that should be solvable.)

    If you agree on the concept, I can prepare a draft implementation.

    opened by robinkrahl 2
Releases(0.15.2)
  • 0.15.2(Oct 24, 2022)

    Version 0.15.2 (2022-10-24)

    This release is identical to 0.15.0 and is only there to give people a way to install crates which depend on the yanked 0.15.1 release. See https://github.com/mgeisler/textwrap/issues/484 for details.

    Source code(tar.gz)
    Source code(zip)
  • 0.16.0(Oct 23, 2022)

    Version 0.16.0 (2022-10-23)

    This release marks Options as non_exhaustive and extends it to make line endings configurable, it adds new fast paths to fill and wrap, and it fixes crashes in unfill and refill.

    • #480: Mark Options as non_exhaustive. This will allow us to extend the struct in the future without breaking backwards compatibility.
    • #478: Add fast paths to fill and wrap. This makes the functions 10-25 times faster when no wrapping is needed.
    • #468: Fix refill to add back correct line ending.
    • #467: Fix crashes in unfill and refill.
    • #458: Test with Rust 1.56 (first compiler release with support for Rust 2021).
    • #454: Make line endings configurable.
    • #448: Migrate to the Rust 2021 edition.
    Source code(tar.gz)
    Source code(zip)
  • 0.15.1(Sep 15, 2022)

    Version 0.15.1 (2022-09-15)

    This release fixes crashes in unfill and refill. It also adds a new option to make the line endings (\n or \r\n) configurable:

    • #448: Migrate to the Rust 2021 edition.
    • #458: Test with Rust 1.56 (first compiler release with support for Rust 2021).
    • #454: Make line endings configurable.
    • #467: Fix unfill and refill crashes.
    • #468: Fix refill to add back correct line ending.
    Source code(tar.gz)
    Source code(zip)
  • 0.15.0(Feb 27, 2022)

    Version 0.15.0 (2022-02-27)

    This is a major feature release with two main changes:

    • #421: Use f64 instead of usize for fragment widths.

      This fixes problems with overflows in the internal computations of wrap_optimal_fit when fragments (words) or line lengths had extreme values, such as usize::MAX.

    • #438: Simplify Options by removing generic type parameters.

      This change removes the new generic parameters introduced in version 0.14, as well as the original WordSplitter parameter which has been present since very early versions.

      The result is a simplification of function and struct signatures across the board. So what used to be

      let options: Options<
          wrap_algorithms::FirstFit,
          word_separators::AsciiSpace,
          word_splitters::HyphenSplitter,
      > = Options::new(80);
      

      if types are fully written out, is now simply

      let options: Options<'_> = Options::new(80);
      

      The anonymous lifetime represents the lifetime of the initial_indent and subsequent_indent strings. The change is nearly performance neutral (a 1-2% regression).

    Smaller improvements and changes:

    • #404: Make documentation for short last-line penalty more precise.
    • #405: Cleanup and simplify Options docstring.
    • #411: Default to OptimalFit in interactive example.
    • #415: Add demo program to help compute binary sizes.
    • #423: Add fuzz tests with fully arbitrary fragments.
    • #424: Change wrap_optimal_fit penalties to non-negative numbers.
    • #430: Add debug-words example.
    • #432: Use precise dependency versions in Cargo.toml.
    Source code(tar.gz)
    Source code(zip)
  • 0.14.2(Jun 27, 2021)

    Version 0.14.2 (2021-06-27)

    The 0.14.1 release included more changes than intended and has been yanked. The change intended for 0.14.1 is now included in 0.14.2.

    Source code(tar.gz)
    Source code(zip)
  • 0.14.1(Jun 26, 2021)

    Version 0.14.1 (2021-06-26)

    This release fixes a panic reported by @Makoto, thanks!

    • #391: Fix panic in find_words due to string access outside of a character boundary.
    Source code(tar.gz)
    Source code(zip)
  • 0.14.0(Jun 5, 2021)

    Version 0.14.0 (2021-06-05)

    This is a major feature release which makes Textwrap more configurable and flexible. The high-level API of textwrap::wrap and textwrap::fill remains unchanged, but low-level structs have moved around.

    The biggest change is the introduction of new generic type parameters to the Options struct. These parameters let you statically configure the wrapping algorithm, the word separator, and the word splitter. If you previously spelled out the full type for Options, you now need to take the extra type parameters into account. This means that

    let options: Options<HyphenSplitter> = Options::new(80);
    

    changes to

    let options: Options<
        wrap_algorithms::FirstFit,
        word_separators::AsciiSpace,
        word_splitters::HyphenSplitter,
    > = Options::new(80);
    

    This is quite a mouthful, so we suggest using type inference where possible. You won’t see any change if you call wrap directly with a width or with an Options value constructed on the fly. Please open an issue if this causes problems for you!

    New WordSeparator Trait

    • #332: Add WordSeparator trait to allow customizing how words are found in a line of text. Until now, Textwrap would always assume that words are separated by ASCII space characters. You can now customize this as needed.

    • #313: Add support for using the Unicode line breaking algorithm to find words. This is done by adding a second implementation of the new WordSeparator trait. The implementation uses the unicode-linebreak crate, which is a new optional dependency.

      With this, Textwrap can be used with East-Asian languages such as Chinese or Japanese where there are no spaces between words. Breaking a long sequence of emojis is another example where line breaks might be wanted even if there is no whitespace to be found. Feedback would be appreciated for this feature.

    Indent

    • #353: Trim trailing whitespace from prefix in indent.

      Before, empty lines would get no prefix added. Now, empty lines have a trimmed prefix added. This little trick makes indent much more useful since you can now safely indent with "# " without creating trailing whitespace in the output due to the trailing whitespace in your prefix.

    • #354: Make indent about 20% faster by preallocating the output string.
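
The prefix trimming from #353 can be illustrated with a small stand-alone function. This is a sketch of the described behavior, not the library's indent: the name indent_sketch is hypothetical, only "\n" line endings are handled, and the preallocated capacity is just a rough lower bound.

```rust
/// Prefix every line; empty lines get the prefix without trailing whitespace.
fn indent_sketch(text: &str, prefix: &str) -> String {
    let trimmed_prefix = prefix.trim_end();
    // Reserve space up front so the string grows less often while pushing.
    let mut result = String::with_capacity(text.len());
    for line in text.split_inclusive('\n') {
        if line == "\n" {
            result.push_str(trimmed_prefix);
        } else {
            result.push_str(prefix);
        }
        result.push_str(line);
    }
    result
}
```

Indenting "foo\n\nbar\n" with the prefix "# " then gives "# foo\n#\n# bar\n", so the empty line carries a comment marker without trailing whitespace.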

    Documentation

    • #308: Document handling of leading and trailing whitespace when wrapping text.

    WebAssembly Demo

    • #310: Thanks to WebAssembly, you can now try out Textwrap directly in your browser: https://mgeisler.github.io/textwrap/.

    New Generic Parameters

    • #331: Remove outer boxing from Options.

    • #357: Replace core::WrapAlgorithm enum with a wrap_algorithms::WrapAlgorithm trait. This allows for arbitrary wrapping algorithms to be plugged into the library.

    • #358: Switch wrapping functions to use a slice for line_widths.

    • #368: Move WordSeparator and WordSplitter traits to separate modules. Before, Textwrap had several top-level structs such as NoHyphenation and HyphenSplitter. These implementations of WordSplitter now live in a dedicated word_splitters module. Similarly, we have a new word_separators module for implementations of WordSeparator.

    • #369: Rename Options::splitter to Options::word_splitter for consistency with the other fields backed by traits.

    Source code(tar.gz)
    Source code(zip)
  • 0.13.4(Feb 23, 2021)

    Version 0.13.4 (2021-02-23)

    This release removes println! statements which were left behind in unfill by mistake.

    • #296: Improve house building example with more comments.
    • #297: Remove debug prints in the new unfill function.
    Source code(tar.gz)
    Source code(zip)
  • 0.13.3(Feb 20, 2021)

    Version 0.13.3 (2021-02-20)

    This release contains a bugfix for indent and improved handling of emojis. We’ve also added a new function for formatting text in columns and functions for reformatting already wrapped text.

    • #276: Extend core::display_width to handle emojis when the unicode-width Cargo feature is disabled.
    • #279: Make indent preserve existing newlines in the input string.
    • #281: Ensure all Options fields have examples.
    • #282: Add a wrap_columns function.
    • #294: Add new unfill and refill functions.
    Source code(tar.gz)
    Source code(zip)
  • 0.13.2(Dec 30, 2020)

    Version 0.13.2 (2020-12-30)

    This release primarily makes all dependencies optional. This makes it possible to slim down textwrap as needed.

    • #254: impl WordSplitter for Box<T> where T: WordSplitter.
    • #255: Use command line arguments as initial text in interactive example.
    • #256: Introduce fuzz tests for wrap_optimal_fit and wrap_first_fit.
    • #260: Make the unicode-width dependency optional.
    • #261: Make the smawk dependency optional.
    Source code(tar.gz)
    Source code(zip)
  • 0.13.1(Dec 10, 2020)

    Version 0.13.1 (2020-12-10)

    This is a bugfix release which fixed the width computations for colored text.

    • #245: Support deleting a word with Ctrl-Backspace in the interactive demo
    • #246: Show build type (debug/release) in interactive demo
    • #249: Correctly compute width while skipping over ANSI escape sequences
    Source code(tar.gz)
    Source code(zip)
  • 0.13.0(Dec 5, 2020)

    This is a major release which rewrites the core logic, adds many new features, and fixes a couple of bugs. Most programs which use textwrap stay the same; incompatibilities and upgrade notes are given below.

    Clone the repository and run the following to explore the new features in an interactive demo (Linux only):

    $ cargo run --example interactive --all-features
    

    Bug Fixes

    Rewritten core wrapping algorithm

    • #221: Reformulate wrapping in terms of words with whitespace and penalties.

    The core wrapping algorithm has been completely rewritten. This fixed bugs and simplified the code, while also making it possible to use textwrap outside the context of the terminal.

    As part of this, trailing whitespace is now discarded consistently from wrapped lines. Before we would inconsistently remove whitespace at the end of wrapped lines, except for the last. Leading whitespace is still preserved.

    New Features

    Optimal-fit wrapping

    • #234: Introduce wrapping using an optimal-fit algorithm.

    This release adds support for a new wrapping algorithm which finds a globally optimal set of line breaks, taking certain penalties into account. As an example, the old algorithm would produce

    "To be, or"
    "not to be:"
    "that is"
    "the"
    "question"
    

    Notice how the fourth line with “the” is very short. The new algorithm shortens the previous lines slightly to produce fewer short lines:

    "To be,"
    "or not to"
    "be: that"
    "is the"
    "question"
    

    Use the new textwrap::core::WrapAlgorithm enum to select between the new and old algorithm. By default, the new algorithm is used.

    The optimal-fit algorithm is inspired by the line breaking algorithm used in TeX, described in the 1981 article Breaking Paragraphs into Lines by Knuth and Plass.

    In-place wrapping

    • #226: Add a fill_inplace function.

    When the text you want to fill is already a temporary String, you can now mutate it in-place with fill_inplace:

    let mut greeting = format!("Greetings {}, welcome to the game! You have {} lives left.",
                               player.name, player.lives);
    fill_inplace(&mut greeting, line_width);
    

    This is faster than calling fill and it will reuse the memory already allocated for the string.
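To illustrate why wrapping can reuse the existing allocation, here is a minimal sketch that wraps a string by overwriting selected spaces with newlines. It is not the library's code: `fill_inplace_sketch` is a hypothetical name, the strategy is plain first-fit, and widths are counted in bytes. Because one ASCII byte is swapped for another, the string never grows and stays valid UTF-8.

```rust
/// Wrap `text` by turning some ' ' bytes into '\n' bytes (hypothetical helper).
fn fill_inplace_sketch(text: &mut String, width: usize) {
    // Take ownership of the buffer without copying it.
    let mut bytes = std::mem::take(text).into_bytes();
    let mut line_start = 0; // byte index where the current line begins
    let mut last_space: Option<usize> = None; // last space seen on this line
    for i in 0..bytes.len() {
        match bytes[i] {
            b'\n' => {
                line_start = i + 1;
                last_space = None;
            }
            b' ' => last_space = Some(i),
            _ => {}
        }
        // The current line (up to and including byte i) is too long:
        // break at the most recent space, if there is one.
        if i + 1 - line_start > width {
            if let Some(pos) = last_space.take() {
                bytes[pos] = b'\n';
                line_start = pos + 1;
            }
        }
    }
    // Only ASCII bytes were exchanged, so the buffer is still valid UTF-8.
    *text = String::from_utf8(bytes).expect("swapping ' ' for '\\n' keeps UTF-8 valid");
}

fn main() {
    let mut greeting = format!(
        "Greetings {}, welcome to the game! You have {} lives left.",
        "Alice", 3
    );
    fill_inplace_sketch(&mut greeting, 20);
    println!("{}", greeting);
}
```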

    Changed Features

    Wrapper is replaced with Options

    • #213: Simplify API with only top-level functions.
    • #215: Reintroducing the type parameter on Options (previously known as Wrapper).
    • #219: Allow using trait objects with fill & wrap.
    • #227: Replace WrapOptions with Into<Options>.

    The Wrapper struct held the options (line width, indentation, etc.) for wrapping text. It was also the entry point for actually wrapping the text via methods such as wrap, wrap_iter, into_wrap_iter, and fill.

    The struct has been replaced by a simpler Options struct which only holds options. The Wrapper methods are gone; their job has been taken over by the top-level wrap and fill functions. The signatures of these functions have changed from

    fn fill(s: &str, width: usize) -> String;
    
    fn wrap(s: &str, width: usize) -> Vec<Cow<'_, str>>;
    

    to the more general

    fn fill<'a, S, Opt>(text: &str, options: Opt) -> String
    where
        S: WordSplitter,
        Opt: Into<Options<'a, S>>;
    
    fn wrap<'a, S, Opt>(text: &str, options: Opt) -> Vec<Cow<'_, str>>
    where
        S: WordSplitter,
        Opt: Into<Options<'a, S>>;
    

    The Into<Options<'a, S>> bound allows you to pass either a usize (which is interpreted as the line width) or a full Options object. This allows the new functions to work like the old ones, and you can now fully customize the wrapping behavior via Options when needed.
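The pattern behind this bound can be sketched with a stripped-down, hypothetical `Options` type. The struct below has only two fields and no word-splitter parameter, and the `fill` function is a simple greedy stand-in, so it should be read as an illustration of the `Into` conversion rather than as the crate's definitions.

```rust
/// Hypothetical, reduced options struct (the real one has more fields).
#[derive(Debug, Clone, PartialEq)]
struct Options<'a> {
    width: usize,
    initial_indent: &'a str,
}

impl<'a> Options<'a> {
    fn new(width: usize) -> Self {
        Options { width, initial_indent: "" }
    }

    /// Builder-style setter for the indentation of the first line.
    fn initial_indent(self, indent: &'a str) -> Self {
        Options { initial_indent: indent, ..self }
    }
}

/// A bare width converts into options with default settings.
impl<'a> From<usize> for Options<'a> {
    fn from(width: usize) -> Self {
        Options::new(width)
    }
}

/// Accepts either a `usize` or an `Options` value (greedy wrapping, byte widths).
fn fill<'a, Opt: Into<Options<'a>>>(text: &str, options: Opt) -> String {
    let options = options.into();
    let mut out = String::from(options.initial_indent);
    let mut line_len = options.initial_indent.len();
    let mut line_has_word = false;
    for word in text.split_whitespace() {
        if line_has_word && line_len + 1 + word.len() > options.width {
            out.push('\n');
            line_len = 0;
            line_has_word = false;
        }
        if line_has_word {
            out.push(' ');
            line_len += 1;
        }
        out.push_str(word);
        line_len += word.len();
        line_has_word = true;
    }
    out
}

fn main() {
    // Old-style call: just a width.
    println!("{}", fill("hello wide world", 10));
    // New-style call: a fully configured options value.
    println!("{}", fill("hello wide world", Options::new(10).initial_indent("> ")));
}
```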

    Code that calls textwrap::wrap or textwrap::fill can remain unchanged. Code that calls into Wrapper::wrap or Wrapper::fill will need to be updated. This is a mechanical change; please see #213 for examples.

    Thanks to @CryptJar and @Koxiat for their support in the PRs above!

    Removed Features

    • The wrap_iter and into_wrap_iter methods are gone. This means that lazy iteration is no longer supported: you always get all wrapped lines back as a Vec. This was done to simplify the code and to support the optimal-fit algorithm.

      The first-fit algorithm could still be implemented in an incremental fashion. Please let us know if this is important to you.
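As a rough idea of what incremental first-fit wrapping could look like, the sketch below implements a lazy iterator that produces one line per call to `next`. The type name `FirstFitLines` is hypothetical, it is not part of the crate, and it measures widths in bytes.

```rust
use std::iter::Peekable;
use std::str::SplitWhitespace;

/// Lazily yields greedily wrapped lines (hypothetical type, not in textwrap).
struct FirstFitLines<'a> {
    words: Peekable<SplitWhitespace<'a>>,
    width: usize,
}

impl<'a> FirstFitLines<'a> {
    fn new(text: &'a str, width: usize) -> Self {
        FirstFitLines { words: text.split_whitespace().peekable(), width }
    }
}

impl<'a> Iterator for FirstFitLines<'a> {
    type Item = String;

    fn next(&mut self) -> Option<String> {
        // Every line starts with the next unused word, even if it is too long.
        let mut line = String::from(self.words.next()?);
        // Only look at as many words as are needed to finish this line.
        while let Some(word) = self.words.peek() {
            if line.len() + 1 + word.len() > self.width {
                break;
            }
            line.push(' ');
            line.push_str(word);
            self.words.next();
        }
        Some(line)
    }
}

fn main() {
    let text = "To be, or not to be: that is the question";
    // Only the first two lines are ever computed here.
    for line in FirstFitLines::new(text, 10).take(2) {
        println!("{}", line);
    }
}
```

An optimal-fit wrapper cannot work this way, because the best first line break depends on all the words that follow.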

    Other Changes

    • #206: Change Wrapper.splitter from T: WordSplitter to Box<dyn WordSplitter>.
    • #216: Forbid the use of unsafe code.
  • 0.12.1(Nov 24, 2020)

    This is a bugfix release.

    • Fixed #176: Mention compile-time wrapping by linking to the textwrap-macros crate.
    • Fixed #193: Wrapping with break_words(false) was broken and would cause extra whitespace to be inserted when words were longer than the line width.
  • 0.12.0(Nov 24, 2020)

    The code has been updated to the Rust 2018 edition and each new release of textwrap will only support the latest stable version of Rust. Trying to support older Rust versions is a fool's errand: our dependencies keep releasing new patch versions that require newer and newer versions of Rust.

    The term_size feature has been replaced by terminal_size. The API is unchanged; only the name of the Cargo feature has changed.

    The hyphenation feature now only embeds the hyphenation patterns for US-English. This slims down the dependency.

    • Fixed #140: Ignore ANSI escape sequences.
    • Fixed #158: Unintended wrapping when using external splitter.
    • Fixed #177: Update examples to the 2018 edition.
  • 0.11.0(Nov 24, 2020)

    Due to our dependencies bumping their minimum supported version of Rust, the minimum version of Rust we test against is now 1.22.0.

    • Merged #141: Fix dedent handling of empty lines and trailing newlines. Thanks @bbqsrc!
    • Fixed #151: Release of version with hyphenation 0.7.
  • 0.10.0(Nov 24, 2020)

    Due to our dependencies bumping their minimum supported version of Rust, the minimum version of Rust we test against is now 1.17.0.

    • Fixed #99: Word broken even though it would fit on line.
    • Fixed #107: Automatic hyphenation is off by one.
    • Fixed #122: Take newlines into account when wrapping.
    • Fixed #129: Panic on string with em-dash.
  • 0.9.0(Nov 24, 2020)

    The dependency on term_size is now optional, and by default this feature is not enabled. This is a breaking change for users of Wrapper::with_termwidth. Enable the term_size feature to restore the old functionality.

    Added a regression test for the case where width is set to usize::MAX, thanks @Fraser999! All public structs now implement Debug, thanks @hcpl!

    • Fixed #101: Make term_size an optional dependency.
  • 0.8.0(Nov 24, 2020)

    The Wrapper struct is now generic over the type of word splitter being used. This means less boxing and a nicer API. The Wrapper::word_splitter method has been removed. This is a breaking API change if you used the method to change the word splitter.
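The trade-off between a generic type parameter and a boxed trait object can be sketched as follows. All names here (`Splitter`, `HyphenSplitter`, `GenericWrapper`, `BoxedWrapper`) are hypothetical stand-ins chosen for illustration. The trait is not the crate's WordSplitter trait, and this `NoHyphenation` is a local stand-in that only shares its name with the crate's type.

```rust
/// Hypothetical stand-in for a word-splitting trait.
trait Splitter {
    /// Byte offsets at which `word` may be broken.
    fn split_points(&self, word: &str) -> Vec<usize>;
}

/// Never splits words.
struct NoHyphenation;
impl Splitter for NoHyphenation {
    fn split_points(&self, _word: &str) -> Vec<usize> {
        Vec::new()
    }
}

/// Splits after existing hyphens, but not after a trailing one.
struct HyphenSplitter;
impl Splitter for HyphenSplitter {
    fn split_points(&self, word: &str) -> Vec<usize> {
        word.match_indices('-')
            .map(|(i, _)| i + 1)
            .filter(|&i| i < word.len())
            .collect()
    }
}

/// Cut `word` into pieces at the splitter's split points.
fn pieces<'w, S: Splitter + ?Sized>(splitter: &S, word: &'w str) -> Vec<&'w str> {
    let mut out = Vec::new();
    let mut start = 0;
    for point in splitter.split_points(word) {
        out.push(&word[start..point]);
        start = point;
    }
    out.push(&word[start..]);
    out
}

/// Generic variant: the splitter type is known at compile time, no allocation.
struct GenericWrapper<S: Splitter> {
    splitter: S,
}
impl<S: Splitter> GenericWrapper<S> {
    fn pieces<'w>(&self, word: &'w str) -> Vec<&'w str> {
        pieces(&self.splitter, word)
    }
}

/// Boxed variant: the splitter can be swapped at runtime, at the cost of a Box.
struct BoxedWrapper {
    splitter: Box<dyn Splitter>,
}
impl BoxedWrapper {
    fn pieces<'w>(&self, word: &'w str) -> Vec<&'w str> {
        pieces(&*self.splitter, word)
    }
}

fn main() {
    let generic = GenericWrapper { splitter: HyphenSplitter };
    println!("{:?}", generic.pieces("well-known-fact"));

    let mut boxed = BoxedWrapper { splitter: Box::new(HyphenSplitter) };
    boxed.splitter = Box::new(NoHyphenation); // swap the splitter at runtime
    println!("{:?}", boxed.pieces("well-known-fact"));
}
```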

    The Wrapper struct has two new methods that will wrap the input text lazily: Wrapper::wrap_iter and Wrapper::into_wrap_iter. Use those if you will be iterating over the wrapped lines one by one.

    • Fixed #59: wrap could return an iterator. Thanks @hcpl!
    • Fixed #81: Set html_root_url.
  • 0.7.0(Nov 24, 2020)

    Version 0.7.0 changes the return type of Wrapper::wrap from Vec<String> to Vec<Cow<'a, str>>. This means that the output lines borrow data from the input string. This is a breaking API change if you relied on the exact return type of Wrapper::wrap. Callers of the textwrap::fill convenience function will see no breakage.
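The benefit of returning Cow values can be seen in a small sketch. The helpers `wrap_borrowed` and `indent` below are hypothetical and use simple greedy wrapping with byte widths. They only show that unmodified lines can be slices of the input, while lines that need changes become owned strings.

```rust
use std::borrow::Cow;

/// Greedy wrapping where every line is a slice of `text` (no copying).
/// Assumes words are separated by single spaces.
fn wrap_borrowed<'a>(text: &'a str, width: usize) -> Vec<Cow<'a, str>> {
    let mut lines = Vec::new();
    let mut start = 0; // byte offset where the current line begins
    let mut last_space: Option<usize> = None;
    for (i, byte) in text.bytes().enumerate() {
        if byte == b' ' {
            last_space = Some(i);
        }
        if i + 1 - start > width {
            if let Some(pos) = last_space.take() {
                lines.push(Cow::Borrowed(&text[start..pos]));
                start = pos + 1;
            }
        }
    }
    lines.push(Cow::Borrowed(&text[start..]));
    lines
}

/// Adding a prefix changes the text, so these lines must be owned.
fn indent<'a>(lines: Vec<Cow<'a, str>>, prefix: &str) -> Vec<Cow<'a, str>> {
    lines
        .into_iter()
        .map(|line| {
            if prefix.is_empty() {
                line
            } else {
                Cow::Owned(format!("{}{}", prefix, line))
            }
        })
        .collect()
}

fn main() {
    let lines = wrap_borrowed("hello wide world", 10);
    println!("{:?}", lines);
    println!("{:?}", indent(lines, "> "));
}
```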

    The above change and other optimizations make version 0.7.0 roughly 15-30% faster than version 0.6.0.

    The squeeze_whitespace option has been removed since it was complicating the above optimization. Let us know if this option is important for you so we can provide a workaround.

    • Fixed #58: Add a fast_wrap function.
    • Fixed #61: Documentation errors.
  • 0.6.0(Nov 24, 2020)

    Version 0.6.0 adds builder methods to Wrapper for easy one-line initialization and configuration:

    let wrapper = Wrapper::new(60).break_words(false);
    

    It also adds a new NoHyphenation word splitter that will never split words, not even at existing hyphens.

    • Fixed #28: Support not squeezing whitespace.
  • 0.5.0(Nov 24, 2020)

    Version 0.5.0 has breaking API changes. However, this only affects code using the hyphenation feature. The feature is now optional, so you will first need to enable the hyphenation feature as described above. Afterwards, please change your code from

    wrapper.corpus = Some(&corpus);
    

    to

    wrapper.splitter = Box::new(corpus);
    

    Other changes include optimizations, so version 0.5.0 is roughly 10-15% faster than version 0.4.0.

    • Fixed #19: Add support for finding terminal size.
    • Fixed #25: Handle words longer than self.width.
    • Fixed #26: Support custom indentation.
    • Fixed #36: Support building without hyphenation.
    • Fixed #39: Respect non-breaking spaces.
  • 0.4.0(Nov 24, 2020)

  • 0.3.0(Nov 24, 2020)

  • 0.2.0(Nov 21, 2020)

  • 0.1.0(Nov 21, 2020)

Owner
Martin Geisler