A Rust OpenType manipulation library

Overview

fonttools-rs

This is an attempt to write a Rust library to read, manipulate and write TTF/OTF files. It is in the early stages of development. Contributions are welcome.

Example usage

use std::fs::File;

use fonttools::font::{self, Font, Table};
use fonttools::name::{name, NameRecord, NameRecordID};

// Load a font (tables are lazy-loaded)
let fontfile = File::open("Test.otf").unwrap();
let mut myfont = font::load(fontfile).expect("Could not load font");

// Access an existing table
if let Table::Name(name_table) = myfont.get_table(b"name")
        .expect("Error reading name table")
        .expect("There was no name table") {
        // Manipulate the table (table-specific)
        name_table.records.push(NameRecord::windows_unicode(
            NameRecordID::LicenseURL,
            "http://opensource.org/licenses/OFL-1.1"
        ));
}
let mut outfile = File::create("Test-with-OFL.otf").expect("Could not create file");
myfont.save(&mut outfile);

See the documentation for more details, and the fonttools-cli utilities (installable via cargo install fonttools_cli) for code examples.

License

Apache-2

Comments
  • tags and table access

    Playing around with this and I'm finding that the table access stuff is a bit of a headache, and I have a few thoughts for discussion:

    Tags

    Having tag be a typealias for [u8; 4] works pretty well, but it has the one annoyance that tags can't be printed, because we can't impl Display for a typealias.

    There is also the secondary issue that it is possible to construct tags that contain characters that are not permitted (outside the range 0x20-0x7E) but I don't think this is a serious concern in practice.

    If we want Display/Debug (which would be nice, lots of things are going to want to print tags) then we'll need a newtype, like:

    struct Tag([u8; 4]);
    

    but then we have the annoyance that every use of a tag will need to involve constructing one of these, like:

    let GSUB = font.get_table(Tag::new("GSUB").unwrap());
    

    and that's also annoying.

    One alternative would be to have a bunch of consts for common tags, like:

    pub mod tags {
        pub mod tables {
            pub const hhea: Tag = Tag(*b"hhea");
            pub const head: Tag = Tag(*b"head");
            // etc
        }
        pub mod features {
            // etc
        }
    }
    

    And then you might do things like,

    let GPOS = font.get_table(tags::tables::GPOS);
    

    and that feels kind of okay, but ultimately I don't love any of this.
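
To make the trade-off above concrete, here is a minimal sketch of the newtype approach with a Display impl and a couple of constants. The module layout and the constant names are assumptions chosen for illustration, not the crate's actual API.

```rust
use std::fmt;

/// Hypothetical newtype wrapper around a four-byte OpenType tag.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord, Hash)]
pub struct Tag([u8; 4]);

impl fmt::Display for Tag {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Valid tags are printable ASCII, so each byte maps to one char.
        for &byte in &self.0 {
            write!(f, "{}", byte as char)?;
        }
        Ok(())
    }
}

/// Constants for common tags, so callers rarely build one by hand.
/// (Illustrative subset only.)
pub mod tags {
    use super::Tag;

    pub const GPOS: Tag = Tag(*b"GPOS");
    pub const GSUB: Tag = Tag(*b"GSUB");
}
```

With this in place, `tags::GPOS.to_string()` yields the four-character tag, which is the printing behaviour a plain `[u8; 4]` alias cannot provide.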

    Tables

    Table access is a bit annoying, because of the big Table enum. This has been worked around a bit by the table_unchecked! macro, but I wonder if we can make this a bit easier?

    My thinking here would be to add a new Tables struct and give it a bunch of convenience methods for the common tables, e.g.:

    struct Tables(BTreeMap<Tag, Table>);
    
    impl Tables {
        pub fn gpos(&mut self) -> Result<Option<&GPOS>, DeserializationError> { todo!() }
        // ... etc
    }
    

    And then there could be a fallback method for getting an unknown table, that would just return bytes.

    Doing this would also mean we wouldn't need to interact with raw tags as often.
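
A rough sketch of what such a wrapper could look like, assuming the tables are already parsed (so lazy loading and the Result layer are left out). `Gpos`, `Table::Unknown` and `get_raw` are hypothetical names used only for this illustration.

```rust
use std::collections::BTreeMap;

type Tag = [u8; 4];

/// Hypothetical stand-in for a parsed GPOS table.
#[derive(Debug, PartialEq)]
pub struct Gpos {
    pub lookup_count: u16,
}

#[derive(Debug, PartialEq)]
pub enum Table {
    Gpos(Gpos),
    /// Tables we don't know how to parse stay as raw bytes.
    Unknown(Vec<u8>),
}

pub struct Tables(BTreeMap<Tag, Table>);

impl Tables {
    /// Typed accessor: hides both the tag and the enum match from the caller.
    pub fn gpos(&self) -> Option<&Gpos> {
        match self.0.get(b"GPOS") {
            Some(Table::Gpos(table)) => Some(table),
            _ => None,
        }
    }

    /// Fallback for tables without a typed accessor: returns the raw bytes.
    pub fn get_raw(&self, tag: Tag) -> Option<&[u8]> {
        match self.0.get(&tag) {
            Some(Table::Unknown(bytes)) => Some(bytes.as_slice()),
            _ => None,
        }
    }
}
```

Callers of `gpos()` never touch a tag or the Table enum; only the fallback takes a raw tag.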

    opened by cmyr 24
  • consider moving off of nightly

    There aren't a lot of great reasons to require nightly rust anymore, and it will be a major barrier to the adoption of this crate elsewhere in the ecosystem. Unless there's a really compelling reason to require nightly, I think it probably makes sense to try and move off of it sooner rather than later, as it will become harder the more nightly features end up being used.

    Any thoughts on this? Last I checked nightly was only being used to improve diagnostics in some derive macros, and there are reasonable workarounds there. Are there any other major motivations?

    It's worth keeping in mind that we can still conditionally use nightly features, enabling them only when we're on a nightly compiler.

    opened by cmyr 9
  • fontcrunch is slow because I can't handle references and lifetimes

    The ported version of fontcrunch is much slower than the original, particularly in glyphs which go down this O(n^2) loop:

    https://github.com/simoncozens/fonttools-rs/blob/0400715da315aa783c57070da2897760694b1fc3/crates/fonttools-cli/src/bin/fontcrunch.rs#L322-L332

    I'm sure it's because of this clone:

    https://github.com/simoncozens/fonttools-rs/blob/0400715da315aa783c57070da2897760694b1fc3/crates/fonttools-cli/src/bin/fontcrunch.rs#L366

    The clone is only there because I can't work out the lifetimes required for storing Statelets as references within a Statelet and moving them around. If I put what I think are the right lifetimes on the State and Statelet types, I get a confusing error like so:

    431 |     states: &mut Vec<State>,
        |             --------------- these two types are declared with different lifetimes...
    ...
    454 |         states[this].sts.push(sl);
        |                               ^^ ...but data from `states` flows into `states` here
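
One commonly used way to avoid both the deep clone and the lifetime annotations is to share predecessors through reference counting. The sketch below is an assumption about the shape of the problem: this `Statelet` is a simplified stand-in with a hypothetical `cost` field and `prev` link, not the type used in fontcrunch.rs.

```rust
use std::rc::Rc;

/// Hypothetical stand-in for a search state that links back to its predecessor.
pub struct Statelet {
    pub cost: f64,
    /// Shared pointer to the previous statelet; cloning it copies a pointer,
    /// not the whole chain.
    pub prev: Option<Rc<Statelet>>,
}

/// Append a new statelet onto an existing chain without deep-copying it.
pub fn extend(prev: &Rc<Statelet>, step_cost: f64) -> Rc<Statelet> {
    Rc::new(Statelet {
        cost: prev.cost + step_cost,
        prev: Some(Rc::clone(prev)),
    })
}

/// Walk back through the chain and count its links.
pub fn chain_len(tail: &Rc<Statelet>) -> usize {
    let mut len = 1;
    let mut cur = tail;
    while let Some(prev) = &cur.prev {
        len += 1;
        cur = prev;
    }
    len
}
```

Several candidate statelets can then point at the same predecessor, and each `Rc::clone` only bumps a reference count.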
    
    opened by simoncozens 8
  • Tag type and macro

    Okay so I ended up just going and doing this with a proc macro.

    I was reluctant to go this route because generally adding a proc macro for something this minor is bad, because normally you don't already have proc macros, and so you're bringing in a bunch of new dependencies, and it can significantly impact compile time. Since we're already building otspec_macros, however, adding one more doesn't feel that bad.

    So this adds a Tag type (struct Tag([u8; 4])) and a tag! macro, that takes a &str literal, and verifies at compile time that it is a valid tag.

    • 1-4 characters long (we add padding spaces if it's shorter, so tag!("URD") == b"URD ")
    • all bytes in the range 0x20..=0x7E

    If either of those things is false, then the code won't compile.

    The main advantage to this approach is that you can print a tag and know it's valid utf-8, and that all tags are generated statically at compile time. (If you need to create one at runtime you can do Tag::from_raw("hi").unwrap()).
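
The two rules can also be written as a runtime check. This is only a sketch of the validation logic: the `Option` return type is an assumption (the description above only says the result can be unwrapped), and this is not the proc macro itself.

```rust
/// Hypothetical runtime counterpart of the compile-time check.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Tag([u8; 4]);

impl Tag {
    /// Returns `None` unless `s` is 1-4 bytes long, all in 0x20..=0x7E.
    /// Shorter inputs are padded on the right with spaces.
    pub fn from_raw(s: &str) -> Option<Tag> {
        let bytes = s.as_bytes();
        if bytes.is_empty() || bytes.len() > 4 {
            return None;
        }
        if !bytes.iter().all(|b| (0x20..=0x7E).contains(b)) {
            return None;
        }
        // Start from four spaces and overwrite the leading bytes.
        let mut raw = [b' '; 4];
        raw[..bytes.len()].copy_from_slice(bytes);
        Some(Tag(raw))
    }
}
```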

    Overall I think this is... okay? it works, and is clear. I'm annoyed that it ended up being so much code, especially since I would like to make it much easier to grab commonly used tables, which would let me delete a lot of these changes.

    One note: there's currently a weird thing where the macro is re-exported from otspec::types, which is itself re-exported at the fonttools root. This means that if you use fonttools::types::tag!("hi") it will be a compiler error if otspec is not in your Cargo.toml. I'm not sure what the best solution is for this (probably just only exporting the macro via fonttools?) but I'm going to leave it as-is for now, because I want to go for a walk before the sun goes down.

    cheers!

    opened by cmyr 5
  • Consider ripping out serde and doing our own ser/deserialization

    serde is really good, and really helpful. It's got us very far very quickly. But it has some significant frustrations, most notably the fact that it's not designed for random access.

    Currently any structure that involves offsets (and now that we're coming to handle layout, there are lots of them) requires a nasty process: reading the offset data, reading the rest of the structure, calculating how many bytes we've consumed so far in the structure, reading the rest of the available data into a vec, making a new slice of the available data starting from (offset - where we are), and passing that new slice to a separate deserializer. Serializing structs with offsets is even harder, and involves pulling together (possibly nested) serializations and computing their length, etc. The worst thing is that this all has to be done by hand for each structure. If we had true random access, most of the manual ser/de implementations would go away.
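
For comparison, this is roughly what following an offset looks like when the whole structure is available as a byte slice. The layout (a u16 offset followed somewhere later by a child that starts with a u16 count) is a made-up example, and the function names are hypothetical.

```rust
/// Read a big-endian u16 at `pos`, or `None` if it would run out of bounds.
fn read_u16(data: &[u8], pos: usize) -> Option<u16> {
    let bytes = data.get(pos..pos.checked_add(2)?)?;
    Some(u16::from_be_bytes([bytes[0], bytes[1]]))
}

/// Hypothetical parent structure: a u16 offset (measured from the start of
/// the structure) pointing at a child that begins with a u16 count.
pub fn read_child_count(data: &[u8]) -> Option<u16> {
    let offset = read_u16(data, 0)? as usize;
    // With random access we simply jump to the offset; there is no need to
    // track how many bytes have been consumed so far.
    read_u16(data, offset)
}
```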

    Realistically, we don't need the structure/data format abstraction that serde provides, as we are hardly likely to be serializing fonts to JSON or whatever. (I've tried it, it doesn't work...) So perhaps a custom ser/de would even be simpler.

    Would be interested to hear opinions on this, particularly from @cmyr @raphlinus.

    opened by simoncozens 5
  • decide on naming conventions

    Currently this crate adopts the general position of naming things based as closely as possible on the OpenType spec. This means, for example, having a fonttools::GPOS module, within which there is a GPOS struct.

    Rust has fairly clear naming conventions; for instance Rust modules are expected to be lowercase, and Rust structs are expected to be CamelCase.

    I understand the rationale for the current choice, but I'm finding it awkward. In particular it can be hard to distinguish a module from a struct.

    My preference would be to follow Rust's naming conventions, and to use namespaces where necessary, at least for tables. We know what the common table types are, and afaik there are no places where case normalization would cause a conflict.

    Suggestion:

    • move all table modules into a new table module, and give them lowercase names
    • give all table structs names like Base, Gpos etc, and reexport them from tables
    • prefer to use these items with a qualified name, like table::Gpos.
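
A small sketch of the suggested layout. The module is called `tables` here purely for illustration (the suggestion mentions both table and tables), and the struct fields are placeholders.

```rust
/// Hypothetical layout: lowercase modules, CamelCase structs.
pub mod tables {
    pub mod gpos {
        #[derive(Debug, Default, PartialEq)]
        pub struct Gpos {
            pub major_version: u16,
            pub minor_version: u16,
        }
    }

    pub mod base {
        #[derive(Debug, Default, PartialEq)]
        pub struct Base;
    }

    // Re-export the structs so callers can write `tables::Gpos`.
    pub use self::base::Base;
    pub use self::gpos::Gpos;
}
```
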
    opened by cmyr 4
  • Change Win EncodingID to 1

    This matches fontmake, and it's also recommended in the Microsoft spec:

    When building a Unicode font for Windows, the platform ID should be 3 and the encoding ID should be 1, and the referenced string data must be encoded in UTF-16BE.

    https://docs.microsoft.com/en-us/typography/opentype/spec/name#platform-specific-encoding-and-language-ids-windows-platform-platform-id-3
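
As a sketch of what the quoted requirement means for the string data, the helper below encodes a Rust string as UTF-16BE bytes. The function and constant names are hypothetical, not part of the crate.

```rust
/// Platform and encoding IDs recommended for Unicode fonts on Windows.
pub const PLATFORM_WINDOWS: u16 = 3;
pub const ENCODING_WINDOWS_UNICODE_BMP: u16 = 1;

/// Encode a string as UTF-16BE bytes, the form required for name records
/// with platform ID 3 and encoding ID 1.
pub fn encode_utf16_be(s: &str) -> Vec<u8> {
    s.encode_utf16()
        .flat_map(|unit| unit.to_be_bytes())
        .collect()
}
```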

    opened by m4rc1e 4
  • New table access API.

    This includes the commit from #54; if we don't want that PR (or want some alternative form) I'll rework this accordingly.

    This is the impl of a new API for accessing tables.

    I haven't fixed the various other projects & crates yet, because I thought it might be helpful to open this for discussion while I plod through getting CI to pass.

    Highlights

    • there's a new TableSet type that coordinates storing and loading tables.
    • tables are lazily loaded on first access, and we use interior mutability so that table getters don't require &mut self.
    • all tables are stored behind shared pointers, so you never need to hold on to a borrow of the TableSet or Font.
    • tables have transparent copy-on-write semantics: after a table has been retrieved from a font, the first time it is mutated we do a deep clone of the source table. (this is maybe a bit controversial, it's a pattern I haven't seen before and is a bit 'magical')
    • this means that you cannot mutate a table "in place"; you need to retrieve it, modify it, and then insert it back into the font.
    • access to known tables is provided via methods on TableSet, with the signature,
      fn GPOS(&self) -> Result<Option<CowPtr<GPOS>>, DeserializationError> { .. }
      
    • for unknown tables there is still a get method
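
The retrieve, modify, insert-back cycle can be illustrated with the standard library's `Rc::make_mut`, which clones the pointee only while it is still shared. This is a stand-in for the idea, not the PR's CowPtr type; `Font`, `NameTable` and `set_name` are simplified hypothetical names.

```rust
use std::rc::Rc;

#[derive(Clone, Debug, PartialEq)]
pub struct NameTable {
    pub records: Vec<String>,
}

pub struct Font {
    name: Rc<NameTable>,
}

impl Font {
    /// Hand out a shared pointer; the caller holds no borrow of the font.
    pub fn name(&self) -> Rc<NameTable> {
        Rc::clone(&self.name)
    }

    /// Put a (possibly modified) table back into the font.
    pub fn set_name(&mut self, table: Rc<NameTable>) {
        self.name = table;
    }
}

pub fn add_record(font: &mut Font, record: &str) {
    let mut table = font.name();
    // The table is shared with the font here, so make_mut deep-clones it
    // before handing out a mutable reference.
    Rc::make_mut(&mut table).records.push(record.to_string());
    // The font is unchanged until the table is inserted back.
    font.set_name(table);
}
```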

    There are a few rough edges.

    • It's annoying that we need to always return Result<Option<T>>. I've been daydreaming about having two forms of Font, one which has been preloaded and one which hasn't; methods on the former would be infallible, and maybe we could also do things like guarantee that all of the required tables are present? I don't want to worry about this too much right now, but wanted to at least mention it.

    • I'm also not totally sure about the process of mutating tables and mutating fonts; I can imagine a different API where Font is read-only, and if you want to mutate you have to use an explicit FontBuilder, and then there would be a to_builder() method on Font for constructing that; you would then add or remove tables as desired before calling a build() method that would give you a new font. This would be slightly more cumbersome but would also be clearer with regard to the fact that if you get a table and then mutate it you need to put it back in the font for anything to change?

    Anyway, overall I think this should be quite a bit easier to use.

    Simon if you'd like to chat about this at some point let me know!

    opened by cmyr 3
  • Replace 'serde' with 'otspec'

    This started out with trying to fix a warning about an invalid macro attribute, and then in the process I ended up confused that we were using 'serde' as our macro attribute keyword.

    opened by cmyr 3
  • Tweak load/save API

    • Add Font::load and Font::from_bytes methods
    • Change Font::save to take a path instead of a handle

    This is hopefully a simpler API, not requiring the user to mess around with files directly.

    I'm not sure if there are uses for font::load that are not covered by the new methods, so I have left it in place but marked it as deprecated; if it should be removed I'm happy to make that change.

    opened by cmyr 3
  • Calculate unicode ranges

    Super wip. My implementation is loosely based on https://github.com/fonttools/fonttools/blob/main/Lib/fontTools/ttLib/tables/O_S_2f_2.py. The fonttools version uses Python's built-in bisect module in order to achieve a good run time. I don't think Rust has this so I've opted to roll my own custom binary search instead.
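
For reference, Rust slices do offer `binary_search_by` and `partition_point` in the standard library, which can play a role similar to bisect. Below is a minimal sketch of a range lookup built on `partition_point`; the three-entry table is a truncated illustration, and the function name is hypothetical rather than taken from this PR.

```rust
/// Truncated, illustrative table of (start, end, bit) ranges, sorted by start.
const RANGES: &[(u32, u32, u8)] = &[
    (0x0000, 0x007F, 0), // Basic Latin
    (0x0080, 0x00FF, 1), // Latin-1 Supplement
    (0x0370, 0x03FF, 7), // Greek and Coptic
];

/// Return the range bit for a codepoint, if it falls inside a known range.
pub fn unicode_range_bit(cp: u32) -> Option<u8> {
    // Index of the first range whose start lies beyond `cp`.
    let idx = RANGES.partition_point(|&(start, _, _)| start <= cp);
    // The candidate is the range just before that index, if there is one.
    let (_, end, bit) = *RANGES.get(idx.checked_sub(1)?)?;
    (cp <= end).then_some(bit)
}
```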

    Note this is still super wip. Before I plumb it together, I'll ask for a review. It should be ready for a first look tomorrow.

    opened by m4rc1e 3
  • [fontcrunch] ZenOldMincho example

    I compared crunching ZenOldMincho-Regular.ttf using:

    • pycrunch i.e. font-crunch from pip
    • rucrunch i.e. fontcrunch from this repo

    Results: ZenOldMincho-pycrunch-rucrunch.zip

    pycrunch

    Updated maxp (important, perhaps) and head (modified date and checksum) entries, correct appearance:

    [Screenshots: rendered glyphs (2021-12-14), pycrunch head update, pycrunch maxp update]

    rucrunch

    Only updated the head checksum, and the result has a buggy appearance:

    [Screenshots: rendered glyphs (2021-12-14), rucrunch head update]
    opened by twardoch 0
  • What goes in fonttools::tables versus otspec::tables?

    Is there an overarching idea, here? Noticing that for head the table itself is declared in otspec, but in hhea it's declared in fonttools. I'd like to rebase the vhea branch but not sure where it should go.

    opened by cmyr 2
  • Version16

    Okay while investigating #62 (which turned out not to be a bug, I was just holding it wrong) I ended up down this rabbit hole.

    My intention was for this to be a simplification: whether or not this has been successful is left up to the reader.

    opened by cmyr 0
  • serialization of Version16Dot16 is incorrect

    For instance in hhea, the version (major: 1, minor: 1) serializes to 0x00010001 instead of 0x00011000.
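
A minimal sketch of the packing that produces the expected value, assuming the minor version is a single digit stored in the highest nibble of the low 16 bits (which is what 0x00011000 for version 1.1 implies). The function name is hypothetical.

```rust
/// Pack a (major, minor) pair into a Version16Dot16 value.
/// Assumes `minor` is a single decimal digit; anything larger is rejected.
pub fn pack_version16dot16(major: u16, minor: u16) -> Option<u32> {
    if minor > 9 {
        return None;
    }
    // Major goes in the high 16 bits, minor in the top nibble of the low 16.
    Some(((major as u32) << 16) | ((minor as u32) << 12))
}
```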

    I was working on a simple F2DOT14 type since it's used elsewhere and all the crates I've seen are pretty heavyweight, so I can do this at the same time? It seems like the rules are slightly different.
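
For contrast with the version packing, a lightweight F2DOT14 type might look like the sketch below: a signed 16-bit value with 14 fractional bits, so the stored integer is the real value times 16384. The type and method names are assumptions, not the crate's API.

```rust
/// Hypothetical minimal F2DOT14: signed 2.14 fixed-point stored in an i16.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct F2Dot14(i16);

impl F2Dot14 {
    /// Convert from a float, rounding to the nearest representable value.
    /// Out-of-range inputs saturate, because float-to-int `as` casts saturate.
    pub fn from_f32(value: f32) -> Self {
        F2Dot14((value * 16384.0).round() as i16)
    }

    pub fn to_f32(self) -> f32 {
        self.0 as f32 / 16384.0
    }

    /// Big-endian bytes as they would appear in a font file.
    pub fn to_be_bytes(self) -> [u8; 2] {
        self.0.to_be_bytes()
    }
}
```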

    opened by cmyr 2
  • serialize feature params

    It looks like currently we skip any feature params. From a quick skim it looks like there's something funny going on and you maybe punted on it initially? Want to make sure I'm not missing anything.

    opened by cmyr 2
Owner
Simon Cozens