An HTML entity encoding library for Rust

Overview

An HTML entity encoding library for Rust

Example usage

All examples assume that extern crate htmlescape; and use htmlescape::{relevant functions here}; are present.

### Encoding

htmlescape::encode_minimal() encodes an input string using a minimal set of HTML entities.

let title = "Cats & dogs";
let tag = format!("<title>{}</title>", encode_minimal(title));
assert_eq!(tag, "<title>Cats &amp; dogs</title>");
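
To make the idea of a "minimal set" concrete, here is a rough, standard-library-only sketch of what such an encoder might look like. It is not the crate's actual implementation: the function name encode_minimal_sketch is hypothetical, and the assumption that the minimal set covers exactly the five characters &, <, >, " and ' (and the specific entity chosen for each) is mine rather than something stated above.

```rust
/// Hypothetical stand-in for a minimal HTML encoder; NOT the crate's code.
/// Assumption: the "minimal set" is the five characters matched below.
fn encode_minimal_sketch(input: &str) -> String {
    // The output is at least as long as the input, so reserve that much.
    let mut out = String::with_capacity(input.len());
    for c in input.chars() {
        match c {
            '&' => out.push_str("&amp;"),
            '<' => out.push_str("&lt;"),
            '>' => out.push_str("&gt;"),
            '"' => out.push_str("&quot;"),
            '\'' => out.push_str("&#x27;"),
            // Every other character is copied through unchanged.
            _ => out.push(c),
        }
    }
    out
}
```

Everything outside those five characters passes through untouched, which keeps the output readable but is only appropriate for text placed in element content or quoted attribute values.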

There is also an htmlescape::encode_attribute() function for encoding strings that are to be used as HTML attribute values.

### Decoding

htmlescape::decode_html() decodes an encoded string, replacing HTML entities with the corresponding characters. Named, hex, and decimal entities are supported. A Result value is returned, with either the decoded string in Ok, or an error in Err.

let encoded = "Cats&#x20;&amp;&#32;dogs";
let decoded = match decode_html(encoded) {
  Err(reason) => panic!("Error {:?} at character {}", reason.kind, reason.position),
  Ok(s) => s
};
assert_eq!(decoded, "Cats & dogs");
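
The three entity forms mentioned above (named, hex, and decimal) can be illustrated with a small standard-library-only decoder. This is a sketch under stated assumptions, not the crate's implementation: the names decode_sketch, SketchErr and SketchErrKind are hypothetical, the error shape only loosely mirrors the kind and position fields used in the example, and only five named entities are recognised.

```rust
/// Hypothetical error kinds for this sketch (not the crate's own types).
#[derive(Debug, PartialEq)]
enum SketchErrKind {
    UnknownEntity,
    MalformedNumber,
    Unterminated,
}

/// Hypothetical error value carrying where and why decoding failed.
#[derive(Debug, PartialEq)]
struct SketchErr {
    position: usize, // character index of the '&' that started the entity
    kind: SketchErrKind,
}

/// Decodes named, hex (`&#x..;`) and decimal (`&#..;`) entities.
/// Assumption: only amp, lt, gt, quot and apos are known by name.
fn decode_sketch(input: &str) -> Result<String, SketchErr> {
    let chars: Vec<char> = input.chars().collect();
    let mut out = String::with_capacity(input.len());
    let mut i = 0;
    while i < chars.len() {
        if chars[i] != '&' {
            out.push(chars[i]);
            i += 1;
            continue;
        }
        let start = i;
        let err = |kind: SketchErrKind| SketchErr { position: start, kind };
        // An entity runs from '&' up to the next ';'.
        let end = chars[start..]
            .iter()
            .position(|&c| c == ';')
            .map(|offset| start + offset)
            .ok_or_else(|| err(SketchErrKind::Unterminated))?;
        let body: String = chars[start + 1..end].iter().collect();
        let decoded = if let Some(num) = body.strip_prefix('#') {
            // Numeric entity: hex if it starts with x or X, decimal otherwise.
            let code = match num.strip_prefix('x').or_else(|| num.strip_prefix('X')) {
                Some(hex) => u32::from_str_radix(hex, 16),
                None => num.parse::<u32>(),
            };
            code.ok()
                .and_then(char::from_u32)
                .ok_or_else(|| err(SketchErrKind::MalformedNumber))?
        } else {
            match body.as_str() {
                "amp" => '&',
                "lt" => '<',
                "gt" => '>',
                "quot" => '"',
                "apos" => '\'',
                _ => return Err(err(SketchErrKind::UnknownEntity)),
            }
        };
        out.push(decoded);
        i = end + 1;
    }
    Ok(out)
}
```

Note that this sketch treats a bare & with no closing ; as an error; a real decoder may choose to be more lenient.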

### Avoiding allocations

Both the encoding and decoding functions are also available in forms that take a Writer for output rather than returning a String. These versions can be used to avoid allocation and copying if the returned String was just going to be written to a Writer anyway.
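
The following sketch shows the general shape of a writer-based encoder using only the standard library's std::io::Write trait. The function name encode_minimal_to and its signature are hypothetical and chosen for illustration; they are not taken from the crate, so check the crate documentation for the real writer-based functions. The set of escaped characters is the same assumed five-character set as before.

```rust
use std::io::{self, Write};

/// Hypothetical writer-based encoder; the name and signature are illustrative only.
/// Writes escaped output straight to `out` without building an intermediate String.
fn encode_minimal_to<W: Write>(input: &str, out: &mut W) -> io::Result<()> {
    let bytes = input.as_bytes();
    let mut last = 0; // start of the pending run of bytes that need no escaping
    // Scanning bytes is safe here: the escaped characters are ASCII and
    // never occur inside a multi-byte UTF-8 sequence.
    for (i, b) in input.bytes().enumerate() {
        let entity: &[u8] = match b {
            b'&' => b"&amp;",
            b'<' => b"&lt;",
            b'>' => b"&gt;",
            b'"' => b"&quot;",
            b'\'' => b"&#x27;",
            _ => continue,
        };
        // Flush the unescaped run, then the entity that replaces this byte.
        out.write_all(&bytes[last..i])?;
        out.write_all(entity)?;
        last = i + 1;
    }
    // Flush whatever is left after the last escaped character.
    out.write_all(&bytes[last..])
}
```

Any type implementing Write can be the destination, for example a Vec<u8>, a file, or a locked stdout handle, so the caller decides whether any buffering or allocation happens at all.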

Comments
  • hex entities are not decoded

    decode_html: cannot decode "Lead Data Scientist – R&D,Machine Learning – Big Data An" reason:DecodeErr { position: 30, kind: UnknownEntity }

    Could I get support for those? I'd expect this to be decoded to the Unicode dash character.

    opened by Fiedzia 1
  • html escape feature to handle quotes in custom manner

    Thanks for providing this library. We are using it to build PHP-style html_entities support that allows custom handling of quotes, and we need a way to leave quotes untouched if the user wants, similar to https://www.php.net/manual/en/function.htmlentities.php

    Constant Name | Description
    -- | --
    ENT_COMPAT | Will convert double-quotes and leave single-quotes alone.
    ENT_QUOTES | Will convert both double and single quotes.
    ENT_NOQUOTES | Will leave both double and single quotes unconverted.
    

    Is it possible to have this feature in this library?

    opened by surajprak 0
  • Improve encode speed by ~50%

    Moved entity lookup into match arms instead of using a function that does a binary search in a static array, then boxes up the result, and then immediately unboxes it in the match.

    Before cargo bench

    running 4 tests
    test bench_decode_attribute ... bench:   2,585,276 ns/iter (+/- 1,174,543) = 93 MB/s
    test bench_decode_minimal   ... bench:   1,314,817 ns/iter (+/- 793,375) = 89 MB/s
    test bench_encode_attribute ... bench:   2,103,300 ns/iter (+/- 874,982) = 54 MB/s
    test bench_encode_minimal   ... bench:   1,799,386 ns/iter (+/- 889,072) = 63 MB/s
    
    test result: ok. 0 passed; 0 failed; 0 ignored; 4 measured; 0 filtered out
    

    After cargo bench

    running 4 tests
    test bench_decode_attribute ... bench:   2,273,943 ns/iter (+/- 986,740) = 106 MB/s
    test bench_decode_minimal   ... bench:   1,292,612 ns/iter (+/- 840,655) = 91 MB/s
    test bench_encode_attribute ... bench:   1,528,778 ns/iter (+/- 725,928) = 74 MB/s
    test bench_encode_minimal   ... bench:   1,177,384 ns/iter (+/- 560,557) = 96 MB/s
    
    test result: ok. 0 passed; 0 failed; 0 ignored; 4 measured; 0 filtered out
    
    opened by YetAnotherMinion 2
Owner
Viktor Dahl