Rust library to parse mail files

Overview

mailparse

A simple parser for MIME email messages.

API

The primary entry point for this library is the following function:

    parse_mail(&[u8]) -> Result<ParsedMail, MailParseError>

This function takes the raw message data, including headers and body, and returns a structured object to more easily access pieces of the email message. There are other public functions that allow parsing smaller parts of the message as well; refer to the full documentation.

The library is designed to process real-world email data, such as messages obtained with the FETCH command on an IMAP server or read from a Maildir. As such, this library should successfully handle any valid MIME-formatted message, although it may not enforce all the strict requirements in the various specifications that cover the format (predominantly IETF RFCs 822, 2045, 2047, 2822, and 5322). As an example, this library accepts raw message data which uses \n (ASCII LF) as line delimiters rather than the RFC-mandated \r\n (ASCII CRLF) line delimiters.
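To make the line-delimiter tolerance concrete, here is a minimal sketch of splitting a raw message into its header block and body while accepting either LF or CRLF. It uses only the standard library and is not mailparse's implementation; the helper names (`split_headers_and_body`, `trim_cr`, `parse_simple_headers`) are hypothetical, and folded (multi-line) headers are deliberately not handled.

```rust
/// Hypothetical helper (not part of mailparse): split a raw message at the
/// first blank line, accepting either "\n" or "\r\n" as the line delimiter.
fn split_headers_and_body(raw: &[u8]) -> (&[u8], &[u8]) {
    for i in 0..raw.len() {
        if raw[i] == b'\n' {
            // A blank line is a newline followed directly by another
            // line terminator: either "\n" or "\r\n".
            let rest = &raw[i + 1..];
            if rest.starts_with(b"\n") {
                return (&raw[..i + 1], &rest[1..]);
            }
            if rest.starts_with(b"\r\n") {
                return (&raw[..i + 1], &rest[2..]);
            }
        }
    }
    // No blank line found: treat the whole input as headers.
    (raw, &raw[raw.len()..])
}

/// Drop a single trailing CR so that CRLF input does not leak "\r" into values.
fn trim_cr(line: &[u8]) -> &[u8] {
    if line.ends_with(b"\r") {
        &line[..line.len() - 1]
    } else {
        line
    }
}

/// Parse "Key: value" lines. Simplification: folded headers are not joined.
fn parse_simple_headers(block: &[u8]) -> Vec<(String, String)> {
    block
        .split(|&b| b == b'\n')
        .map(trim_cr)
        .filter(|line| !line.is_empty())
        .filter_map(|line| {
            let colon = line.iter().position(|&b| b == b':')?;
            let key = String::from_utf8_lossy(&line[..colon]).trim().to_string();
            let value = String::from_utf8_lossy(&line[colon + 1..]).trim().to_string();
            Some((key, value))
        })
        .collect()
}

fn main() {
    let lf = b"Subject: hi\nX-Count: 10\n\nbody\n";
    let crlf = b"Subject: hi\r\nX-Count: 10\r\n\r\nbody\r\n";
    for raw in [&lf[..], &crlf[..]] {
        let (headers, body) = split_headers_and_body(raw);
        println!("{:?}", parse_simple_headers(headers));
        println!("{:?}", String::from_utf8_lossy(trim_cr(body.split(|&b| b == b'\n').next().unwrap_or(b""))));
    }
}
```

Both inputs yield the same header pairs, which is the behaviour the paragraph above describes for LF-delimited versus CRLF-delimited messages.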

Example usage

    use mailparse::*;
    let parsed = parse_mail(concat!(
            "Subject: This is a test email\n",
            "Content-Type: multipart/alternative; boundary=foobar\n",
            "Date: Sun, 02 Oct 2016 07:06:22 -0700 (PDT)\n",
            "\n",
            "--foobar\n",
            "Content-Type: text/plain; charset=utf-8\n",
            "Content-Transfer-Encoding: quoted-printable\n",
            "\n",
            "This is the plaintext version, in utf-8. Proof by Euro: =E2=82=AC\n",
            "--foobar\n",
            "Content-Type: text/html\n",
            "Content-Transfer-Encoding: base64\n",
            "\n",
            "PGh0bWw+PGJvZHk+VGhpcyBpcyB0aGUgPGI+SFRNTDwvYj4gdmVyc2lvbiwgaW4g \n",
            "dXMtYXNjaWkuIFByb29mIGJ5IEV1cm86ICZldXJvOzwvYm9keT48L2h0bWw+Cg== \n",
            "--foobar--\n",
            "After the final boundary stuff gets ignored.\n").as_bytes())
        .unwrap();
    assert_eq!(parsed.headers.get_first_value("Subject"),
        Some("This is a test email".to_string()));
    assert_eq!(parsed.subparts.len(), 2);
    assert_eq!(parsed.subparts[0].get_body().unwrap(),
        "This is the plaintext version, in utf-8. Proof by Euro: \u{20AC}");
    assert_eq!(parsed.subparts[1].headers[1].get_value(), "base64");
    assert_eq!(parsed.subparts[1].ctype.mimetype, "text/html");
    assert!(parsed.subparts[1].get_body().unwrap().starts_with("<html>"));
    assert_eq!(dateparse(parsed.headers.get_first_value("Date").unwrap().as_str()).unwrap(), 1475417182);

Documentation

See the rustdoc at docs.rs.

Support mailparse

If you want to support development of mailparse, please do so by donating your money, time, and/or energy to fighting climate change. A quick and easy way is to send a donation to Replant.ca Environmental, where every dollar gets a tree planted!

Comments
  • Failure to parse RFC2047 encoded display-names in email addresses

    I am passing =?utf-8?q?=C3=A4_space?= <[email protected]> to mailparse::addrparse, but it does not parse out the "ä space" display name. The display name is valid RFC 2047 encoding as far as I can tell. See https://tools.ietf.org/html/rfc2047#section-8 for examples that mirror what I am trying to do.

    opened by hpk42 12
  • Unexpected postfix "\r" in MailHeader's value field

    For example,

    "X-IncomingHeaderCount: 10\r\n"

    would be parsed to

    MailHeader {
        key: "X-IncomingHeaderCount",
        value: "10\r",
    }

    opened by qsdgy 10
  • Allow recovery of raw headers

    I'd love to be able to use this library to parse, modify headers, and send onward messages.

    Would you be open to a PR that adds get_value_raw() -> &[u8] to MailHeader ? I think that's sufficient for my case.

    opened by aredridel 9
  • An accessor for the whole headers, as sent

    Reading the source, this naming makes sense: encoded means "as sent", and raw means "before character set changes". Without looking at the source, I find it a bit inscrutable. I've matched the naming of get_body_encoded() when implementing get_headers_encoded() as mentioned in #75.

    It might be worth clarifying some of this naming if there's a major version bump, but I hope this is acceptable as it is for now.

    opened by aredridel 8
  • How to get the mail body untouched

    When the body is encoded in base64, the get_body_raw() method decodes it before it is returned. Is there a way to get the untouched body instead? I need to forward a base64-encoded attachment to another system, and at the moment the only way to do that is to call get_body_raw() and then re-encode the result in base64. Both the decoding and the subsequent encoding are expensive and completely unnecessary.

    opened by ufoscout 8
  • Unexpected newline in header key

    Hi!

    I've got "Unexpected newline in header key" on several emails from my mailbox. I did a little research and have found "\r\nContent-Type..." input to parse_header func. There's a really dirty workaround, but it works:

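    pub fn parse_header(raw_data1: &[u8]) -> Result<(MailHeader, usize), MailParseError> {
        // Skip any leading CR/LF bytes that precede the header key.
        // The length check avoids a panic when the input is all newlines.
        let mut index = 0;
        while index < raw_data1.len()
            && (raw_data1[index] == b'\r' || raw_data1[index] == b'\n')
        {
            index += 1;
        }

        let raw_data = &raw_data1[index..];

        let mut it = raw_data.iter();
    // ...
                    key: &raw_data[0..v],
                    value: &raw_data[ix_value_start..ix_value_end],
                },
                    // Add the skipped prefix back so the returned offset
                    // is relative to the original input.
                    ix + index))
            }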

    How to fix it correctly?

    opened by ivs 8
  • Cannot use parse_mail() to parse mails from File because of lifetimes

    Doing

    paths.iter()
        .map(|path| {
            let mut buf = Vec::new();
            File::open(path).unwrap().read_to_end(&mut buf).unwrap();
            parse_mail(&buf) // does not compile: the result borrows from buf
        })

    is not possible because the result of parse_mail() borrows from buf, so buf would have to outlive the closure.

    Can we have an interface which does not require this, please? It is really hard to use this crate otherwise.

    opened by matthiasbeyer 8
  • Fail on `Content-Disposition` header values that aren’t ASCII

    See https://tools.ietf.org/html/rfc2045 and https://tools.ietf.org/html/rfc6266#appendix-C.

    Ideally the full grammar should be obeyed, rejecting control characters too.

    opened by sanmai-NL 7
  • Encrypted and signed multiparts

    With the parts iterator (#108) we can now iterate over the subparts of a parsed mail. But there is a problem with multipart/encrypted and multipart/signed parts: once decrypted (or verified), an encrypted (or signed) part can contain new subparts that the iterator misses. It would be really nice to be able to pass a "decrypter", so that every time the iterator meets an encrypted or signed part it pushes the new subparts onto itself. At the same time, I find this option a bit out of the scope of your library (maybe I am wrong?).

    I tried to wrap your iterator with a custom one, but I have big issues with ownership (I cannot call mailparse::parse_mail on the fly because the data does not live long enough).

    How would you fix this issue?

    opened by soywod 6
  • MailHeader::get_value is adding spaces where it should not

    Hello,

    The comment above the get_value function says that the parser should get rid of the extra whitespace introduced by MIME for multiline headers. However, I think there is a problem in the implementation. Parsing around 2000 mails with this nice library led me to this problem with a multiline UTF-8 subject:

    extern crate mailparse;
    
    use mailparse::parse_header;
    
    
    fn main() {
        let raw = b"Subject: =?utf-8?q?this_is_?=\n\t=?utf-8?q?a_test?=";
        let (parsed, _) = parse_header(raw).unwrap();
    
        let key = parsed.get_key().unwrap();
        let value = parsed.get_value().unwrap();
    
        println!("{}: {}", key, value);
    
        assert_eq!(key, "Subject");
        assert_eq!(value, "this is a test");
    }
    

    And the output is:

    Subject: this is  a test
    thread 'main' panicked at 'assertion failed: `(left == right)`
      left: `"this is  a test"`,
     right: `"this is a test"`', src/main.rs:16:5
    note: Run with `RUST_BACKTRACE=1` for a backtrace.
    

    An extra space is added where it should not be.

    I can propose a patch if you tell me what to change. I am not very familiar with the RFC, but I think this is a bug because a lot of the mails I parse have this extra-whitespace problem in their subjects.

    opened by MicroJoe 6
  • Add a PartsIterator

    While using your (great) lib, I had the need to iterate over parts and I came up with a PartsIterator. I thought you could be interested in it and potentially add it to your lib (it must be a common case to iterate over parts). Let me know!

    opened by soywod 5
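
Two of the reports above (the encoded display name and the multiline Subject) turn on RFC 2047 encoded-words. Section 6.2 of that RFC says that whitespace separating two adjacent encoded-words is ignored when the header is displayed. The following is a minimal sketch of that rule, using only the standard library; it is not mailparse's code, the function names are hypothetical, and it assumes UTF-8 with Q-encoding only. As a further simplification it collapses every run of whitespace between plain tokens into one space.

```rust
/// Decode a Q-encoded payload: "_" is a space, "=XX" is a hex-encoded byte.
fn decode_q(payload: &str) -> Option<Vec<u8>> {
    let bytes = payload.as_bytes();
    let mut out = Vec::new();
    let mut i = 0;
    while i < bytes.len() {
        match bytes[i] {
            b'_' => {
                out.push(b' ');
                i += 1;
            }
            b'=' => {
                let hex = payload.get(i + 1..i + 3)?;
                out.push(u8::from_str_radix(hex, 16).ok()?);
                i += 3;
            }
            other => {
                out.push(other);
                i += 1;
            }
        }
    }
    Some(out)
}

/// Decode one token of the form "=?utf-8?q?...?="; None for anything else.
fn decode_word(token: &str) -> Option<String> {
    let inner = token.strip_prefix("=?")?.strip_suffix("?=")?;
    let mut parts = inner.splitn(3, '?');
    let charset = parts.next()?;
    let encoding = parts.next()?;
    let payload = parts.next()?;
    if !charset.eq_ignore_ascii_case("utf-8") || !encoding.eq_ignore_ascii_case("q") {
        return None; // other charsets and base64 are out of scope for this sketch
    }
    String::from_utf8(decode_q(payload)?).ok()
}

/// Decode a header value, dropping whitespace between adjacent encoded-words.
fn decode_header_value(value: &str) -> String {
    let mut out = String::new();
    let mut prev_was_encoded = false;
    for (i, token) in value.split_whitespace().enumerate() {
        let decoded = decode_word(token);
        // Emit a separating space unless both neighbours are encoded-words.
        if i > 0 && !(prev_was_encoded && decoded.is_some()) {
            out.push(' ');
        }
        prev_was_encoded = decoded.is_some();
        match decoded {
            Some(text) => out.push_str(&text),
            None => out.push_str(token),
        }
    }
    out
}

fn main() {
    let folded = "=?utf-8?q?this_is_?=\n\t=?utf-8?q?a_test?=";
    println!("{}", decode_header_value(folded));
    println!("{}", decode_header_value("=?utf-8?q?=C3=A4_space?="));
}
```

Under this rule the folded Subject from the report decodes with a single space between "is" and "a", because the only space comes from the trailing "_" of the first encoded-word and the fold itself contributes nothing.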
Owner
Kartikaya Gupta (kats)
Fast and robust e-mail parsing library for Rust

mail-parser mail-parser is an e-mail parsing library written in Rust that fully conforms to the Internet Message Format standard (RFC 5322), the Multi

Stalwart Labs 158 Jan 1, 2023
E-mail delivery library for Rust with DKIM support

mail-send mail-send is a Rust library to build, sign and send e-mail messages via SMTP. It includes the following features: Generates e-mail messages

Stalwart Labs 165 Oct 23, 2023
A mail suite written in rust meant to be easy to use.

Erooster A mail suite written in rust meant to be easy to use. Getting started Currently the setup is quite rough. You need some certificates for your

Marcel 33 Dec 19, 2022
Unofficial Rust library for the SendGrid API

sendgrid-rs Unofficial Rust library for the SendGrid API. This crate requires Rust 1.15 or higher as it uses a crate that has a custom derive implemen

Garrett Squire 88 Dec 27, 2022
a mailer library for Rust

lettre A mailer library for Rust NOTE: this readme refers to the 0.10 version of lettre, which is still being worked on. The master branch and the alp

lettre 1.3k Jan 4, 2023
An ESMTP server library written in Rust.

rs-smtp An ESMTP server library written in Rust. Features ESMTP client & server implementing RFC 5321 Support for SMTP AUTH and PIPELINING UTF-8 suppo

DUNEF 3 Apr 15, 2023
A small unofficial library to send emails using Sendgrid.

sendgrid_thin A thin wrapper around the SendGrid V3 API. It does not use the crate tokio or hyper and is therefore very lightweight and do not interfe

Reinaldo Rozato Junior 3 Nov 17, 2022
新しい IMAP client in Rust

新しい IMAP client 新しい (atarashii/new) IMAP client in Rust. It supports plain and secure connections. In progress It's under development... Usage Put thi

Alex Maslakov 39 Sep 13, 2020
Implementation of mjml in rust

MRML Introduction This project is a reimplementation of the nice MJML markup language in Rust. How to use it use mrml; fn main() { match mrml::to

Jérémie Drouet 228 Dec 28, 2022
Rust implementation of catapulte email sender

Catapulte What is catapulte? Catapulte is an open source mailer you can host yourself. You can use it to quickly catapult your transactionnal emails t

Jérémie Drouet 108 Dec 14, 2022
📫Himalaya: CLI email client written in Rust.

📫Himalaya: CLI email client written in Rust.

Clément DOUIN 2.1k Jan 7, 2023
Check if an email address exists without sending any email, written in Rust.

Check if an email address exists without sending any email, written in Rust.

Reacher 3.5k Dec 31, 2022
A rewrite of the server side parts of emersion/go-smtp package into rust.

rust-smtp-server A rust smtp server library. It's mainly a rewrite of the server side parts of the emersion/go-smtp library. Features Usage Add this t

Nick Westendorf 3 Apr 26, 2023
A Rust library to parse Blueprint files and convert them into GTK UI files

gtk-ui-builder A Rust library to parse Blueprint files and convert them into GTK UI files Inspired by the Blueprint project Example 1 - blueprints

Observer KRypt0n_ 5 Oct 22, 2022
A static mail HTML archive for the 21st century, written in Rust

Crabmail self-hosted / github mirror A static mail HTML archive for the 21st century, written in Rust. Includes helpful "modern" features that e

Alex Wennerberg 18 Oct 11, 2022
Anglosaxon is a command line tool to parse XML files using SAX

anglosaxon - Convert large XML files to other formats anglosaxon is a command line tool to parse XML files using SAX. You can do simple transformation

Amanda 8 Oct 7, 2022