eml-parse-js's Introduction

Hi there 👋

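👨‍🚒 About me

  • ⚡ I'm a full-stack developer who loves to travel~
  • 😄 I've recently started writing up what I've learned in articles

🔥 My projects

✨ This year's summary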

eml-parse-js's Issues

Displaying inline attachments

How can inline attachments be displayed using readEml? Currently the image renders as a broken image in the HTML, and the inline attachment only appears in the attachments array.
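One possible approach, sketched here rather than taken from the library, is to rewrite `cid:` references in the parsed HTML as `data:` URIs built from the matching attachment. The `contentType` and `data64` fields appear in the sample output elsewhere in these issues; the `id` field holding the Content-ID value and the function name `inlineCidImages` are assumptions made for illustration.

```typescript
// Hypothetical attachment shape: `contentType` and `data64` appear in sample
// output in this thread; `id` (the Content-ID header value) is an assumption.
interface InlineAttachment {
  contentType: string;
  data64: string;
  id?: string;
}

// Replace each cid: reference in the HTML with a data: URI built from the
// attachment whose Content-ID matches. Unknown references are left as-is.
function inlineCidImages(html: string, attachments: InlineAttachment[]): string {
  const byId = new Map<string, InlineAttachment>();
  for (const att of attachments) {
    if (att.id) {
      // Content-ID headers are wrapped in angle brackets; cid: URLs are not.
      byId.set(att.id.replace(/^<|>$/g, ""), att);
    }
  }
  return html.replace(/cid:([^"'\s>]+)/g, (match: string, cid: string) => {
    const att = byId.get(cid);
    if (!att) return match;
    // Keep only the media type, dropping parameters such as name="...".
    const mime = att.contentType.replace(/;[\s\S]*$/, "").trim();
    return `data:${mime};base64,${att.data64}`;
  });
}
```

The returned string can then be rendered in place of the original HTML. Percent-encoded `cid:` values are not handled in this sketch.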

Gap between `Content-type` and rest of headers leads to body being decoded as an attachment

Not sure why this happened, but I received an eml file in which the Content-type header is separated from the rest of the headers by blank lines. This caused the body to be decoded as an attachment.

E.g. this email:

MIME-Version: 1.0
Date: Thu, 29 Sep 2022 12:22:20 +0100
Message-ID:
  <CAGFso0R6WbMomMx6mFFJzt_wiL8wRm3sN0YQwXz12Ugbt72XSw@mail.gmail.com>
Subject: A subject with spaces
From: =?UTF-8?Q?Nobody_there?= <[email protected]>
To: =?UTF-8?Q?Nobody_here?= <[email protected]>



Content-type: multipart/related; 
	Boundary="0__=4EBB03D9DFA98AD48f9e8a93df938690918c4EBB03D9DFA98AD4"

--0__=4EBB03D9DFA98AD48f9e8a93df938690918c4EBB03D9DFA98AD4
Content-type: multipart/alternative; 
	Boundary="1__=4EBB03D9DFA98AD48f9e8a93df938690918c4EBB03D9DFA98AD4"

--1__=4EBB03D9DFA98AD48f9e8a93df938690918c4EBB03D9DFA98AD4
Content-transfer-encoding: quoted-printable
Content-type: text/plain; charset=ISO-8859-1



This is a test
=

--1__=4EBB03D9DFA98AD48f9e8a93df938690918c4EBB03D9DFA98AD4
Content-transfer-encoding: quoted-printable
Content-type: text/html; charset=ISO-8859-1
Content-Disposition: inline

<html><body>
<p><font size=3D"2" face=3D"sans-serif">This is a test</font><br>
</body></html>=


--1__=4EBB03D9DFA98AD48f9e8a93df938690918c4EBB03D9DFA98AD4--


--0__=4EBB03D9DFA98AD48f9e8a93df938690918c4EBB03D9DFA98AD4
Content-type: image/gif; 
	name="ecblank.gif"
Content-Disposition: inline; filename="ecblank.gif"
Content-ID: <1__=4EBB03D9DFA98AD48f9e8a93df938690918c4EBB03D9DFA98AD4@gmail.com>
Content-Transfer-Encoding: base64

R0lGODlhEAABAIAAAAAAAP///yH5BAEAAAEALAAAAAAAAAAQAAEAAAIEjI8ZBQA7


--0__=4EBB03D9DFA98AD48f9e8a93df938690918c4EBB03D9DFA98AD4--


This results in readEml returning no Content-type in headers, and the remainder of the eml file ends up as the data of a single attachment.
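Until the parser handles this, one possible workaround is to pre-process the raw text before passing it to readEml. The sketch below is a heuristic, not part of the library: if the first blank-line run is immediately followed by a line starting with `Content-type:`, the blank lines are removed so the header rejoins the header block. The function name `rejoinDetachedContentType` is hypothetical.

```typescript
// Heuristic: remove the blank lines between the initial header block and a
// detached Content-type header. Input without such a gap is returned unchanged.
function rejoinDetachedContentType(eml: string): string {
  return eml.replace(
    // $1: the leading run of non-empty lines (the header block)
    // then one or more blank lines, then $2: the detached header name
    /^((?:[^\r\n]+\r?\n)+)(?:[ \t]*\r?\n)+(content-type:)/i,
    "$1$2"
  );
}
```

A plain-text body that happens to start with the words `Content-type:` would be misread as a header by this heuristic, so it should only be applied where that risk is acceptable.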

Unprintable whitespace in email names

When printing the name associated with the email sender or receiver, any whitespace in the name is represented as \u0014, which is unprintable: it displays as a block character in Chrome and as nothing in Firefox.

The email sender and receiver are both Gmail addresses.

The same problem is also seen in the email subject.
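As a display-side workaround only, the stray character can be mapped back to a space before rendering. U+0014 is decimal 20, whereas a space is U+0020 (decimal 32), so the symptom may come from a decoder writing decimal 20 where hexadecimal 20 was meant; that is a guess, not something confirmed here. The helper name `restoreSpaces` is hypothetical.

```typescript
// Replace every U+0014 control character with an ordinary space (U+0020).
// This masks the symptom in displayed names and subjects; it does not fix
// whatever produced the character in the first place.
function restoreSpaces(value: string): string {
  return value.replace(/\u0014/g, " ");
}
```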

Parsing fails if there is no gap between headers and body

If there is no gap between the headers and the body then the parsing will not work

Have seen this with a multipart email where the boundary is directly under the headers - looks like a similar scenario was encountered before, as I see this in the codebase:
//Updated on 2019-10-12: A line before the boundary marker is not required to be an empty line
//if (lines[i - 1] == "" && line.indexOf("--" + findBoundary) == 0 && !/\-\-(\r?\n)?$/g.test(line)) {

Contributing guide/tests

@MQpeng @c-harding
I'd like to start using this library and contributing to it, but I'm not sure how to ensure my changes do not break anything.

Would it be possible to make a contribution guide and write some tests? I'd be happy to help out with writing some of the tests

Subjects containing underscores are not correctly decoded

Hi, I have an email containing the subject line =?UTF-8?q?Off-The-Beaten-Path_Trails_You've_Never_Heard_Of!__=F0=9F=8C=8F?=. This is decoded by my email client (Mail.app on macOS), as well as by the npm library rfc2047 as

Off-The-Beaten-Path Trails You've Never Heard Of! 🌏

However, this library instead returns

Off-The-Beaten-Path_Trails_You've_Never_Heard_Of!__🌏

i.e. underscores are not decoded into spaces. The relevant part of the standard is RFC 2047, section 4.2, rule (2):

  4. Encodings
    4.2. The "Q" encoding
    (2) The 8-bit hexadecimal value 20 (e.g., ISO-8859-1 SPACE) may be represented as "_" (underscore, ASCII 95.). (This character may not pass through some internetwork mail gateways, but its use will greatly enhance readability of "Q" encoded data with mail readers that do not support this encoding.) Note that the "_" always represents hexadecimal 20, even if the SPACE character occupies a different code position in the character set in use.

Fix

I believe a fix would involve updating mimeDecode as follows:

         if (chr === '=' && hex && /[\da-fA-F]{2}/.test(hex)) {
             buffer[bufferPos++] = parseInt(hex, 16);
             i += 2;
         }
+        else if (chr === '_') {
+            buffer[bufferPos++] = 0x20; // space character (hexadecimal 20, not decimal 20)
+        }
         else {
             buffer[bufferPos++] = chr.charCodeAt(0);
         }
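For comparison, here is a minimal standalone sketch of Q-decoding with the underscore rule applied. It is not the library's mimeDecode; the function name `decodeQEncodedUtf8` is hypothetical, and it assumes the payload (the part between `?Q?` and `?=`) is ASCII and that the declared charset is UTF-8.

```typescript
// Decode the payload of an RFC 2047 "Q"-encoded word, assuming UTF-8.
function decodeQEncodedUtf8(payload: string): string {
  const bytes: number[] = [];
  for (let i = 0; i < payload.length; i++) {
    const chr = payload.charAt(i);
    const hex = payload.slice(i + 1, i + 3);
    if (chr === "=" && /^[\da-fA-F]{2}$/.test(hex)) {
      bytes.push(parseInt(hex, 16)); // =XX is one raw byte
      i += 2;
    } else if (chr === "_") {
      bytes.push(0x20); // "_" always means hexadecimal 20 (space)
    } else {
      bytes.push(chr.charCodeAt(0)); // assumed to be a plain ASCII character
    }
  }
  // Percent-encode the bytes and let decodeURIComponent interpret them as
  // UTF-8. It throws a URIError if the bytes are not valid UTF-8.
  let encoded = "";
  for (const b of bytes) {
    encoded += "%" + (b < 16 ? "0" : "") + b.toString(16);
  }
  return decodeURIComponent(encoded);
}
```

Note that the literal value matters: writing `20` instead of `0x20` would emit the control character U+0014 rather than a space.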

If body is marked with `Content-Disposition: inline` then the body is parsed as an attachment

The body is parsed as an attachment, I think because of the Content-Disposition: inline line:

Date: Sun, 23 Apr 2023 00:23:08 +0200
Message-ID:
  <CAGFso0R6WbMomMx6mFFJzt_wiL8wRm3sN0YQwXz12Ugbt72XSw@mail.gmail.com>
Subject: A subject with spaces
From: =?UTF-8?Q?Nobody_there?= <[email protected]>
To: =?UTF-8?Q?Nobody_here?= <[email protected]>
Content-Type: multipart/alternative; boundary="=_ohFeHkpibN03-stIVrreU01"



--=_ohFeHkpibN03-stIVrreU01
Content-Type: text/plain; charset=utf-8; format=flowed; DelSp=Yes
Content-Disposition: inline
Content-Description: Message en texte brut

Test

--=_ohFeHkpibN03-stIVrreU01
Content-Type: text/html; charset=utf-8
Content-Disposition: inline
Content-Description: Message HTML

<div>Test</div>
--=_ohFeHkpibN03-stIVrreU01--

Output

{
  date: "2023-04-22T22:23:08.000Z",
  subject: "A subject with spaces",
  from: {
    name: "Nobody there",
    email: "[email protected]",
  },
  to: {
    name: "Nobody here",
    email: "[email protected]",
  },
  headers: {
    Date: "Sun, 23 Apr 2023 00:23:08 +0200",
    "Message-ID": "\r\n<CAGFso0R6WbMomMx6mFFJzt_wiL8wRm3sN0YQwXz12Ugbt72XSw@mail.gmail.com>",
    Subject: "A subject with spaces",
    From: "=?UTF-8?Q?Nobody_there?= <[email protected]>",
    To: "=?UTF-8?Q?Nobody_here?= <[email protected]>",
    "Content-Type": "multipart/alternative; boundary=\"=_ohFeHkpibN03-stIVrreU01\"",
  },
  attachments: [
    {
      contentType: "text/plain; charset=utf-8; format=flowed; DelSp=Yes",
      inline: true,
      data: "Test\r\n",
      data64: "",
    },
    {
      contentType: "text/html; charset=utf-8",
      inline: true,
      data: "<div>Test</div>\r\n\r\n",
      data64: "",
    },
  ],
}
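A possible post-processing workaround, sketched under assumptions: inline text/plain and text/html parts that carry no file name are moved out of `attachments` and used as the body. The `attachments`, `contentType`, `inline` and `data` fields are taken from the output above; the `text`, `html` and `name` fields and the function name `promoteInlineBodies` are assumptions made for illustration.

```typescript
// Shapes modelled on the output above; `text`, `html` and `name` are assumed.
interface ParsedPart {
  contentType: string;
  inline?: boolean;
  name?: string;
  data: string;
}
interface ParsedEml {
  text?: string;
  html?: string;
  attachments?: ParsedPart[];
}

// Move unnamed inline text parts into `text` / `html` when those are unset.
// Everything else stays in `attachments`. The input object is not mutated.
function promoteInlineBodies(eml: ParsedEml): ParsedEml {
  const result: ParsedEml = { ...eml };
  const remaining: ParsedPart[] = [];
  for (const part of eml.attachments || []) {
    const mime = part.contentType.replace(/;[\s\S]*$/, "").trim().toLowerCase();
    const bodyCandidate = part.inline === true && !part.name;
    if (bodyCandidate && mime === "text/plain" && result.text === undefined) {
      result.text = part.data;
    } else if (bodyCandidate && mime === "text/html" && result.html === undefined) {
      result.html = part.data;
    } else {
      remaining.push(part);
    }
  }
  result.attachments = remaining;
  return result;
}
```

Only the first matching part of each type is promoted, so a genuinely inline text attachment that follows the body is still kept as an attachment.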

Types not exported

Hi.
I have used this library to parse an email in a TypeScript project.
I installed the library with npm install and used it in a React project.
I need the type ReadedEmlJson to type a state value with
useState<ReadedEmlJson | undefined>,
but I noticed it is not exported.

Is it possible to fix this bug? If necessary, I can open a pull request with a solution.
At the moment my workaround is to copy all the types (that are not yet exported) into a file inside my project, but this is only a stopgap.

Thanks in advance for your response.
